NGINX monitoring with Netdata

What is NGINX?

NGINX (pronounced “engine X”) is a popular HTTP server and reverse proxy server. As an HTTP server, NGINX serves static content very efficiently and reliably, using relatively little memory. As a reverse proxy, it can be used as a single, controlled point of access for multiple back-end servers or for additional applications such as caching and load balancing. NGINX is available as a free, open source product or in a more full-featured, commercially distributed version called NGINX Plus.

NGINX can also be used as a mail proxy and a generic TCP proxy, but this article addresses NGINX monitoring only as a web server.

Netdata has a public loginless demo space where you can explore different monitoring use-cases. Check out the NGINX demo room to explore and interact with the charts and metrics described here.

Monitoring NGINX with Netdata

To monitor NGINX with Netdata, you need one or more NGINX web servers with ngx_http_stub_status_module enabled. The only configuration Netdata needs is the URL of each server’s stub_status endpoint, defined in the go.d/nginx.conf file. For more details, see the NGINX documentation.

jobs:
  - name: local
    url: http://127.0.0.1/stub_status
  - name: remote
    url: http://203.0.113.10/stub_status

Once this is done, the NGINX section should appear on your Overview screen and immediately present the metrics described below.
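If the section does not appear, it can help to confirm that the stub_status endpoint responds and returns the expected text. The following is a minimal sketch, not part of Netdata: it assumes the standard plain-text stub_status layout, and the function names `parse_stub_status` and `fetch_stub_status` are hypothetical helpers introduced here for illustration.

```python
import re
from urllib.request import urlopen

# Matches the plain-text layout produced by ngx_http_stub_status_module.
STUB_STATUS_RE = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepted>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+"
    r"Writing:\s*(?P<writing>\d+)\s+"
    r"Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(text):
    """Return the stub_status counters as a dict of integers."""
    match = STUB_STATUS_RE.search(text)
    if match is None:
        raise ValueError("response does not look like stub_status output")
    return {name: int(value) for name, value in match.groupdict().items()}


def fetch_stub_status(url, timeout=5.0):
    """Fetch a stub_status URL (hypothetical helper) and parse the body."""
    with urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return parse_stub_status(body)


# Illustrative sample in the stub_status layout; the numbers are made up.
SAMPLE = (
    "Active connections: 291\n"
    "server accepts handled requests\n"
    " 16630948 16630948 31070465\n"
    "Reading: 6 Writing: 179 Waiting: 106\n"
)

if __name__ == "__main__":
    print(parse_stub_status(SAMPLE))
```

To check a live server, call `fetch_stub_status("http://127.0.0.1/stub_status")` with the same URL you configured for the job.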

What to monitor on NGINX web servers?

By monitoring NGINX you can catch two categories of issues: resource issues within NGINX itself, and problems developing elsewhere in your web infrastructure. The metrics that most NGINX users will benefit from monitoring are described in the following sections.

NGINX Activity metrics

Irrespective of the NGINX use case, you will always need to monitor how many client requests your servers are receiving and how those requests are being processed. Connections can be monitored in several sub-categories, such as accepted, handled, and dropped connections.

You can additionally group these metrics by instance to see how many connections each instance is handling and which state they are in.

Figure: Connection Status per Instance

You can also set up a custom alert to report dropped connections on your NGINX servers. For example, the alert shown below raises a Warning when more than 20% of connections are dropped and a Critical when more than 30% are dropped. Once raised, the Warning clears only after the rate falls to 10% or below, and the Critical only after it falls to 20% or below.

  template: NGINX_Dropped_Connections_Exceeded
        on: nginx.connections_accepted_handled
     class: Utilization
      type: NGINX
 component: NGINX
      calc: (($accepted - $handled) * 100) / ($accepted)
     every: 1m
     units: %
      warn: $this > (($status >= $WARNING)  ? (10) : (20))
      crit: $this > (($status == $CRITICAL) ? (20) : (30))
     delay: down 15m multiplier 1.5 max 1h
      info: The NGINX web server has exceeded the limit of dropped connections
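To make the arithmetic in the calc, warn, and crit lines concrete, here is a rough Python sketch of the same logic, reading the expressions as they are written above. It is an illustration only, not Netdata's implementation: the function names and status strings are hypothetical, and returning 0 when no connections were accepted is an assumption made here to avoid dividing by zero.

```python
def dropped_percent(accepted, handled):
    """Percentage of accepted connections that were not handled."""
    if accepted <= 0:
        # Assumption for this sketch: no accepted connections means 0% dropped.
        return 0.0
    return (accepted - handled) * 100 / accepted


def next_status(value, status):
    """Evaluate the warn/crit expressions with their hysteresis.

    `status` is the current state: "CLEAR", "WARNING", or "CRITICAL".
    An alert that is already raised uses the lower threshold, so it
    clears only after the value falls back below that lower bound.
    """
    warn_threshold = 10 if status in ("WARNING", "CRITICAL") else 20
    crit_threshold = 20 if status == "CRITICAL" else 30
    if value > crit_threshold:
        return "CRITICAL"
    if value > warn_threshold:
        return "WARNING"
    return "CLEAR"


if __name__ == "__main__":
    value = dropped_percent(accepted=1000, handled=850)
    print(value, next_status(value, "CLEAR"), next_status(value, "WARNING"))
```

With 1000 accepted and 850 handled connections, 15% are dropped: that is not enough to raise a new Warning, but it is enough to keep an existing one active.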

NGINX Error Metrics

NGINX error metrics tell you how often your servers are returning errors instead of producing useful work. Client errors are represented by 4xx status codes, and server errors by 5xx status codes.
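As a rough illustration of how these error rates can be derived, the sketch below counts status classes in access log lines. It assumes the access log uses NGINX's predefined "combined" format, where the status code follows the quoted request line. The function names and the sample lines are hypothetical.

```python
import re
from collections import Counter

# "combined" format: addr - user [time] "request" status bytes "referer" "agent"
STATUS_RE = re.compile(r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) ')


def count_status_classes(lines):
    """Count responses per status class (2xx, 4xx, 5xx, ...)."""
    counts = Counter()
    for line in lines:
        match = STATUS_RE.match(line)
        if match is None:
            counts["unparsed"] += 1
            continue
        counts[match.group("status")[0] + "xx"] += 1
    return counts


def error_rates(counts):
    """Return the 4xx and 5xx shares of parsed responses, in percent."""
    total = sum(n for key, n in counts.items() if key != "unparsed")
    if total == 0:
        return {"4xx": 0.0, "5xx": 0.0}
    return {key: counts[key] * 100 / total for key in ("4xx", "5xx")}


# Made-up sample lines for illustration.
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 612 "-" "curl/8.0"',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"',
    '203.0.113.8 - - [10/Oct/2023:13:55:38 +0000] "POST /api HTTP/1.1" 502 157 "-" "curl/8.0"',
    '203.0.113.8 - - [10/Oct/2023:13:55:39 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/8.0"',
]

if __name__ == "__main__":
    counts = count_status_classes(SAMPLE_LINES)
    print(dict(counts), error_rates(counts))
```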

NGINX Performance Metrics

The request time metric logged by NGINX records the processing time for each request, from the reading of the first client bytes to fulfilling the request. Long response times can point to problems upstream.

Note: Although open source NGINX does not make error and performance metrics immediately available for monitoring, you can configure NGINX’s log module to write response codes and request processing times to the access log. More details are available in the related blog post “How to monitor web servers and their performance?”
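Once request times are in the access log, summarizing them is straightforward. The sketch below assumes a custom log format that ends each line with `rt=$request_time` (the `rt=` label is a choice made here for illustration, not an NGINX default), and it uses a simple nearest-rank percentile. The helper names and sample lines are hypothetical.

```python
import math
import re

# Assumed layout: each log line ends with "rt=<seconds>", e.g. "rt=0.034".
RT_RE = re.compile(r"\brt=(?P<rt>\d+(?:\.\d+)?)\s*$")


def request_times(lines):
    """Extract request times in seconds from lines that carry an rt= field."""
    times = []
    for line in lines:
        match = RT_RE.search(line)
        if match is not None:
            times.append(float(match.group("rt")))
    return times


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not values:
        raise ValueError("no values to summarize")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up sample lines for illustration.
SAMPLE_LINES = [
    '203.0.113.7 - - [10/Oct/2023:13:55:36 +0000] "GET / HTTP/1.1" 200 612 rt=0.012',
    '203.0.113.7 - - [10/Oct/2023:13:55:37 +0000] "GET /a HTTP/1.1" 200 612 rt=0.250',
    '203.0.113.8 - - [10/Oct/2023:13:55:38 +0000] "GET /b HTTP/1.1" 200 612 rt=0.034',
    '203.0.113.8 - - [10/Oct/2023:13:55:39 +0000] "GET /c HTTP/1.1" 502 157 rt=1.500',
]

if __name__ == "__main__":
    times = request_times(SAMPLE_LINES)
    print(percentile(times, 50), percentile(times, 95), max(times))
```

Percentiles are generally more informative than averages here, because a small number of slow upstream responses can hide behind a healthy-looking mean.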
