Web server log files

Plugin: go.d.plugin Module: web_log

Overview

This collector monitors web servers by parsing their log files.

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

It automatically detects log files of web servers running on localhost.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/web_log.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/web_log.conf

Options

Weblog is aware of how to parse and interpret the following fields (known fields):

nginx | apache | description
$host ($http_host) | %v | Name of the server which accepted a request.
$server_port | %p | Port of the server which accepted a request.
$scheme | - | Request scheme. “http” or “https”.
$remote_addr | %a (%h) | Client address.
$request | %r | Full original request line. The line is “$request_method $request_uri $server_protocol”.
$request_method | %m | Request method. Usually “GET” or “POST”.
$request_uri | %U | Full original request URI.
$server_protocol | %H | Request protocol. Usually “HTTP/1.0”, “HTTP/1.1”, or “HTTP/2.0”.
$status | %s (%>s) | Response status code.
$request_length | %I | Bytes received from a client, including request and headers.
$bytes_sent | %O | Bytes sent to a client, including headers.
$body_bytes_sent | %B (%b) | Bytes sent to a client, not counting the response header.
$request_time | %D | Request processing time.
$upstream_response_time | - | Time spent on receiving the response from the upstream server.
$ssl_protocol | - | Protocol of an established SSL connection.
$ssl_cipher | - | String of ciphers used for an established SSL connection.

Notes:

  • Apache %h logs the IP address only if HostnameLookups is Off; otherwise it logs hostnames, which the web log collector counts as IPv4 addresses. We recommend either disabling HostnameLookups or using %a instead of %h.
  • Since httpd 2.0, unlike 1.3, the %b and %B format strings do not represent the number of bytes sent to the client, but simply the size in bytes of the HTTP response. It will differ, for instance, if the connection is aborted, or if SSL is used. The %O format provided by mod_logio will log the actual number of bytes sent over the network.
  • To get %I and %O working you need to enable mod_logio on Apache.
  • NGINX logs the URI with query parameters; Apache doesn’t.
  • $request is parsed into $request_method, $request_uri and $server_protocol. If your log format includes $request, there is no need to include the other three.
  • Don’t use both $bytes_sent and $body_bytes_sent (%O and %B or %b). The module does not distinguish between these parameters.

Name | Description | Default | Required
update_every | Data collection frequency. | 1 | no
autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no
path | Path to the web server log file. | | yes
exclude_path | Path to exclude. | *.gz | no
url_patterns | List of URL patterns. | [] | no
url_patterns.name | Used as a dimension name. | | yes
url_patterns.pattern | Used to match against full original request URI. Pattern syntax in matcher. | | yes
log_type | Log parser type. | auto | no
csv_config | CSV log parser config. | | no
csv_config.delimiter | CSV field delimiter. | , | no
csv_config.format | CSV log format. | | no
ltsv_config | LTSV log parser config. | | no
ltsv_config.field_delimiter | LTSV field delimiter. | \t | no
ltsv_config.value_delimiter | LTSV value delimiter. | : | no
ltsv_config.mapping | LTSV fields mapping to known fields. | | yes
json_config | JSON log parser config. | | no
json_config.mapping | JSON fields mapping to known fields. | | yes
regexp_config | RegExp log parser config. | | no
regexp_config.pattern | RegExp pattern with named groups. | | yes
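
As a minimal sketch of how the options above fit together, the following job sets the required path and spells out a few optional settings at their default values. The jobs list and the name key follow the usual go.d.plugin job layout and are assumptions here, since they are not listed among the options; the log file location is hypothetical.

```yaml
# Hypothetical job definition for go.d/web_log.conf.
jobs:
  - name: nginx_local                  # job name (assumed go.d convention)
    path: /var/log/nginx/access.log    # required: log file to parse (adjust to your server)
    exclude_path: '*.gz'               # skip rotated, compressed logs (the default)
    update_every: 1                    # data collection frequency (the default)
    log_type: auto                     # let the collector pick the parser and format (the default)
```

Only path is required in this sketch; the other keys could be omitted to get the same behavior.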

url_patterns

“URL pattern” scope metrics will be collected for each URL pattern.

Option syntax:

url_patterns:
  - name: name1
    pattern: pattern1
  - name: name2
    pattern: pattern2
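
For illustration, a job might group requests by URL prefix as sketched below. The pattern values assume that the matcher's short syntax uses a leading ~ to mark a regular expression; confirm this against the matcher documentation before relying on it. The names and URL prefixes are hypothetical.

```yaml
url_patterns:
  - name: api                  # used as the dimension name
    pattern: '~ ^/api/'        # assumed regexp syntax: request URIs starting with /api/
  - name: static
    pattern: '~ ^/static/'     # assumed regexp syntax: request URIs starting with /static/
```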

log_type

Weblog supports 5 different log parsers:

Parser type | Description
auto | Use CSV and auto-detect format
csv | Comma-separated values
json | JSON
ltsv | LTSV
regexp | Regular expression with named groups

Syntax:

log_type: auto

If the log_type parameter is set to auto (the default), weblog tries to auto-detect the appropriate log parser and log format using the last line of the log file. It performs these checks in order:

  • checks if the format is LTSV (using regexp).

  • checks if the format is JSON (using regexp).

  • assumes the format is CSV and tries to find an appropriate CSV log format using a predefined list of formats. It tries to parse the line using each of them in the following order (the first one that matches is used from then on):

    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
    $host:$server_port $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time $upstream_response_time
    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent - - $request_length $request_time
    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time $upstream_response_time
    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent     $request_length $request_time
    $remote_addr - - [$time_local] "$request" $status $body_bytes_sent
    

    If you’re using the default Apache/NGINX log format, auto-detect will work for you. If it doesn’t work, set the format manually.

csv_config.format
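
This option has no description here, so the following is only a sketch. It assumes the format string is written with the known fields listed earlier, in the same style as the predefined auto-detection formats, and that a space delimiter suits a space-separated access log. The field order shown is hypothetical and must match your server's actual log format.

```yaml
log_type: csv
csv_config:
  delimiter: ' '   # assumption: fields in each log line are separated by spaces
  # Common log format extended with request size and processing time.
  format: '$remote_addr - - [$time_local] "$request" $status $body_bytes_sent $request_length $request_time'
```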

ltsv_config.mapping

The mapping is a dictionary where the key is a field, as in logs, and the value is the corresponding known field.

Note: don’t use $ and % prefixes for mapped field names.

log_type: ltsv
ltsv_config:
  mapping:
    label1: field1
    label2: field2
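
A more concrete sketch of the same syntax, assuming a log whose labels are host, req, status, size and reqtime (hypothetical names; use the labels your server actually writes):

```yaml
# Sample log line (label:value pairs separated by tabs, shown here as <TAB>):
#   host:203.0.113.7<TAB>req:GET /index.html HTTP/1.1<TAB>status:200<TAB>size:512<TAB>reqtime:0.004
log_type: ltsv
ltsv_config:
  mapping:
    host: remote_addr        # log label -> known field, without the $ prefix
    req: request
    status: status
    size: body_bytes_sent
    reqtime: request_time
```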

json_config.mapping

The mapping is a dictionary where the key is a field, as in logs, and the value is the corresponding known field.

Note: don’t use $ and % prefixes for mapped field names.

log_type: json
json_config:
  mapping:
    label1: field1
    label2: field2
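
A more concrete sketch of the same syntax, assuming a JSON log whose keys are client, method, uri, code and bytes (hypothetical names; use the keys your server actually writes):

```yaml
# Sample log line:
#   {"client": "203.0.113.7", "method": "GET", "uri": "/index.html", "code": "200", "bytes": "512"}
log_type: json
json_config:
  mapping:
    client: remote_addr      # JSON key -> known field, without the $ prefix
    method: request_method
    uri: request_uri
    code: status
    bytes: body_bytes_sent
```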

regexp_config.pattern

Use a pattern with named subexpressions. The names should be known fields.

Note: don’t use $ and % prefixes for mapped field names.

Syntax:

log_type: regexp
regexp_config:
  pattern: PATTERN
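
As a sketch, a pattern for a common-log-style line could look like the following. It assumes Go regular expression syntax for named groups, (?P<name>...), since go.d.plugin is written in Go; the pattern itself is illustrative and should be adapted to your log format.

```yaml
log_type: regexp
regexp_config:
  # Each group name is a known field without the $ prefix.
  # The timestamp in square brackets is matched but not captured.
  pattern: '^(?P<remote_addr>[^ ]+) - - \[[^\]]+\] "(?P<request>[^"]+)" (?P<status>[0-9]{3}) (?P<body_bytes_sent>[0-9]+)'
```

Single quotes keep the backslashes and double quotes literal in YAML, so the pattern reaches the collector unchanged.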

Examples

There are no configuration examples.

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per Web server log files instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

Metric | Dimensions | Unit
web_log.requests | requests | requests/s
web_log.excluded_requests | unmatched | requests/s
web_log.type_requests | success, bad, redirect, error | requests/s
web_log.status_code_class_responses | 1xx, 2xx, 3xx, 4xx, 5xx | responses/s
web_log.status_code_class_1xx_responses | a dimension per 1xx code | responses/s
web_log.status_code_class_2xx_responses | a dimension per 2xx code | responses/s
web_log.status_code_class_3xx_responses | a dimension per 3xx code | responses/s
web_log.status_code_class_4xx_responses | a dimension per 4xx code | responses/s
web_log.status_code_class_5xx_responses | a dimension per 5xx code | responses/s
web_log.bandwidth | received, sent | kilobits/s
web_log.request_processing_time | min, max, avg | milliseconds
web_log.requests_processing_time_histogram | a dimension per bucket | requests/s
web_log.upstream_response_time | min, max, avg | milliseconds
web_log.upstream_responses_time_histogram | a dimension per bucket | requests/s
web_log.current_poll_uniq_clients | ipv4, ipv6 | clients
web_log.vhost_requests | a dimension per vhost | requests/s
web_log.port_requests | a dimension per port | requests/s
web_log.scheme_requests | http, https | requests/s
web_log.http_method_requests | a dimension per HTTP method | requests/s
web_log.http_version_requests | a dimension per HTTP version | requests/s
web_log.ip_proto_requests | ipv4, ipv6 | requests/s
web_log.ssl_proto_requests | a dimension per SSL protocol | requests/s
web_log.ssl_cipher_suite_requests | a dimension per SSL cipher suite | requests/s
web_log.url_pattern_requests | a dimension per URL pattern | requests/s
web_log.custom_field_pattern_requests | a dimension per custom field pattern | requests/s

Per custom time field

TBD

This scope has no labels.

Metrics:

Metric | Dimensions | Unit
web_log.custom_time_field_summary | min, max, avg | milliseconds
web_log.custom_time_field_histogram | a dimension per bucket | observations

Per custom numeric field

TBD

This scope has no labels.

Metrics:

Metric | Dimensions | Unit
web_log.custom_numeric_field_{{field_name}}_summary | min, max, avg | {{units}}

Per URL pattern

TBD

This scope has no labels.

Metrics:

Metric | Dimensions | Unit
web_log.url_pattern_status_code_responses | a dimension per pattern | responses/s
web_log.url_pattern_http_method_requests | a dimension per HTTP method | requests/s
web_log.url_pattern_bandwidth | received, sent | kilobits/s
web_log.url_pattern_request_processing_time | min, max, avg | milliseconds

Alerts

The following alerts are available:

Alert name | On metric | Description
web_log_1m_unmatched | web_log.excluded_requests | percentage of unparsed log lines over the last minute
web_log_1m_requests | web_log.type_requests | ratio of successful HTTP requests over the last minute (1xx, 2xx, 304, 401)
web_log_1m_redirects | web_log.type_requests | ratio of redirection HTTP requests over the last minute (3xx except 304)
web_log_1m_bad_requests | web_log.type_requests | ratio of client error HTTP requests over the last minute (4xx except 401)
web_log_1m_internal_errors | web_log.type_requests | ratio of server error HTTP requests over the last minute (5xx)
web_log_web_slow | web_log.request_processing_time | average HTTP response time over the last minute
web_log_5m_requests_ratio | web_log.type_requests | ratio of successful HTTP requests over the last 5 minutes, compared with the previous 5 minutes

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the web_log collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    sudo -u netdata -s
    
  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m web_log
    

Getting Logs

If you’re encountering problems with the web_log collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep web_log

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector’s name:

grep web_log /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:

docker logs netdata 2>&1 | grep web_log
