Fluentd monitoring with Netdata

What is Fluentd?

Fluentd is an open source data collector for log-based data streams. It is designed to unify logging infrastructure across different sources, such as application logs, databases, and system logs. By using Fluentd, you can easily aggregate, parse, and transfer logs to a variety of destinations, such as Elasticsearch, Amazon S3, Redis, and MongoDB.

Monitoring Fluentd with Netdata

To monitor Fluentd with Netdata, you need both Fluentd and Netdata installed on your system.

Netdata auto-discovers hundreds of services. For those it doesn't, turning on manual discovery is a one-line configuration change. For more information on configuring Netdata for Fluentd monitoring, please read the collector documentation.
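
To make the data source concrete, here is a minimal Python sketch of reading the same kind of metrics by hand. It assumes Fluentd's monitor_agent plugin is enabled and reachable at http://127.0.0.1:24220/api/plugins.json (its conventional address, which may differ on your system). The function names are hypothetical and only for illustration; this is not how Netdata itself is implemented.

```python
import json
from urllib.request import urlopen

# Assumption: monitor_agent is enabled in Fluentd and listens on its conventional port.
MONITOR_URL = "http://127.0.0.1:24220/api/plugins.json"

# Per-plugin fields this article discusses.
METRIC_KEYS = ("retry_count", "buffer_queue_length", "buffer_total_queued_size")


def fetch_plugins(url=MONITOR_URL, timeout=5.0):
    """Download and decode the monitor_agent JSON document."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


def extract_buffer_metrics(document):
    """Return {plugin_id: {metric: value}} for plugins that report buffer metrics."""
    result = {}
    for plugin in document.get("plugins", []):
        # Input plugins usually have no buffer, so their fields are missing or null.
        metrics = {k: plugin[k] for k in METRIC_KEYS if plugin.get(k) is not None}
        if metrics:
            result[plugin.get("plugin_id", "unknown")] = metrics
    return result


if __name__ == "__main__":
    # Hand-written sample shaped like a monitor_agent response (not real data).
    sample = {
        "plugins": [
            {"plugin_id": "in_forward", "type": "forward"},
            {
                "plugin_id": "out_es",
                "type": "elasticsearch",
                "retry_count": 0,
                "buffer_queue_length": 0,
                "buffer_total_queued_size": 0,
            },
        ]
    }
    print(extract_buffer_metrics(sample))
```

Against a live Fluentd instance, you would call extract_buffer_metrics(fetch_plugins()) instead of passing the sample.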

You should now see the Fluentd section on the Overview tab in Netdata Cloud already populated with charts about all the metrics you care about.

Netdata has a public demo space (no login required) where you can explore different monitoring use-cases and get a feel for Netdata.

What Fluentd metrics are important to monitor - and why?

Retry Count

Retry count is the number of times Fluentd has tried to send log data to its destination and failed. Monitor it because a high or steadily rising value suggests that Fluentd cannot deliver its messages in a timely manner.

An abnormally high retry count can point to a variety of issues, such as network congestion, a misconfigured destination endpoint, or a server-side problem that prevents the destination from accepting log messages. By monitoring the retry count, you can identify and address these issues before they become severe.
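
As an illustration of what "abnormally high" can mean in practice, the sketch below looks at how the retry count changes between samples rather than at its absolute value. It assumes the retry count behaves like a cumulative counter that only drops when Fluentd restarts; the function names and the min_increase threshold are hypothetical.

```python
def retry_increase(previous, current):
    """Retries added between two samples of a cumulative counter.

    If the counter went down, assume it was reset (for example by a
    restart) and count the new value from zero.
    """
    if current < previous:
        return current
    return current - previous


def is_retrying_persistently(samples, min_increase=1):
    """True when every consecutive pair of samples shows new retries."""
    if len(samples) < 2:
        return False
    return all(
        retry_increase(a, b) >= min_increase
        for a, b in zip(samples, samples[1:])
    )


if __name__ == "__main__":
    # Made-up samples: the counter grows in every interval.
    print(is_retrying_persistently([5, 7, 9]))
    # Made-up samples: a single burst of retries that then stops.
    print(is_retrying_persistently([5, 9, 9]))
```

A counter that keeps climbing across several samples is usually more worrying than a one-off burst, which is why the check requires growth in every interval.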

Buffer Queue Length

Buffer queue length is the amount of buffered log data currently waiting to be sent from Fluentd (Fluentd counts it in queued buffer chunks rather than individual messages). Monitor it to confirm that Fluentd is sending its log messages in a timely manner. If the queue length is too high, Fluentd may not be able to keep up with the rate at which log messages are arriving.

A high buffer queue length could be caused by a variety of factors, such as an overloaded network, a misconfigured destination endpoint, or an unusually large number of log messages being sent. Watching this metric lets you spot a growing backlog early.

Buffer Total Queued Size

Buffer total queued size is the total size of all log messages that are currently waiting to be sent from Fluentd. Monitor it to catch cases where Fluentd is handling log messages that are too large, or too many log messages at once.

A high buffer total queued size could indicate an unusually large number of log messages being sent, or log messages that are too large for Fluentd to handle. Tracking it alongside the buffer queue length shows whether a backlog comes from message volume or from message size.
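
One way to decide when the queued size is "high" is to compare it with the buffer's configured size limit. The sketch below assumes you know that limit (Fluentd's buffer section has a total_limit_size parameter) and uses hypothetical warning and critical thresholds of 50% and 80%. These are illustrative values, not Netdata's alert defaults.

```python
def buffer_utilization(total_queued_bytes, total_limit_bytes):
    """Fraction of the configured buffer limit that is currently in use."""
    if total_limit_bytes <= 0:
        raise ValueError("total_limit_bytes must be positive")
    return total_queued_bytes / total_limit_bytes


def classify_buffer(total_queued_bytes, total_limit_bytes, warn_at=0.5, crit_at=0.8):
    """Map buffer utilization to 'ok', 'warning', or 'critical'.

    warn_at and crit_at are illustrative thresholds; tune them per output.
    """
    ratio = buffer_utilization(total_queued_bytes, total_limit_bytes)
    if ratio >= crit_at:
        return "critical"
    if ratio >= warn_at:
        return "warning"
    return "ok"


if __name__ == "__main__":
    limit = 512 * 1024 * 1024  # example limit of 512 MiB; use your configured value
    for queued in (10 * 1024 * 1024, 300 * 1024 * 1024, 500 * 1024 * 1024):
        print(queued, classify_buffer(queued, limit))
```

Expressing the metric as a fraction of the limit makes the same thresholds reusable across outputs with different buffer sizes.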
