ZooKeeper monitoring with Netdata

What is ZooKeeper?

Apache ZooKeeper is an open-source coordination service for distributed applications. It provides a reliable, centralized service for maintaining configuration information, naming, and synchronization across a cluster. Many distributed systems use it to track changes in configuration and state.

Monitoring ZooKeeper with Netdata

To monitor ZooKeeper with Netdata, you need both ZooKeeper and Netdata installed on your system.

Netdata auto-discovers hundreds of services. For those it doesn't, turning on manual discovery is a one-line configuration change. For more information on configuring Netdata for ZooKeeper monitoring, please read the collector documentation.

You should now see the ZooKeeper section on the Overview tab in Netdata Cloud already populated with charts about all the metrics you care about.

Netdata has a public demo space (no login required) where you can explore different monitoring use-cases and get a feel for Netdata.

What ZooKeeper metrics are important to monitor - and why?

Requests

This metric shows the number of outstanding requests: requests the server has received but not yet responded to. A value that keeps growing means the server is receiving requests faster than it can process them. Monitoring this metric can help you spot an overloaded server early and prevent performance degradation or outages.
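As a rough illustration of where this number comes from, the sketch below reads it directly from ZooKeeper with the `mntr` four-letter command, which returns tab-separated key/value lines such as `zk_outstanding_requests`. The host, port, and function names are assumptions chosen for the example, and recent ZooKeeper versions only answer `mntr` if it is listed in `4lw.commands.whitelist`. This is not Netdata's collector code, only a minimal way to inspect the same data by hand.

```python
import socket


def fetch_mntr(host="127.0.0.1", port=2181, timeout=2.0):
    """Send the 'mntr' four-letter command and return the raw reply as text.

    host and port are assumptions (2181 is the usual client port).
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"mntr")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mntr(text):
    """Turn tab-separated 'key<TAB>value' lines into a dict.

    Numeric values become int or float; everything else stays a string.
    """
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip lines that are not key/value pairs
        key, value = key.strip(), value.strip()
        for cast in (int, float):
            try:
                metrics[key] = cast(value)
                break
            except ValueError:
                continue
        else:
            metrics[key] = value
    return metrics
```

Calling `parse_mntr(fetch_mntr())["zk_outstanding_requests"]` against a running server would then give the current number of outstanding requests.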

Requests Latency

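This metric shows how long the server takes to respond to a request, measured in milliseconds. Monitoring this metric can help identify performance bottlenecks and unresponsiveness. As a rough guide, values from 0 to 500 milliseconds can be considered normal, but a healthy baseline depends on your workload and hardware, so a sustained rise above your own baseline matters more than any fixed number.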

Connections

This metric shows the number of active connections currently made to the server. Monitoring this metric can help identify issues with too many connections being made to the server, which can lead to system slowdowns or outages.

Packets

This metric shows the number of packets sent and received by the server. Monitoring this metric can help identify potential network issues or misconfiguration.

File Descriptors

This metric shows the number of open file descriptors associated with the server. Monitoring this metric can help identify potential resource exhaustion, which could lead to system slowdowns or outages.
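To make resource exhaustion concrete, the sketch below compares the open file descriptor count with the process limit. It assumes the metrics have already been collected into a dictionary using the key names that ZooKeeper's `mntr` command reports on Unix platforms (`zk_open_file_descriptor_count` and `zk_max_file_descriptor_count`). The 80% warning threshold is a hypothetical value for illustration, not a Netdata or ZooKeeper default.

```python
def fd_utilization(metrics):
    """Return open file descriptors as a fraction of the process limit."""
    open_fds = metrics["zk_open_file_descriptor_count"]
    max_fds = metrics["zk_max_file_descriptor_count"]
    if max_fds <= 0:
        raise ValueError("file descriptor limit must be positive")
    return open_fds / max_fds


def fd_warning(metrics, threshold=0.8):
    """Return True when utilization reaches the (hypothetical) threshold."""
    return fd_utilization(metrics) >= threshold
```

Tracking the ratio rather than the raw count keeps the check meaningful when the limit differs between hosts.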

Nodes

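This metric shows the number of znodes (the data nodes in ZooKeeper's namespace) and the number of ephemeral nodes (znodes that exist only for the lifetime of the client session that created them) stored by the server. Monitoring this metric can help identify potential memory issues, as ZooKeeper keeps its data in memory and a large number of nodes can cause memory pressure.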

Watches

This metric shows the number of active watches on the server. Monitoring this metric can help identify potential performance issues, as a large number of watches can cause a performance bottleneck.

Approximate Data Size

This metric shows the approximate size of the data stored on the server. Monitoring this metric can help identify potential issues with the size of the data stored on the server, which can lead to performance slowdowns.

Server State

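This metric shows the current role of the server: leader, follower, or observer in an ensemble, or standalone when it runs on its own. Monitoring this metric can help identify problems such as unexpected leader changes, an ensemble without a leader, or a server that has dropped out of the ensemble, all of which can precede slowdowns or outages.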
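One way to put the server state to use is a consistency check across all servers of an ensemble. The sketch below assumes you have already gathered each server's state as a string ("leader", "follower", "observer", or "standalone", the values reported by the `mntr` command's `zk_server_state` key) into a dictionary keyed by host name. The function name, host names, and messages are hypothetical.

```python
def ensemble_problems(states):
    """Return a list of human-readable problems found in an ensemble.

    states maps a host name to its reported server state.
    """
    problems = []
    if len(states) <= 1:
        return problems  # a single server has nothing to cross-check

    leaders = [host for host, state in states.items() if state == "leader"]
    if not leaders:
        problems.append("no leader elected")
    elif len(leaders) > 1:
        problems.append("multiple leaders: " + ", ".join(sorted(leaders)))

    # A standalone server next to other servers is not part of the quorum.
    for host, state in sorted(states.items()):
        if state == "standalone":
            problems.append(host + " is standalone inside a multi-server setup")
    return problems
```

An empty list means the reported roles are consistent; anything else is worth an alert.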
