Ceph monitoring with Netdata

What is Ceph?

Ceph is a distributed storage system that provides unified object, block, and file system storage. It is designed to provide highly scalable, fault-tolerant, and self-managing storage with features such as replication, erasure coding, and automated data placement. Ceph is widely used for cloud storage solutions and for applications requiring massive scalability and high performance.

Monitoring Ceph with Netdata

To monitor Ceph with Netdata, you need both Ceph and Netdata installed on your system.

Netdata auto-discovers hundreds of services, and for those it doesn't, turning on manual discovery is a one-line configuration. For more information on configuring Netdata for Ceph monitoring, please read the collector documentation.

Once the collector is running, you should see the Ceph section on the Overview tab in Netdata Cloud, already populated with charts for the collected metrics.

Netdata has a public demo space (no login required) where you can explore different monitoring use-cases and get a feel for Netdata.

Which Ceph metrics are important to monitor, and why?

General Usage

This chart (ceph.general_usage) has a built-in alert, ceph_cluster_space_usage, which tracks cluster disk space utilization. Its definition is:


 template: ceph_cluster_space_usage
       on: ceph.general_usage
    class: Utilization
     type: Storage
component: Ceph
     calc: $used * 100 / ($used + $avail)
    units: %
    every: 1m
     warn: $this > (($status >= $WARNING ) ? (85) : (90))
     crit: $this > (($status == $CRITICAL) ? (90) : (98))
    delay: down 5m multiplier 1.2 max 1h
     info: cluster disk space utilization
       to: sysadmin
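The calc, warn, and crit lines above can be read as a small state machine: the raise threshold is higher than the clear threshold, so the alert does not flap when utilization hovers near a boundary. The Python sketch below mirrors that logic for illustration only. The Status enum, its numeric values, and the function names are assumptions made for this example and are not Netdata internals. It also assumes that a true crit expression takes precedence over warn. The delay line is not modeled.

```python
from enum import IntEnum


class Status(IntEnum):
    """Simplified alert states (hypothetical values), ordered so that
    a comparison such as `status >= Status.WARNING` is meaningful."""
    CLEAR = 0
    WARNING = 1
    CRITICAL = 2


def usage_percent(used: float, avail: float) -> float:
    """Mirror of the `calc` line: $used * 100 / ($used + $avail)."""
    return used * 100 / (used + avail)


def next_status(value: float, status: Status) -> Status:
    """Evaluate the `warn` and `crit` expressions for one check.

    `value` is the utilization in percent ($this) and `status` is the
    alert's current state ($status).
    """
    # warn: $this > (($status >= $WARNING) ? (85) : (90))
    warn_threshold = 85 if status >= Status.WARNING else 90
    # crit: $this > (($status == $CRITICAL) ? (90) : (98))
    crit_threshold = 90 if status == Status.CRITICAL else 98

    if value > crit_threshold:
        return Status.CRITICAL
    if value > warn_threshold:
        return Status.WARNING
    return Status.CLEAR


if __name__ == "__main__":
    # Replay a made-up series of utilization readings through the alert.
    status = Status.CLEAR
    for reading in (89.0, 91.0, 87.0, 99.0, 95.0, 89.0, 84.0):
        status = next_status(reading, status)
        print(f"{reading:5.1f}% -> {status.name}")
```

In this sketch a warning is raised only above 90% but clears only at or below 85%, and a critical alert is raised only above 98% but steps down only at or below 90%.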

General Objects

General Bytes

General Operations

General Latency

Pool Usage

Pool Objects

Pool Read Bytes

Pool Write Bytes

Pool Read Operations

Pool Write Operations

OSD Usage

OSD Size

OSD Apply Latency
