S.M.A.R.T.

Plugin: go.d.plugin Module: smartctl

Overview

This collector monitors the health of storage devices by analyzing S.M.A.R.T. (Self-Monitoring, Analysis, and Reporting Technology) counters. It relies on the smartctl CLI tool but does not execute the binary directly. Instead, it uses ndsudo, a Netdata helper designed to run privileged commands securely within the Netdata environment. This removes the need for sudo, which improves security and can simplify permission management.

Executed commands:

smartctl --json --scan
smartctl --json --all {deviceName} --device {deviceType} --nocheck {powerMode}
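
The two commands above can be sketched as argument lists, which makes the placeholders explicit. This is only an illustration, not the collector's implementation: the function names are hypothetical, and the JSON keys `devices`, `name`, and `type` are an assumption about the shape of the `smartctl --json --scan` output.

```python
import json


def scan_command():
    """Argument list for the device discovery command."""
    return ["smartctl", "--json", "--scan"]


def device_command(device_name, device_type, power_mode="standby"):
    """Argument list for the per-device command, with placeholders filled in."""
    return [
        "smartctl", "--json", "--all", device_name,
        "--device", device_type,
        "--nocheck", power_mode,
    ]


def parse_scan(output):
    """Extract (name, type) pairs from scan output.

    Assumes a top-level "devices" list whose items carry "name" and "type".
    """
    data = json.loads(output)
    return [(d["name"], d["type"]) for d in data.get("devices", [])]
```

Each discovered (name, type) pair would then feed one `device_command` call.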

This collector is only supported on the following platforms:

Linux
BSD

This collector only supports collecting metrics from a single instance of this integration.

Default Behavior

Auto-Detection

This integration doesn’t support auto-detection.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

Install smartmontools (v7.0+)

Install smartmontools version 7.0 or later using your distribution’s package manager. Version 7.0 introduced the --json output mode, which is required for this collector to function properly.
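
To confirm that an installed smartctl meets the 7.0 requirement, the version banner can be parsed. The sketch below assumes the banner begins with `smartctl <major>.<minor>`, as typically printed by `smartctl --version`; the function names are hypothetical.

```python
import re

# Minimum version that provides the --json output mode.
MIN_VERSION = (7, 0)


def parse_smartctl_version(banner):
    """Return (major, minor) from a version banner, or None if not found."""
    match = re.search(r"smartctl\s+(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def supports_json(banner):
    """True if the banner reports a version at or above MIN_VERSION."""
    version = parse_smartctl_version(banner)
    return version is not None and version >= MIN_VERSION
```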

For Netdata running in a Docker container

Install smartmontools.

Ensure smartctl is available in the container by setting the environment variable NETDATA_EXTRA_DEB_PACKAGES=smartmontools when starting the container.

Provide access to storage devices.

Netdata requires the SYS_RAWIO capability and access to the storage devices to run the smartctl collector inside a Docker container. Here’s how you can achieve this:
- docker run
```
docker run --cap-add SYS_RAWIO --device /dev/sda:/dev/sda ...
```
- docker-compose.yml
```
services:
  netdata:
    cap_add:
      - SYS_PTRACE
      - SYS_ADMIN
      - SYS_RAWIO # smartctl
    devices:
      - "/dev/sda:/dev/sda"
```
Multiple Devices: These examples only show mapping of one device (/dev/sda). You’ll need to add additional --device options (in docker run) or entries in the devices list (in docker-compose.yml) for each storage device you want Netdata’s smartctl collector to monitor.

NVMe Devices: Do not map NVMe devices using this method. Netdata uses a dedicated collector to monitor NVMe devices.
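
When many disks are involved, the `docker run` flags can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper name is hypothetical, and NVMe devices are recognized purely by a device file name starting with `nvme`.

```python
import os


def docker_device_args(devices):
    """Build docker run flags that grant SYS_RAWIO and map each device.

    NVMe devices are skipped because a dedicated collector handles them.
    """
    args = ["--cap-add", "SYS_RAWIO"]
    for dev in devices:
        if os.path.basename(dev).startswith("nvme"):
            continue  # do not map NVMe devices for this collector
        args += ["--device", f"{dev}:{dev}"]
    return args
```

For example, passing `/dev/sda`, `/dev/nvme0n1`, and `/dev/sdb` yields `--device` flags for `/dev/sda` and `/dev/sdb` only.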

Configuration

File

The configuration file name for this integration is go.d/smartctl.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/smartctl.conf

Options

The following options can be defined globally: update_every.

Name	Description	Default	Required
update_every	Interval for updating Netdata charts, measured in seconds. The collector may serve cached data if this interval is shorter than poll_devices_every.	10	no
timeout	smartctl binary execution timeout.	5	no
scan_every	interval for discovering new devices using `smartctl --scan`, measured in seconds. Set to 0 to scan devices only once on startup.	900	no
poll_devices_every	interval for gathering data for every device, measured in seconds. Data is cached for this interval.	300	no
device_selector	Specifies a pattern to match the ‘info name’ of devices as reported by `smartctl --scan --json`.	*	no
extra_devices	Allows manual specification of devices not automatically detected by `smartctl --scan`. Each device entry must include both a name and a type. See “Configuration Examples” for details.	[]	no
no_check_power_mode	Skip data collection when the device is in a low-power mode. Prevents unnecessary disk spin-up.	standby	no
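
The relationship between update_every and poll_devices_every can be pictured as a simple time-based cache: charts refresh on every update, but devices are queried again only once the poll interval has elapsed. The class below is an illustrative sketch, not the collector's actual code; `DeviceCache` and its `clock` parameter are hypothetical.

```python
import time


class DeviceCache:
    """Serve cached device data until poll_devices_every seconds have passed."""

    def __init__(self, poll_devices_every=300, clock=time.monotonic):
        self._interval = poll_devices_every
        self._clock = clock  # injectable so the behaviour can be tested
        self._last_poll = None
        self._data = None

    def get(self, collect):
        """Return cached data, calling collect() only when the cache is stale."""
        now = self._clock()
        if self._last_poll is None or now - self._last_poll >= self._interval:
            self._data = collect()
            self._last_poll = now
        return self._data
```

With the defaults (update_every of 10, poll_devices_every of 300), roughly one in thirty chart updates would trigger a fresh device query in this model.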

no_check_power_mode

The valid arguments to this option are:

Mode	Description
never	Check the device always.
sleep	Check the device unless it is in SLEEP mode.
standby	Check the device unless it is in SLEEP or STANDBY mode. In these modes most disks are not spinning, so if you want to prevent a disk from spinning up, this is probably what you want.
idle	Check the device unless it is in SLEEP, STANDBY or IDLE mode. In the IDLE state, most disks are still spinning, so this is probably not what you want.
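
The table above amounts to a lookup from the configured mode to the set of device states in which the check is skipped. A small sketch of that decision follows; the function name is hypothetical, and `ACTIVE` in the usage note is a stand-in name for a fully running device.

```python
# For each no_check_power_mode value, the device states that skip the check.
SKIPPED_STATES = {
    "never": set(),
    "sleep": {"SLEEP"},
    "standby": {"SLEEP", "STANDBY"},
    "idle": {"SLEEP", "STANDBY", "IDLE"},
}


def should_check(no_check_power_mode, device_state):
    """Return True if the device should be queried in its current state."""
    try:
        skipped = SKIPPED_STATES[no_check_power_mode]
    except KeyError:
        raise ValueError(f"unknown power mode: {no_check_power_mode!r}")
    return device_state.upper() not in skipped
```

Under the default `standby` setting, a device in STANDBY is skipped, while one in IDLE or ACTIVE is still queried.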

Examples

Custom devices poll interval

Allows you to override the default devices poll interval (data collection).

jobs:
  - name: smartctl
    poll_devices_every: 60  # Collect S.M.A.R.T. statistics every 60 seconds

Extra devices

This example demonstrates using extra_devices to manually add a storage device (/dev/sdc) not automatically detected by smartctl --scan.

jobs:
  - name: smartctl
    extra_devices:
      - name: /dev/sdc
        type: jmb39x-q,3
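
Because each extra_devices entry must include both a name and a type, a quick pre-check of the parsed configuration can catch incomplete entries early. This is a hypothetical helper operating on already-parsed entries (a list of dictionaries); it is not part of Netdata.

```python
def validate_extra_devices(entries):
    """Return a list of error strings for entries missing a name or type."""
    errors = []
    for index, entry in enumerate(entries):
        for key in ("name", "type"):
            value = entry.get(key)
            # A field counts as missing if absent, non-string, or blank.
            if not isinstance(value, str) or not value.strip():
                errors.append(f"extra_devices[{index}]: missing '{key}'")
    return errors
```

An empty result means every entry carries both required fields.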

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per controller

These metrics refer to the storage device.

Labels:

Label	Description
device_name	Device name
device_type	Device type
model_name	Model name
serial_number	Serial number

Metrics:

Metric	Dimensions	Unit
smartctl.device_smart_status	passed, failed	status
smartctl.device_ata_smart_error_log_count	error_log	logs
smartctl.device_power_on_time	power_on_time	seconds
smartctl.device_temperature	temperature	Celsius
smartctl.device_power_cycles_count	power	cycles
smartctl.device_read_errors_rate	corrected, uncorrected	errors/s
smartctl.device_write_errors_rate	corrected, uncorrected	errors/s
smartctl.device_verify_errors_rate	corrected, uncorrected	errors/s
smartctl.device_smart_attr_{attribute_name}	{attribute_name}	{attribute_unit}
smartctl.device_smart_attr_{attribute_name}_normalized	{attribute_name}	value

Alerts

There are no alerts configured by default for this integration.

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the smartctl collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].
```
cd /usr/libexec/netdata/plugins.d/
```
Switch to the netdata user.
```
sudo -u netdata -s
```
Run the go.d.plugin to debug the collector:
```
./go.d.plugin -d -m smartctl
```

Getting Logs

If you’re encountering problems with the smartctl collector, follow these steps to retrieve logs and identify potential issues:

Run the command specific to your system (systemd, non-systemd, or Docker container).
Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep smartctl

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector's name:

grep smartctl /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:

docker logs netdata 2>&1 | grep smartctl
