RiakKV

Plugin: python.d.plugin Module: riakkv

Overview

This collector monitors RiakKV metrics about throughput, latency, resources, and more.

This collector reads the database stats from the /stats endpoint.
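
As a rough illustration of what reading this endpoint involves, the sketch below fetches the JSON document served at /stats and picks out a few numeric values. It is not the collector's actual code: the helper names `fetch_stats` and `pick_stats` are hypothetical, and the field names used (`node_gets_total`, `node_puts_total`, `sys_process_count`) are assumptions based on Riak's stat naming and may differ between versions.

```python
import json
from urllib.request import urlopen


def fetch_stats(url="http://localhost:8098/stats", timeout=5.0):
    """Fetch the /stats document and return it as a dict (needs a running node)."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def pick_stats(stats, wanted):
    """Return only the fields named in `wanted` that are present and numeric."""
    picked = {}
    for name in wanted:
        value = stats.get(name)
        # Skip missing fields and non-numeric values (bool is excluded on purpose).
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            picked[name] = value
    return picked


# Offline demonstration with a made-up payload (values are illustrative only).
sample = json.loads(
    '{"node_gets_total": 120, "node_puts_total": 45, '
    '"sys_process_count": 2100, "nodename": "riak@127.0.0.1"}'
)
summary = pick_stats(
    sample,
    ["node_gets_total", "node_puts_total", "sys_process_count",
     "nodename", "missing_field"],
)
print(summary)
```

Against a live node, `pick_stats(fetch_stats(), [...])` would do the same with real data; the offline sample keeps the example runnable without one.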

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

RiakKV instances on the local host that listen on port 8098 are auto-detected, provided their /stats endpoint is accessible.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

Configure RiakKV to enable the /stats endpoint

Follow the RiakKV configuration reference documentation to enable this endpoint.

Source: https://docs.riak.com/riak/kv/2.2.3/configuring/reference/#client-interfaces

Configuration

File

The configuration file name for this integration is python.d/riakkv.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config python.d/riakkv.conf

Options

There are 2 sections:

  • Global variables
  • One or more JOBS that can define multiple different instances to monitor.

The following options can be defined globally: priority, penalty, autodetection_retry, and update_every. Each can also be defined per JOB to override the global value.

Additionally, the following table lists all the options that can be configured inside a JOB definition.

Every configuration JOB starts with a job_name value which will appear in the dashboard, unless a name parameter is specified.

| Name | Description | Default | Required |
|:-----|:------------|:--------|:---------|
| update_every | Sets the default data collection frequency. | 5 | False |
| priority | Controls the order of charts on the Netdata dashboard. | 60000 | False |
| autodetection_retry | Sets the job re-check interval in seconds. | 0 | False |
| penalty | Indicates whether to apply a penalty to update_every in case of failures. | yes | False |
| url | The URL of the server. | no | True |
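
To make the override rule concrete, here is a minimal sketch of how the effective options for one job could be resolved, assuming a simple "defaults, then global values, then per-job values" precedence. The function `resolve_job` and the `DEFAULTS` dictionary are hypothetical names for illustration and do not reflect how python.d.plugin is implemented internally.

```python
# Defaults taken from the options table above ('yes' is modelled as True).
DEFAULTS = {
    "update_every": 5,
    "priority": 60000,
    "autodetection_retry": 0,
    "penalty": True,
}


def resolve_job(job_name, job, global_options=None):
    """Merge defaults, global options, and per-job options (later wins)."""
    if not job.get("url"):
        raise ValueError("job '%s' is missing the required 'url' option" % job_name)
    effective = dict(DEFAULTS)
    effective.update(global_options or {})
    effective.update(job)
    # The dashboard shows 'name' when given, otherwise the job name.
    effective.setdefault("name", job_name)
    return effective


resolved = resolve_job(
    "local",
    {"url": "http://localhost:8098/stats", "update_every": 10},
    global_options={"update_every": 2, "priority": 70000},
)
print(resolved)
```

Here the job's own update_every (10) wins over the global value (2), while priority falls back to the global setting because the job does not define it.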

Examples

Basic (default)

A basic example configuration per job.

local:
  url: 'http://localhost:8098/stats'

Multi-instance

Note: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

local:
  url: 'http://localhost:8098/stats'

remote:
  url: 'http://192.0.2.1:8098/stats'

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per RiakKV instance

These metrics refer to the entire monitored application.

This scope has no labels.

Metrics:

| Metric | Dimensions | Unit |
|:-------|:-----------|:-----|
| riak.kv.throughput | gets, puts | operations/s |
| riak.dt.vnode_updates | counters, sets, maps | operations/s |
| riak.search | queries | queries/s |
| riak.search.documents | indexed | documents/s |
| riak.consistent.operations | gets, puts | operations/s |
| riak.kv.latency.get | mean, median, 95, 99, 100 | ms |
| riak.kv.latency.put | mean, median, 95, 99, 100 | ms |
| riak.dt.latency.counter_merge | mean, median, 95, 99, 100 | ms |
| riak.dt.latency.set_merge | mean, median, 95, 99, 100 | ms |
| riak.dt.latency.map_merge | mean, median, 95, 99, 100 | ms |
| riak.search.latency.query | median, min, 95, 99, 999, max | ms |
| riak.search.latency.index | median, min, 95, 99, 999, max | ms |
| riak.consistent.latency.get | mean, median, 95, 99, 100 | ms |
| riak.consistent.latency.put | mean, median, 95, 99, 100 | ms |
| riak.vm | processes | total |
| riak.vm.memory.processes | allocated, used | MB |
| riak.kv.siblings_encountered.get | mean, median, 95, 99, 100 | siblings |
| riak.kv.objsize.get | mean, median, 95, 99, 100 | KB |
| riak.search.vnodeq_size | mean, median, 95, 99, 100 | messages |
| riak.search.index | errors | errors |
| riak.core.protobuf_connections | active | connections |
| riak.core.repairs | read | repairs |
| riak.core.fsm_active | get, put, secondary index, list keys | fsms |
| riak.core.fsm_rejected | get, put | fsms |
| riak.search.index | bad_entry, extract_fail | writes |
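
The units in this table (operations/s, ms) are derived values. The sketch below shows one plausible way to obtain them, assuming the underlying stats are cumulative counters and that latencies are reported in microseconds. Both are assumptions about the source data rather than something this page states, and the function names are hypothetical.

```python
def per_second_rate(previous, current, elapsed_seconds):
    """Per-second rate of a cumulative counter between two samples."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    delta = current - previous
    # A negative delta suggests the counter was reset (for example, a node restart).
    if delta < 0:
        return None
    return delta / elapsed_seconds


def microseconds_to_ms(value_us):
    """Convert a latency in microseconds to milliseconds."""
    return value_us / 1000.0


# Two samples taken 5 seconds apart (the default update_every).
rate = per_second_rate(previous=1000, current=1250, elapsed_seconds=5)
latency_ms = microseconds_to_ms(1840)
print(rate, latency_ms)
```

Returning `None` on a counter reset is a design choice for this sketch: it avoids reporting a large negative rate for the interval in which a node restarted.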

Alerts

The following alerts are available:

| Alert name | On metric | Description |
|:-----------|:----------|:------------|
| riakkv_1h_kv_get_mean_latency | riak.kv.latency.get | average time between reception of client GET request and subsequent response to the client over the last hour |
| riakkv_kv_get_slow | riak.kv.latency.get | average time between reception of client GET request and subsequent response to the client over the last 3 minutes, compared to the average over the last hour |
| riakkv_1h_kv_put_mean_latency | riak.kv.latency.put | average time between reception of client PUT request and subsequent response to the client over the last hour |
| riakkv_kv_put_slow | riak.kv.latency.put | average time between reception of client PUT request and subsequent response to the client over the last 3 minutes, compared to the average over the last hour |
| riakkv_vm_high_process_count | riak.vm | number of processes running in the Erlang VM |
| riakkv_list_keys_active | riak.core.fsm_active | number of currently running list keys finite state machines |
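
The "slow" alerts compare a 3-minute average with a 1-hour average. The following sketch illustrates that kind of comparison on a list of latency samples. The function name `is_slow` and the `factor` threshold are hypothetical and are not the values or the expression used by the shipped alerts.

```python
from statistics import mean


def is_slow(samples_ms, update_every=5, recent_window=180,
            baseline_window=3600, factor=2.0):
    """Return True when the recent average exceeds `factor` times the baseline.

    `samples_ms` holds one latency value per collection interval, oldest first.
    `factor` is a hypothetical threshold, not a value taken from Netdata.
    """
    recent_n = max(1, recent_window // update_every)
    baseline_n = max(1, baseline_window // update_every)
    baseline = samples_ms[-baseline_n:]
    recent = samples_ms[-recent_n:]
    if not baseline:
        return False
    return mean(recent) > factor * mean(baseline)


# One hour of samples at 5-second intervals: 57 calm minutes, then 3 slow ones.
spike = [2.0] * 684 + [10.0] * 36
steady = [2.0] * 720
print(is_slow(spike), is_slow(steady))
```

In the first series the last 3 minutes average 10 ms against an hourly average of 2.4 ms, so the check fires. The steady series does not trigger it.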

Troubleshooting

Debug Mode

To troubleshoot issues with the riakkv collector, run the python.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    sudo -u netdata -s
    
  • Run the python.d.plugin to debug the collector:

    ./python.d.plugin riakkv debug trace
    
