Elasticsearch

Plugin: go.d.plugin Module: elasticsearch

Overview

This collector monitors the performance and health of the Elasticsearch cluster.

It uses Cluster APIs to collect metrics.

Used endpoints:

Endpoint	Description	API
`/`	Node info
`/_nodes/stats`	Nodes metrics	Nodes stats API
`/_nodes/_local/stats`	Local node metrics	Nodes stats API
`/_cluster/health`	Cluster health stats	Cluster health API
`/_cluster/stats`	Cluster metrics	Cluster stats API

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

By default, it detects instances running on localhost by attempting to connect to port 9200:

http://127.0.0.1:9200
https://127.0.0.1:9200

Limits

By default, this collector monitors only the node it is connected to. To monitor all cluster nodes, set the cluster_mode configuration option to yes.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/elasticsearch.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/elasticsearch.conf

Options

The following options can be defined globally: update_every, autodetection_retry.

Name	Description	Default	Required
update_every	Data collection interval in seconds.	5	no
autodetection_retry	Recheck interval in seconds. Zero means no recheck will be scheduled.	0	no
url	Server URL.	http://127.0.0.1:9200	yes
cluster_mode	Controls whether to collect metrics for all nodes in the cluster or only for the local node.	false	no
collect_node_stats	Controls whether to collect nodes metrics.	true	no
collect_cluster_health	Controls whether to collect cluster health metrics.	true	no
collect_cluster_stats	Controls whether to collect cluster stats metrics.	true	no
collect_indices_stats	Controls whether to collect indices metrics.	false	no
timeout	HTTP request timeout in seconds.	2	no
username	Username for basic HTTP authentication.		no
password	Password for basic HTTP authentication.		no
proxy_url	Proxy URL.		no
proxy_username	Username for proxy basic HTTP authentication.		no
proxy_password	Password for proxy basic HTTP authentication.		no
method	HTTP request method.	GET	no
body	HTTP request body.		no
headers	HTTP request headers.		no
not_follow_redirects	Redirect handling policy. Controls whether the client follows redirects.	no	no
tls_skip_verify	Server certificate chain and hostname validation policy. Controls whether the client performs this check.	no	no
tls_ca	Certification authority that the client uses when verifying the server’s certificates.		no
tls_cert	Client TLS certificate.		no
tls_key	Client TLS key.		no
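
As a sketch of how the global and per-job options in the table fit together, the fragment below sets update_every and autodetection_retry at the top level and overrides a few settings for a single job. The specific values are illustrative assumptions, not recommendations.

```yaml
# Global options: apply to every job unless a job overrides them.
update_every: 10          # illustrative value
autodetection_retry: 60   # illustrative value; 0 (the default) disables rechecks

jobs:
  - name: local
    url: http://127.0.0.1:9200   # default URL from the options table
    update_every: 5              # per-job override of the global value
    timeout: 5                   # illustrative HTTP request timeout
    collect_cluster_stats: no    # skip cluster stats metrics for this job
```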

Examples

Basic single node mode

A basic example configuration.

jobs:
  - name: local
    url: http://127.0.0.1:9200

Cluster mode

Cluster mode example configuration.

jobs:
  - name: local
    url: http://127.0.0.1:9200
    cluster_mode: yes

HTTP authentication

Basic HTTP authentication.

jobs:
  - name: local
    url: http://127.0.0.1:9200
    username: username
    password: password
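
If the instance is only reachable through an HTTP proxy, the proxy_url, proxy_username, and proxy_password options from the table can be added to the same job. This is a minimal sketch: the proxy address and all credentials are hypothetical placeholders.

```yaml
jobs:
  - name: remote
    url: http://192.0.2.1:9200
    username: username                         # Elasticsearch credentials (placeholder)
    password: password
    proxy_url: http://proxy.example.com:3128   # hypothetical proxy address
    proxy_username: proxy_user                 # proxy credentials (placeholder)
    proxy_password: proxy_pass
```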

HTTPS with self-signed certificate

Elasticsearch with HTTPS enabled and a self-signed certificate.

jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_skip_verify: yes
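
Where the server certificate is issued by a private certification authority, an alternative to skipping verification is to point the client at that CA with tls_ca, optionally adding a client certificate with tls_cert and tls_key. The file paths below are hypothetical and need to be replaced with real ones.

```yaml
jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_ca: /path/to/ca.pem         # hypothetical path to the CA certificate
    tls_cert: /path/to/client.pem   # hypothetical path; only if the server requires client certificates
    tls_key: /path/to/client.key    # hypothetical path to the matching private key
```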

Multi-instance

Note: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

jobs:
  - name: local
    url: http://127.0.0.1:9200

  - name: remote
    url: http://192.0.2.1:9200

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per node

These metrics refer to the cluster node.

Labels:

Label	Description
cluster_name	Name of the cluster. Based on the Cluster name setting.
node_name	Human-readable identifier for the node. Based on the Node name setting.
host	Network host for the node. Based on the Network host setting.

Metrics:

Metric	Dimensions	Unit
elasticsearch.node_indices_indexing	index	operations/s
elasticsearch.node_indices_indexing_current	index	operations
elasticsearch.node_indices_indexing_time	index	milliseconds
elasticsearch.node_indices_search	queries, fetches	operations/s
elasticsearch.node_indices_search_current	queries, fetches	operations
elasticsearch.node_indices_search_time	queries, fetches	milliseconds
elasticsearch.node_indices_refresh	refresh	operations/s
elasticsearch.node_indices_refresh_time	refresh	milliseconds
elasticsearch.node_indices_flush	flush	operations/s
elasticsearch.node_indices_flush_time	flush	milliseconds
elasticsearch.node_indices_fielddata_memory_usage	used	bytes
elasticsearch.node_indices_fielddata_evictions	evictions	operations/s
elasticsearch.node_indices_segments_count	segments	segments
elasticsearch.node_indices_segments_memory_usage_total	used	bytes
elasticsearch.node_indices_segments_memory_usage	terms, stored_fields, term_vectors, norms, points, doc_values, index_writer, version_map, fixed_bit_set	bytes
elasticsearch.node_indices_translog_operations	total, uncommitted	operations
elasticsearch.node_indices_translog_size	total, uncommitted	bytes
elasticsearch.node_file_descriptors	open	fd
elasticsearch.node_jvm_heap	inuse	percentage
elasticsearch.node_jvm_heap_bytes	committed, used	bytes
elasticsearch.node_jvm_buffer_pools_count	direct, mapped	pools
elasticsearch.node_jvm_buffer_pool_direct_memory	total, used	bytes
elasticsearch.node_jvm_buffer_pool_mapped_memory	total, used	bytes
elasticsearch.node_jvm_gc_count	young, old	gc/s
elasticsearch.node_jvm_gc_time	young, old	milliseconds
elasticsearch.node_thread_pool_queued	generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management	threads
elasticsearch.node_thread_pool_rejected	generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management	threads
elasticsearch.node_cluster_communication_packets	received, sent	pps
elasticsearch.node_cluster_communication_traffic	received, sent	bytes/s
elasticsearch.node_http_connections	open	connections
elasticsearch.node_breakers_trips	requests, fielddata, in_flight_requests, model_inference, accounting, parent	trips/s

Per cluster

These metrics refer to the cluster.

Labels:

Label	Description
cluster_name	Name of the cluster. Based on the Cluster name setting.

Metrics:

Metric	Dimensions	Unit
elasticsearch.cluster_health_status	green, yellow, red	status
elasticsearch.cluster_number_of_nodes	nodes, data_nodes	nodes
elasticsearch.cluster_shards_count	active_primary, active, relocating, initializing, unassigned, delayed_unaasigned	shards
elasticsearch.cluster_pending_tasks	pending	tasks
elasticsearch.cluster_number_of_in_flight_fetch	in_flight_fetch	fetches
elasticsearch.cluster_indices_count	indices	indices
elasticsearch.cluster_indices_shards_count	total, primaries, replication	shards
elasticsearch.cluster_indices_docs_count	docs	docs
elasticsearch.cluster_indices_store_size	size	bytes
elasticsearch.cluster_indices_query_cache	hit, miss	events/s
elasticsearch.cluster_nodes_by_role_count	coordinating_only, data, data_cold, data_content, data_frozen, data_hot, data_warm, ingest, master, ml, remote_cluster_client, voting_only	nodes

Per index

These metrics refer to the index.

Labels:

Label	Description
cluster_name	Name of the cluster. Based on the Cluster name setting.
index	Name of the index.

Metrics:

Metric	Dimensions	Unit
elasticsearch.node_index_health	green, yellow, red	status
elasticsearch.node_index_shards_count	shards	shards
elasticsearch.node_index_docs_count	docs	docs
elasticsearch.node_index_store_size	store_size	bytes
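
According to the options table, collect_indices_stats defaults to false, so these per-index charts are presumably not produced unless it is enabled. A minimal job that turns it on might look like this:

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
    collect_indices_stats: yes   # enable indices metrics (off by default)
```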

Alerts

The following alerts are available:

Alert name	On metric	Description
elasticsearch_node_indices_search_time_query	elasticsearch.node_indices_search_time	search performance is degraded, queries run slowly.
elasticsearch_node_indices_search_time_fetch	elasticsearch.node_indices_search_time	search performance is degraded, fetches run slowly.
elasticsearch_cluster_health_status_red	elasticsearch.cluster_health_status	cluster health status is red.
elasticsearch_cluster_health_status_yellow	elasticsearch.cluster_health_status	cluster health status is yellow.
elasticsearch_node_index_health_red	elasticsearch.node_index_health	node index $label:index health status is red.

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the elasticsearch collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].
```
cd /usr/libexec/netdata/plugins.d/
```
Switch to the netdata user.
```
sudo -u netdata -s
```
Run the go.d.plugin to debug the collector:
```
./go.d.plugin -d -m elasticsearch
```

Getting Logs

If you’re encountering problems with the elasticsearch collector, follow these steps to retrieve logs and identify potential issues:

Run the command specific to your system (systemd, non-systemd, or Docker container).
Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep elasticsearch

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector’s name:

grep elasticsearch /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:

docker logs netdata 2>&1 | grep elasticsearch
