Elasticsearch

Plugin: go.d.plugin Module: elasticsearch

Overview

This collector monitors the performance and health of the Elasticsearch cluster.

It uses Cluster APIs to collect metrics.

Used endpoints:

| Endpoint | Description | API |
|---|---|---|
| / | Node info | |
| /_nodes/stats | Nodes metrics | Nodes stats API |
| /_nodes/_local/stats | Local node metrics | Nodes stats API |
| /_cluster/health | Cluster health stats | Cluster health API |
| /_cluster/stats | Cluster metrics | Cluster stats API |

This collector is supported on all platforms.

This collector supports collecting metrics from multiple instances of this integration, including remote instances.

Default Behavior

Auto-Detection

By default, it detects instances running on localhost by attempting to connect to port 9200:

  • http://127.0.0.1:9200
  • https://127.0.0.1:9200

Limits

By default, this collector monitors only the node it is connected to. To monitor all cluster nodes, set the cluster_mode configuration option to yes.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/elasticsearch.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/elasticsearch.conf

Options

The following options can be defined globally: update_every, autodetection_retry.

| Name | Description | Default | Required |
|---|---|---|---|
| update_every | Data collection frequency in seconds. | 5 | no |
| autodetection_retry | Recheck interval in seconds. Zero means no recheck will be scheduled. | 0 | no |
| url | Server URL. | http://127.0.0.1:9200 | yes |
| cluster_mode | Controls whether to collect metrics for all nodes in the cluster or only for the local node. | false | no |
| collect_node_stats | Controls whether to collect nodes metrics. | true | no |
| collect_cluster_health | Controls whether to collect cluster health metrics. | true | no |
| collect_cluster_stats | Controls whether to collect cluster stats metrics. | true | no |
| collect_indices_stats | Controls whether to collect indices metrics. | false | no |
| timeout | HTTP request timeout in seconds. | 2 | no |
| username | Username for basic HTTP authentication. | | no |
| password | Password for basic HTTP authentication. | | no |
| proxy_url | Proxy URL. | | no |
| proxy_username | Username for proxy basic HTTP authentication. | | no |
| proxy_password | Password for proxy basic HTTP authentication. | | no |
| method | HTTP request method. | GET | no |
| body | HTTP request body. | | no |
| headers | HTTP request headers. | | no |
| not_follow_redirects | Redirect handling policy. Controls whether the client follows redirects. | no | no |
| tls_skip_verify | Server certificate chain and hostname validation policy. Controls whether the client performs this check. | no | no |
| tls_ca | Certification authority that the client uses when verifying the server’s certificates. | | no |
| tls_cert | Client TLS certificate. | | no |
| tls_key | Client TLS key. | | no |
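
As a rough sketch of how the global and per-job options listed above might be combined in go.d/elasticsearch.conf: the global values act as defaults, and a job can set its own. The numbers here are arbitrary illustrations, not recommendations.

```yaml
# Global defaults for every job in this file.
update_every: 10          # illustrative value: collect every 10 seconds
autodetection_retry: 60   # illustrative value: recheck a failed job every 60 seconds

jobs:
  - name: local
    url: http://127.0.0.1:9200
    timeout: 5            # illustrative value: allow up to 5 seconds per HTTP request
    update_every: 5       # per-job value, used instead of the global one
```

Raising timeout may help on busy nodes where the stats endpoints respond slowly.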

Examples

Basic single node mode

A basic example configuration.

jobs:
  - name: local
    url: http://127.0.0.1:9200

Cluster mode

Cluster mode example configuration.

jobs:
  - name: local
    url: http://127.0.0.1:9200
    cluster_mode: yes
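
Index metrics

According to the options table, collect_indices_stats defaults to false, so per-index metrics should not be collected unless a job enables them. A minimal sketch of a job that turns them on:

```yaml
jobs:
  - name: local
    url: http://127.0.0.1:9200
    collect_indices_stats: yes   # off by default; enables the per-index metrics
```

On clusters with many indices this is likely to increase the number of charts, so it may be worth enabling on one job first.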

HTTP authentication

Basic HTTP authentication.

jobs:
  - name: local
    url: http://127.0.0.1:9200
    username: username
    password: password

HTTPS with self-signed certificate

Elasticsearch with HTTPS enabled and a self-signed certificate.

jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_skip_verify: yes
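
HTTPS with a custom CA

If the server certificate is signed by a private CA, an alternative to skipping verification is to point the client at that CA with tls_ca. The tls_cert and tls_key options supply a client certificate when the server requests one. This is a sketch only; the file paths are hypothetical placeholders.

```yaml
jobs:
  - name: local
    url: https://127.0.0.1:9200
    tls_ca: /path/to/ca.pem            # hypothetical path to the CA certificate
    tls_cert: /path/to/client.pem      # hypothetical path; only for client-certificate authentication
    tls_key: /path/to/client-key.pem   # hypothetical path; only for client-certificate authentication
```

Unlike tls_skip_verify, this keeps certificate chain and hostname validation enabled.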

Multi-instance

Note: When you define multiple jobs, their names must be unique.

Collecting metrics from local and remote instances.

jobs:
  - name: local
    url: http://127.0.0.1:9200

  - name: remote
    url: http://192.0.2.1:9200

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per node

These metrics refer to the cluster node.

Labels:

| Label | Description |
|---|---|
| cluster_name | Name of the cluster. Based on the Cluster name setting. |
| node_name | Human-readable identifier for the node. Based on the Node name setting. |
| host | Network host for the node. Based on the Network host setting. |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| elasticsearch.node_indices_indexing | index | operations/s |
| elasticsearch.node_indices_indexing_current | index | operations |
| elasticsearch.node_indices_indexing_time | index | milliseconds |
| elasticsearch.node_indices_search | queries, fetches | operations/s |
| elasticsearch.node_indices_search_current | queries, fetches | operations |
| elasticsearch.node_indices_search_time | queries, fetches | milliseconds |
| elasticsearch.node_indices_refresh | refresh | operations/s |
| elasticsearch.node_indices_refresh_time | refresh | milliseconds |
| elasticsearch.node_indices_flush | flush | operations/s |
| elasticsearch.node_indices_flush_time | flush | milliseconds |
| elasticsearch.node_indices_fielddata_memory_usage | used | bytes |
| elasticsearch.node_indices_fielddata_evictions | evictions | operations/s |
| elasticsearch.node_indices_segments_count | segments | segments |
| elasticsearch.node_indices_segments_memory_usage_total | used | bytes |
| elasticsearch.node_indices_segments_memory_usage | terms, stored_fields, term_vectors, norms, points, doc_values, index_writer, version_map, fixed_bit_set | bytes |
| elasticsearch.node_indices_translog_operations | total, uncommitted | operations |
| elasticsearch.node_indices_translog_size | total, uncommitted | bytes |
| elasticsearch.node_file_descriptors | open | fd |
| elasticsearch.node_jvm_heap | inuse | percentage |
| elasticsearch.node_jvm_heap_bytes | committed, used | bytes |
| elasticsearch.node_jvm_buffer_pools_count | direct, mapped | pools |
| elasticsearch.node_jvm_buffer_pool_direct_memory | total, used | bytes |
| elasticsearch.node_jvm_buffer_pool_mapped_memory | total, used | bytes |
| elasticsearch.node_jvm_gc_count | young, old | gc/s |
| elasticsearch.node_jvm_gc_time | young, old | milliseconds |
| elasticsearch.node_thread_pool_queued | generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management | threads |
| elasticsearch.node_thread_pool_rejected | generic, search, search_throttled, get, analyze, write, snapshot, warmer, refresh, listener, fetch_shard_started, fetch_shard_store, flush, force_merge, management | threads |
| elasticsearch.node_cluster_communication_packets | received, sent | pps |
| elasticsearch.node_cluster_communication_traffic | received, sent | bytes/s |
| elasticsearch.node_http_connections | open | connections |
| elasticsearch.node_breakers_trips | requests, fielddata, in_flight_requests, model_inference, accounting, parent | trips/s |

Per cluster

These metrics refer to the cluster.

Labels:

| Label | Description |
|---|---|
| cluster_name | Name of the cluster. Based on the Cluster name setting. |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| elasticsearch.cluster_health_status | green, yellow, red | status |
| elasticsearch.cluster_number_of_nodes | nodes, data_nodes | nodes |
| elasticsearch.cluster_shards_count | active_primary, active, relocating, initializing, unassigned, delayed_unaasigned | shards |
| elasticsearch.cluster_pending_tasks | pending | tasks |
| elasticsearch.cluster_number_of_in_flight_fetch | in_flight_fetch | fetches |
| elasticsearch.cluster_indices_count | indices | indices |
| elasticsearch.cluster_indices_shards_count | total, primaries, replication | shards |
| elasticsearch.cluster_indices_docs_count | docs | docs |
| elasticsearch.cluster_indices_store_size | size | bytes |
| elasticsearch.cluster_indices_query_cache | hit, miss | events/s |
| elasticsearch.cluster_nodes_by_role_count | coordinating_only, data, data_cold, data_content, data_frozen, data_hot, data_warm, ingest, master, ml, remote_cluster_client, voting_only | nodes |

Per index

These metrics refer to the index.

Labels:

| Label | Description |
|---|---|
| cluster_name | Name of the cluster. Based on the Cluster name setting. |
| index | Name of the index. |

Metrics:

| Metric | Dimensions | Unit |
|---|---|---|
| elasticsearch.node_index_health | green, yellow, red | status |
| elasticsearch.node_index_shards_count | shards | shards |
| elasticsearch.node_index_docs_count | docs | docs |
| elasticsearch.node_index_store_size | store_size | bytes |

Alerts

The following alerts are available:

| Alert name | On metric | Description |
|---|---|---|
| elasticsearch_node_indices_search_time_query | elasticsearch.node_indices_search_time | search performance is degraded, queries run slowly. |
| elasticsearch_node_indices_search_time_fetch | elasticsearch.node_indices_search_time | search performance is degraded, fetches run slowly. |
| elasticsearch_cluster_health_status_red | elasticsearch.cluster_health_status | cluster health status is red. |
| elasticsearch_cluster_health_status_yellow | elasticsearch.cluster_health_status | cluster health status is yellow. |
| elasticsearch_node_index_health_red | elasticsearch.node_index_health | node index $label:index health status is red. |

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the elasticsearch collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    sudo -u netdata -s
    
  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m elasticsearch
    

Getting Logs

If you’re encountering problems with the elasticsearch collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep elasticsearch

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector’s name:

grep elasticsearch /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace if different), use this command:

docker logs netdata 2>&1 | grep elasticsearch
