Kubernetes Cluster State

Plugin: go.d.plugin
Module: k8s_state

Overview

This collector monitors Kubernetes Nodes, Pods and Containers.

This collector is supported on all platforms.

This collector only supports collecting metrics from a single instance of this integration.

Default Behavior

Auto-Detection

This integration doesn’t support auto-detection.

Limits

The default configuration for this integration does not impose any limits on data collection.

Performance Impact

The default configuration for this integration is not expected to impose a significant performance impact on the system.

Setup

Prerequisites

No action required.

Configuration

File

The configuration file name for this integration is go.d/k8s_state.conf.

You can edit the configuration file using the edit-config script from the Netdata config directory.

cd /etc/netdata 2>/dev/null || cd /opt/netdata/etc/netdata
sudo ./edit-config go.d/k8s_state.conf

Options

There are no configuration options.

Examples

There are no configuration examples.

Metrics

Metrics grouped by scope.

The scope defines the instance that the metric belongs to. An instance is uniquely identified by a set of labels.

Per node

These metrics refer to the Node.

Labels:

Label | Description
k8s_cluster_id | Cluster ID. This is equal to the kube-system namespace UID.
k8s_cluster_name | Cluster name. Cluster name discovery only works in GKE.
k8s_node_name | Node name.
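
Since k8s_cluster_id is described as the kube-system namespace UID, it can be cross-checked against the cluster directly. The following is a minimal sketch, assuming kubectl is installed and has read access to the cluster; cluster_id is a hypothetical helper name and is not part of Netdata.

```shell
# Print the UID of the kube-system namespace, which the label table
# describes as the value of k8s_cluster_id.
# cluster_id is a hypothetical helper name, not part of Netdata.
cluster_id() {
  kubectl get namespace kube-system -o jsonpath='{.metadata.uid}'
}

# Usage (needs kubectl access to the cluster):
#   cluster_id
```

The printed value should match the k8s_cluster_id label shown on the collector's charts.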

Metrics:

Metric | Dimensions | Unit
k8s_state.node_allocatable_cpu_requests_utilization | requests | %
k8s_state.node_allocatable_cpu_requests_used | requests | millicpu
k8s_state.node_allocatable_cpu_limits_utilization | limits | %
k8s_state.node_allocatable_cpu_limits_used | limits | millicpu
k8s_state.node_allocatable_mem_requests_utilization | requests | %
k8s_state.node_allocatable_mem_requests_used | requests | bytes
k8s_state.node_allocatable_mem_limits_utilization | limits | %
k8s_state.node_allocatable_mem_limits_used | limits | bytes
k8s_state.node_allocatable_pods_utilization | allocated | %
k8s_state.node_allocatable_pods_usage | available, allocated | pods
k8s_state.node_condition | Ready, DiskPressure, MemoryPressure, NetworkUnavailable, PIDPressure | status
k8s_state.node_schedulability | schedulable, unschedulable | state
k8s_state.node_pods_readiness | ready | %
k8s_state.node_pods_readiness_state | ready, unready | pods
k8s_state.node_pods_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | pods
k8s_state.node_pods_phase | running, failed, succeeded, pending | pods
k8s_state.node_containers | containers, init_containers | containers
k8s_state.node_containers_state | running, waiting, terminated | containers
k8s_state.node_init_containers_state | running, waiting, terminated | containers
k8s_state.node_age | age | seconds
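
The node metrics come in pairs: a *_used chart in absolute units and a *_utilization chart in percent. This page does not spell out the formula, so the sketch below only illustrates the likely relationship, assuming utilization is the used amount expressed as a percentage of the node's allocatable capacity. utilization_pct is a hypothetical helper and the numbers are made up.

```shell
# utilization_pct USED ALLOCATABLE
# Prints USED as a percentage of ALLOCATABLE with two decimals.
# Assumption: this mirrors how the *_utilization charts relate to the
# *_used charts; the formula is not confirmed by this page.
utilization_pct() {
  awk -v used="$1" -v total="$2" 'BEGIN {
    if (total <= 0) { print "n/a"; exit }
    printf "%.2f\n", used / total * 100
  }'
}

# Example with made-up values: 1500 millicpu of requests on a node
# with 4000 millicpu allocatable.
utilization_pct 1500 4000
```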

Per pod

These metrics refer to the Pod.

Labels:

Label | Description
k8s_cluster_id | Cluster ID. This is equal to the kube-system namespace UID.
k8s_cluster_name | Cluster name. Cluster name discovery only works in GKE.
k8s_node_name | Node name.
k8s_namespace | Namespace.
k8s_controller_kind | Controller kind (ReplicaSet, DaemonSet, StatefulSet, Job, etc.).
k8s_controller_name | Controller name.
k8s_pod_name | Pod name.
k8s_qos_class | Pod QoS class (burstable, guaranteed, besteffort).

Metrics:

Metric | Dimensions | Unit
k8s_state.pod_cpu_requests_used | requests | millicpu
k8s_state.pod_cpu_limits_used | limits | millicpu
k8s_state.pod_mem_requests_used | requests | bytes
k8s_state.pod_mem_limits_used | limits | bytes
k8s_state.pod_condition | pod_ready, pod_scheduled, pod_initialized, containers_ready | state
k8s_state.pod_phase | running, failed, succeeded, pending | state
k8s_state.pod_status_reason | Evicted, NodeAffinity, NodeLost, Shutdown, UnexpectedAdmissionError, Other | status
k8s_state.pod_age | age | seconds
k8s_state.pod_containers | containers, init_containers | containers
k8s_state.pod_containers_state | running, waiting, terminated | containers
k8s_state.pod_init_containers_state | running, waiting, terminated | containers

Per container

These metrics refer to the Pod container.

Labels:

Label | Description
k8s_cluster_id | Cluster ID. This is equal to the kube-system namespace UID.
k8s_cluster_name | Cluster name. Cluster name discovery only works in GKE.
k8s_node_name | Node name.
k8s_namespace | Namespace.
k8s_controller_kind | Controller kind (ReplicaSet, DaemonSet, StatefulSet, Job, etc.).
k8s_controller_name | Controller name.
k8s_pod_name | Pod name.
k8s_qos_class | Pod QoS class (burstable, guaranteed, besteffort).
k8s_container_name | Container name.

Metrics:

Metric | Dimensions | Unit
k8s_state.pod_container_readiness_state | ready | state
k8s_state.pod_container_restarts | restarts | restarts
k8s_state.pod_container_state | running, waiting, terminated | state
k8s_state.pod_container_waiting_state_reason | ContainerCreating, CrashLoopBackOff, CreateContainerConfigError, CreateContainerError, ErrImagePull, ImagePullBackOff, InvalidImageName, PodInitializing, Other | state
k8s_state.pod_container_terminated_state_reason | Completed, ContainerCannotRun, DeadlineExceeded, Error, Evicted, OOMKilled, Other | state

Alerts

There are no alerts configured by default for this integration.

Troubleshooting

Debug Mode

Important: Debug mode is not supported for data collection jobs created via the UI using the Dyncfg feature.

To troubleshoot issues with the k8s_state collector, run the go.d.plugin with the debug option enabled. The output should give you clues as to why the collector isn’t working.

  • Navigate to the plugins.d directory, usually at /usr/libexec/netdata/plugins.d/. If that’s not the case on your system, open netdata.conf and look for the plugins setting under [directories].

    cd /usr/libexec/netdata/plugins.d/
    
  • Switch to the netdata user.

    sudo -u netdata -s
    
  • Run the go.d.plugin to debug the collector:

    ./go.d.plugin -d -m k8s_state
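
If the plugins.d directory is not at the usual path, the first step above says to look up the plugins setting under [directories] in netdata.conf. A minimal sketch of that lookup follows. It assumes an INI-style file in which the setting appears as "plugins = ..." (possibly commented out with # when it holds the default value); plugins_setting is a hypothetical helper and the netdata.conf path in the usage line is an assumption.

```shell
# plugins_setting CONF_FILE
# Prints the value of the first "plugins" setting found in the
# [directories] section of CONF_FILE, exactly as written in the file.
# Assumption: commented-out lines ("# plugins = ...") hold the default
# value, so they are matched too.
plugins_setting() {
  awk '
    # Track whether the current section is [directories].
    /^[[:space:]]*\[/ { in_dirs = ($0 ~ /^[[:space:]]*\[directories\]/); next }
    in_dirs && /^[[:space:]]*#?[[:space:]]*plugins[[:space:]]*=/ {
      sub(/^[^=]*=[[:space:]]*/, "")   # drop everything up to the "="
      print
      exit
    }
  ' "$1"
}

# Usage (path is an assumption; adjust to your installation):
#   plugins_setting /etc/netdata/netdata.conf
```

The value may list more than one quoted directory; go.d.plugin lives in the one that ends in plugins.d.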
    

Getting Logs

If you’re encountering problems with the k8s_state collector, follow these steps to retrieve logs and identify potential issues:

  • Run the command specific to your system (systemd, non-systemd, or Docker container).
  • Examine the output for any warnings or error messages that might indicate issues. These messages should provide clues about the root cause of the problem.

System with systemd

Use the following command to view logs generated since the last Netdata service restart:

journalctl _SYSTEMD_INVOCATION_ID="$(systemctl show --value --property=InvocationID netdata)" --namespace=netdata --grep k8s_state

System without systemd

Locate the collector log file, typically at /var/log/netdata/collector.log, and use grep to filter for the collector’s name:

grep k8s_state /var/log/netdata/collector.log

Note: This method shows logs from all restarts. Focus on the latest entries for troubleshooting current issues.
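
Because the log file spans all restarts, it can help to look only at the most recent matches. A minimal sketch, assuming the default log path mentioned above; latest_collector_logs is a hypothetical helper and the default of 50 lines is an arbitrary choice.

```shell
# latest_collector_logs [LOG_FILE] [COUNT]
# Prints the last COUNT lines (default 50) that mention k8s_state.
latest_collector_logs() {
  log_file="${1:-/var/log/netdata/collector.log}"
  count="${2:-50}"
  grep k8s_state "$log_file" | tail -n "$count"
}

# Usage:
#   latest_collector_logs
#   latest_collector_logs /var/log/netdata/collector.log 200
```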

Docker Container

If your Netdata runs in a Docker container named “netdata” (replace the name if yours differs), use this command:

docker logs netdata 2>&1 | grep k8s_state
