Netdata Team

May 28, 2024

How to automate adding nodes to rooms in Netdata?

Practical examples of automating room assignment

How we organize nodes (and the Netdata Agents running on those nodes) across different rooms should reflect our architectural decisions, because a room is a logical container with its own user members and notification rules. When monitoring large infrastructure, we should apply these rules consistently, and one way to achieve this is automation. The Netdata Cloud Terraform Provider lets you automate this by provisioning all the cloud resources and giving you the credentials to spin up the Netdata Agents. In this article we will concentrate on how, in practice, we can organize and assign nodes across different rooms in two scenarios; in each of them I’m using a non-production installation of the Netdata Agents:
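Both scenarios below assume the provider can authenticate against Netdata Cloud. As a minimal sketch (the environment variable name is taken from the provider documentation as I recall it; verify it for your provider version), the API token is exported before running Terraform rather than hardcoded:

```shell
# Assumed variable name — check the netdata/netdata provider docs
# for your version before relying on it.
export NETDATA_CLOUD_AUTH_TOKEN="<your Netdata Cloud API token>"

terraform init    # downloads the netdata/netdata provider
terraform apply
```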

  1. Each Netdata Agent is connected directly to Netdata Cloud. This architecture is most likely to be used across geographically separated single nodes. In this example the Netdata Agents are provisioned through Docker Compose, and each agent has its own room. The Terraform code looks like this:

    terraform {
      required_providers {
        netdata = {
          source = "netdata/netdata"
        }
      }
      required_version = ">= 1.4.0"
    }
    provider "netdata" {}
    resource "netdata_space" "test" {
      name        = "TestingSpace"
      description = "Created by Terraform"
    }
    resource "netdata_room" "room1" {
      space_id    = netdata_space.test.id
      name        = "TestingRoom1"
      description = "Created by Terraform"
    }
    resource "netdata_room" "room2" {
      space_id    = netdata_space.test.id
      name        = "TestingRoom2"
      description = "Created by Terraform"
    }
    resource "terraform_data" "install_agent" {
      provisioner "local-exec" {
        command = "docker-compose up -d"
        environment = {
          NETDATA_CLAIM_TOKEN  = netdata_space.test.claim_token
          NETDATA_CLAIM_ROOMS1 = netdata_room.room1.id
          NETDATA_CLAIM_ROOMS2 = netdata_room.room2.id
        }
      }
      provisioner "local-exec" {
        when    = destroy
        command = "docker-compose down"
      }
    }
    

    and the docker-compose.yaml:

    services:
      netdata:
        image: netdata/netdata:stable
        container_name: netdata1
        restart: unless-stopped
        hostname: "netdata1"
        cap_add:
          - SYS_PTRACE
          - SYS_ADMIN
        security_opt:
          - apparmor:unconfined
        volumes:
          - /etc/passwd:/host/etc/passwd:ro
          - /etc/group:/host/etc/group:ro
          - /etc/localtime:/etc/localtime:ro
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /etc/os-release:/host/etc/os-release:ro
          - /var/log:/host/var/log:ro
          - /var/run/docker.sock:/var/run/docker.sock:ro
        environment:
          - NETDATA_CLAIM_TOKEN=${NETDATA_CLAIM_TOKEN}
          - NETDATA_CLAIM_URL=https://app.netdata.cloud
          - NETDATA_CLAIM_ROOMS=${NETDATA_CLAIM_ROOMS1}
      netdata-child:
        image: netdata/netdata:stable
        container_name: netdata2
        restart: unless-stopped
        hostname: "netdata2"
        cap_add:
          - SYS_PTRACE
          - SYS_ADMIN
        security_opt:
          - apparmor:unconfined
        volumes:
          - /etc/passwd:/host/etc/passwd:ro
          - /etc/group:/host/etc/group:ro
          - /etc/localtime:/etc/localtime:ro
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /etc/os-release:/host/etc/os-release:ro
          - /var/log:/host/var/log:ro
          - /var/run/docker.sock:/var/run/docker.sock:ro
        environment:
          - NETDATA_CLAIM_TOKEN=${NETDATA_CLAIM_TOKEN}
          - NETDATA_CLAIM_URL=https://app.netdata.cloud
          - NETDATA_CLAIM_ROOMS=${NETDATA_CLAIM_ROOMS2}
    

    Each of the Netdata Agents is claimed into its own room; the Room IDs are created right after the new space is created, and the claim token is bound to the space.
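    For nodes where Docker is not an option, the same claim token and room ID can be fed to the Netdata kickstart installer. A hedged sketch (flag names as documented for the kickstart script; verify the download URL for your setup):

    ```shell
    # Claim a bare-metal/VM node into a specific room — same token and
    # room ID as the Terraform outputs above.
    wget -O /tmp/netdata-kickstart.sh https://get.netdata.cloud/kickstart.sh
    sh /tmp/netdata-kickstart.sh \
      --claim-token "$NETDATA_CLAIM_TOKEN" \
      --claim-rooms "$NETDATA_CLAIM_ROOMS1" \
      --claim-url https://app.netdata.cloud
    ```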

  2. In this scenario we use streaming, where the Netdata Child Agents stream their metrics to a Netdata Parent Agent, which in turn is connected to the cloud. This is a much more robust approach, with all the benefits described here. With this approach, by default all Netdata Child Agents associated with a Netdata Parent Agent are connected to the same room; to tell them apart, you can use the following automation:

    terraform {
      required_providers {
        netdata = {
          source = "netdata/netdata"
        }
      }
      required_version = ">= 1.4.0"
    }
    
    provider "netdata" {}
    
    resource "netdata_space" "test" {
      name        = "TestingSpace"
      description = "Created by Terraform"
    }
    
    resource "netdata_room" "room1" {
      space_id    = netdata_space.test.id
      name        = "TestingRoom1"
      description = "Created by Terraform"
    }
    
    resource "netdata_room" "room2" {
      space_id    = netdata_space.test.id
      name        = "TestingRoom2"
      description = "Created by Terraform"
    }
    
    resource "netdata_node_room_member" "room1" {
      room_id  = netdata_room.room1.id
      space_id = netdata_space.test.id
      node_names = [
        "netdata-parent"
      ]
    
      depends_on = [
        terraform_data.install_agent
      ]
    }
    
    resource "netdata_node_room_member" "room2" {
      room_id  = netdata_room.room2.id
      space_id = netdata_space.test.id
      node_names = [
        "netdata-child"
      ]
    
      depends_on = [
        terraform_data.install_agent
      ]
    }
    
    resource "terraform_data" "install_agent" {
      provisioner "local-exec" {
        command = "docker-compose up -d && sleep 5"
        environment = {
          NETDATA_CLAIM_TOKEN = netdata_space.test.claim_token
        }
      }
      provisioner "local-exec" {
        when    = destroy
        command = "docker-compose down"
      }
    }
    

    and the docker-compose.yaml:

    services:
      netdata:
        image: netdata/netdata:stable
        container_name: netdata-parent
        restart: unless-stopped
        hostname: "netdata-parent"
        cap_add:
          - SYS_PTRACE
          - SYS_ADMIN
        security_opt:
          - apparmor:unconfined
        volumes:
          - /etc/passwd:/host/etc/passwd:ro
          - /etc/group:/host/etc/group:ro
          - /etc/localtime:/etc/localtime:ro
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /etc/os-release:/host/etc/os-release:ro
          - /var/log:/host/var/log:ro
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - ./parent-stream.conf:/etc/netdata/stream.conf
        environment:
          - NETDATA_CLAIM_TOKEN=${NETDATA_CLAIM_TOKEN}
          - NETDATA_CLAIM_URL=https://app.netdata.cloud
      netdata-child:
        image: netdata/netdata:stable
        container_name: netdata-child
        restart: unless-stopped
        hostname: "netdata-child"
        cap_add:
          - SYS_PTRACE
          - SYS_ADMIN
        security_opt:
          - apparmor:unconfined
        volumes:
          - /etc/passwd:/host/etc/passwd:ro
          - /etc/group:/host/etc/group:ro
          - /etc/localtime:/etc/localtime:ro
          - /proc:/host/proc:ro
          - /sys:/host/sys:ro
          - /etc/os-release:/host/etc/os-release:ro
          - /var/log:/host/var/log:ro
          - /var/run/docker.sock:/var/run/docker.sock:ro
          - ./child-stream.conf:/etc/netdata/stream.conf
    

    parent-stream.conf:

    [11111111-2222-3333-4444-555555555555]
        enabled = yes
    

    child-stream.conf:

    [stream]
        enabled = yes
        destination = netdata-parent:19999
        api key = 11111111-2222-3333-4444-555555555555
    

    Here we match the node room members by node name, which in this case is the hostname. With the netdata_node_room_member resource you can change room membership for nodes that have already been provisioned to the cloud.
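    To confirm that the child is actually streaming to the parent before the room assignment runs, you can query the parent’s API for its mirrored hosts. This is a diagnostic sketch; it assumes the parent’s dashboard port 19999 is reachable from where you run it:

    ```shell
    # List the hosts the parent mirrors; "netdata-child" should appear
    # once streaming is established.
    curl -s http://localhost:19999/api/v1/info | jq '.mirrored_hosts'
    ```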

As you just saw, with only a few lines of code we can spin up monitoring, and we can go even further by automating user membership and notification integrations.
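As a taste of that next step, the same provider exposes resources for space members and notification channels. The sketch below is only illustrative: the resource and attribute names are assumptions based on the provider registry documentation and should be verified against your provider version, and the e-mail address and webhook URL are placeholders:

```hcl
# Assumed resource/attribute names — verify against the netdata/netdata
# provider documentation for your version.
resource "netdata_space_member" "ops" {
  space_id = netdata_space.test.id
  email    = "ops@example.com" # placeholder
  role     = "admin"
}

resource "netdata_notification_slack_channel" "room1" {
  name        = "slack-room1"
  space_id    = netdata_space.test.id
  rooms_id    = [netdata_room.room1.id]
  webhook_url = "https://hooks.slack.com/services/XXX/YYY/ZZZ" # placeholder
}
```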
