Cluster Agent Setup

To set up the Datadog Cluster Agent on your Kubernetes cluster, follow these steps:

To enable the Cluster Agent collection with Helm, update your datadog-values.yaml file with the following Cluster Agent configuration, then upgrade your Datadog Helm chart:

clusterAgent:
  # clusterAgent.enabled -- Set this to false to disable Datadog Cluster Agent
  enabled: true

This automatically updates the necessary RBAC files for the Cluster Agent and Datadog Agent. Both Agents use the same API key.

This also automatically generates a random token in a Secret shared by the Cluster Agent and the Datadog Agent. To set the token manually, either specify it in the clusterAgent.token configuration or reference an existing Secret that contains a token value through the clusterAgent.tokenExistingSecret configuration.

When set manually, this token must be 32 alphanumeric characters.
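
As a minimal sketch of producing a token that satisfies the 32-character rule, the following shell snippet draws one from /dev/urandom and checks it. The function names generate_token and is_valid_token are illustrative assumptions and are not part of the Helm chart or any Datadog tooling.

```shell
# Generate 32 alphanumeric characters from the kernel's random source.
# generate_token is a hypothetical helper name, not a Datadog command.
generate_token() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 32
}

# Return success only if the argument is exactly 32 alphanumeric characters.
is_valid_token() {
  case "$1" in
    *[!A-Za-z0-9]*) return 1 ;;  # contains a non-alphanumeric character
  esac
  [ "${#1}" -eq 32 ]
}

token=$(generate_token)
if is_valid_token "$token"; then
  echo "token is valid"
fi
```

The resulting value could then be supplied through the clusterAgent.token setting, for example in datadog-values.yaml or with --set clusterAgent.token="$token" on your helm upgrade command.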

To set up the Cluster Agent manually with Kubernetes manifests instead of Helm:

  1. Set up the Datadog Cluster Agent.
  2. Configure your Agent to communicate with the Datadog Cluster Agent.

Configure the Datadog Cluster Agent

Configure RBAC permissions

The Datadog Cluster Agent requires the appropriate RBAC permissions to run:

  1. Review the manifests in the Datadog Cluster Agent RBAC folder. Note that when using the Cluster Agent, your node Agents are not able to interact with the Kubernetes API server—only the Cluster Agent is able to do so.

  2. To configure Cluster Agent RBAC permissions, apply the following manifests. (You may have done this already when setting up the node Agent DaemonSet.)

kubectl apply -f "https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/cluster-agent/rbac.yaml"
kubectl apply -f "https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/cluster-agent/cluster-agent-rbac.yaml"

This creates the appropriate ServiceAccount, ClusterRole, and ClusterRoleBinding for the Cluster Agent and updates the ClusterRole for the node Agent.
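
One way to spot-check the result is to ask the API server whether the Cluster Agent's ServiceAccount is allowed to perform a given action. The sketch below wraps kubectl auth can-i with impersonation. It assumes the ServiceAccount is named datadog-cluster-agent (check the manifests you applied) and that your own user is permitted to impersonate ServiceAccounts. The function name cluster_agent_can_i is hypothetical.

```shell
# Ask the API server whether the Cluster Agent ServiceAccount may do something.
# Usage: cluster_agent_can_i <namespace> <verb> <resource>
# Assumption: the ServiceAccount created by the manifests is named
# datadog-cluster-agent; adjust the name if yours differs.
cluster_agent_can_i() {
  kubectl auth can-i "$2" "$3" \
    --as="system:serviceaccount:$1:datadog-cluster-agent"
}

# Example (requires access to a cluster):
#   cluster_agent_can_i default list events
```

kubectl auth can-i prints yes or no and sets its exit status accordingly, so the function can also be used in a script condition.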

If you are using Azure Kubernetes Service (AKS), you may require extra permissions. See the RBAC for DCA on AKS FAQ.

Secure Cluster Agent to Agent communication

The Datadog Agent and Cluster Agent require a token to secure their communication. It is recommended that you save this token in a Secret that both the Datadog Agent and Cluster Agent can reference in the environment variable DD_CLUSTER_AGENT_AUTH_TOKEN. This helps to maintain consistency and to avoid the token being readable in the PodSpec.

To create this token, run the following one-line command, which generates a Secret named datadog-cluster-agent with the token set. Replace <TOKEN> with 32 alphanumeric characters.

kubectl create secret generic datadog-cluster-agent --from-literal=token='<TOKEN>' --namespace="default"

Note: This creates a Secret in the default namespace. If you are in a custom namespace, update the namespace parameter of the command before running it.
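
To confirm that the stored token has the expected length, you could read it back from the Secret and count its characters. This is a sketch only: the function name secret_token_length is hypothetical, and it assumes the Secret name datadog-cluster-agent and key token used in the command above.

```shell
# Print the length of the token stored in the datadog-cluster-agent Secret.
# Usage: secret_token_length [namespace]   (namespace defaults to "default")
secret_token_length() {
  kubectl get secret datadog-cluster-agent \
    --namespace="${1:-default}" \
    -o jsonpath='{.data.token}' | base64 --decode | wc -c | tr -d ' '
}

# Example (requires access to a cluster):
#   [ "$(secret_token_length default)" -eq 32 ] && echo "token length ok"
```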

The default cluster-agent-deployment.yaml provided for the Cluster Agent is already configured to refer to this Secret with the environment variable configuration:

- name: DD_CLUSTER_AGENT_AUTH_TOKEN
  valueFrom:
    secretKeyRef:
      name: datadog-cluster-agent
      key: token

This environment variable must also be set, using the same configuration, when you configure the Datadog Agent.

Create the Cluster Agent and its service

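  1. Download the following manifests: agent-services.yaml, secret-api-key.yaml, secret-application-key.yaml, install_info-configmap.yaml, and cluster-agent-deployment.yaml.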

  2. In the secret-api-key.yaml manifest, replace PUT_YOUR_BASE64_ENCODED_API_KEY_HERE with your Datadog API key encoded in base64. To get the base64 version of your API key, you can run:

    echo -n '<Your API key>' | base64
    
  3. In the secret-application-key.yaml manifest, replace PUT_YOUR_BASE64_ENCODED_APP_KEY_HERE with your Datadog Application key encoded in base64.

  4. The cluster-agent-deployment.yaml manifest refers by default to the token created previously in the Secret datadog-cluster-agent. If you are storing this token in an alternative way, configure your DD_CLUSTER_AGENT_AUTH_TOKEN environment variable accordingly.

  5. Deploy these resources for the Cluster Agent Deployment to use:

    kubectl apply -f agent-services.yaml
    kubectl apply -f secret-api-key.yaml
    kubectl apply -f secret-application-key.yaml
    kubectl apply -f install_info-configmap.yaml
    
  6. Finally, deploy the Datadog Cluster Agent:

    kubectl apply -f cluster-agent-deployment.yaml
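
The placeholder substitution in steps 2 and 3 can be scripted rather than done by hand. The following is a rough sketch, assuming the manifest contains the placeholder string exactly as written above. The function names encode_key and fill_placeholder are illustrative, not part of any Datadog tooling.

```shell
# Base64-encode a key. printf avoids the trailing newline that a plain
# echo would add to the input, and tr removes any line wrapping in the output.
encode_key() {
  printf '%s' "$1" | base64 | tr -d '\n'
}

# Print a manifest with the placeholder replaced by the encoded key.
# Usage: fill_placeholder <manifest> <placeholder> <raw key>
# The | delimiter is used because base64 output can contain / characters.
fill_placeholder() {
  encoded_key=$(encode_key "$3")
  sed "s|$2|$encoded_key|" "$1"
}
```

For example, with your API key in a shell variable (DD_API_KEY here is a hypothetical name), the filled manifest could be piped straight to the cluster: fill_placeholder secret-api-key.yaml PUT_YOUR_BASE64_ENCODED_API_KEY_HERE "$DD_API_KEY" | kubectl apply -f -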
    

Note: In your Datadog Cluster Agent, set the environment variable DD_SITE to your Datadog site. It defaults to the US site, datadoghq.com.

Verification

At this point, you should see:

$ kubectl get deploy

NAME                    DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
datadog-cluster-agent   1         1         1            1           1d

$ kubectl get secret

NAME                    TYPE                                  DATA      AGE
datadog-cluster-agent   Opaque                                1         1d

$ kubectl get pods -l app=datadog-cluster-agent

datadog-cluster-agent-8568545574-x9tc9   1/1       Running   0          2h

$ kubectl get service -l app=datadog-cluster-agent

NAME                    TYPE           CLUSTER-IP       EXTERNAL-IP        PORT(S)          AGE
datadog-cluster-agent   ClusterIP      10.100.202.234   <none>             5005/TCP         1d

Note: If you already have the Datadog Agent running, you may need to apply the Agent’s rbac.yaml manifest before the Cluster Agent can start running.
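
If you are scripting the deployment, you may prefer to wait for the rollout instead of polling the commands above by hand. A minimal sketch, assuming the Deployment name datadog-cluster-agent shown in the output above; the function name wait_for_cluster_agent and the 120s default timeout are arbitrary choices:

```shell
# Block until the Cluster Agent Deployment finishes rolling out, or time out.
# Usage: wait_for_cluster_agent [namespace] [timeout]
wait_for_cluster_agent() {
  kubectl rollout status deployment/datadog-cluster-agent \
    --namespace="${1:-default}" --timeout="${2:-120s}"
}

# Example (requires access to a cluster):
#   wait_for_cluster_agent default 300s
```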

Configure the Datadog Agent

After setting up the Datadog Cluster Agent, modify your Datadog Agent configuration to communicate with it. You can reference the provided daemonset.yaml manifest for a full example.

In your existing DaemonSet manifest file, set the environment variable DD_CLUSTER_AGENT_ENABLED to true. Then, set DD_CLUSTER_AGENT_AUTH_TOKEN using the same syntax shown in Secure Cluster Agent to Agent communication.

- name: DD_CLUSTER_AGENT_ENABLED
  value: "true"
- name: DD_CLUSTER_AGENT_AUTH_TOKEN
  valueFrom:
    secretKeyRef:
      name: datadog-cluster-agent
      key: token

After you redeploy your DaemonSet with these configurations in place, the Datadog Agent is able to communicate with the Cluster Agent.

Verification

You can verify your Datadog Agent pods and Cluster Agent pods are running by executing the command:

kubectl get pods | grep agent

You should see:

datadog-agent-4k9cd                      1/1       Running   0          2h
datadog-agent-4v884                      1/1       Running   0          2h
datadog-agent-9d5bl                      1/1       Running   0          2h
datadog-agent-dtlkg                      1/1       Running   0          2h
datadog-agent-jllww                      1/1       Running   0          2h
datadog-agent-rdgwz                      1/1       Running   0          2h
datadog-agent-x5wk5                      1/1       Running   0          2h
[...]
datadog-cluster-agent-8568545574-x9tc9   1/1       Running   0          2h

You can additionally verify the Datadog Agent has successfully connected to the Cluster Agent with the Agent status output.

kubectl exec -it <AGENT_POD_NAME> -- agent status
[...]
=====================
Datadog Cluster Agent
=====================

  - Datadog Cluster Agent endpoint detected: https://10.104.246.194:5005
  Successfully connected to the Datadog Cluster Agent.
  - Running: 1.11.0+commit.4eadd95

Kubernetes events are beginning to flow into your Datadog account, and relevant metrics collected by your Agents are tagged with their corresponding cluster level metadata.
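
The connection check can also be automated by searching the status output for the success message shown above. This is a sketch under the assumption that the message text matches the sample output; the function name agent_connected is hypothetical.

```shell
# Succeed only if the node Agent reports a working Cluster Agent connection.
# Usage: agent_connected <AGENT_POD_NAME> [namespace]
agent_connected() {
  kubectl exec "$1" --namespace="${2:-default}" -- agent status |
    grep -q "Successfully connected to the Datadog Cluster Agent"
}

# Example (requires access to a cluster):
#   agent_connected datadog-agent-4k9cd && echo "connected"
```

Because the function relies on the exact wording of the status message, it may need adjusting if a different Agent version phrases it differently.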

Monitoring AWS managed services

To monitor an AWS managed service such as MSK, ElastiCache, or RDS, enable clusterChecksRunner in the Helm chart so that it creates a pod with an IAM role assigned through clusterChecksRunner.rbac.serviceAccountAnnotations. Then, set the integration configurations under clusterAgent.confd.

clusterChecksRunner:
  enabled: true
  rbac:
    # clusterChecksRunner.rbac.create -- If true, create & use RBAC resources
    create: true
    dedicated: true
    serviceAccountAnnotations:
      eks.amazonaws.com/role-arn: arn:aws:iam::***************:role/ROLE-NAME-WITH-MSK-READONLY-POLICY
clusterAgent:
  confd:
    amazon_msk.yaml: |-
      cluster_check: true
      instances:
        - cluster_arn: arn:aws:kafka:us-west-2:*************:cluster/gen-kafka/*******-8e12-4fde-a5ce-******-3
          region_name: us-west-2
