Further Configure the Datadog Agent on Kubernetes

Overview

After you have installed the Datadog Agent in your Kubernetes environment, you may choose additional configuration options.

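Enable Datadog to collect traces (APM), Kubernetes events, NPM data, logs, and processes.

Other capabilities covered here are the Datadog Cluster Agent, the custom metrics server, integrations, the Containers view, and the Orchestrator Explorer.

More configurations include environment variables, DogStatsD, tag mapping, secret files, ignoring containers, the Kubernetes API server timeout, proxy settings, and Autodiscovery.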

Enable APM and tracing

Edit your datadog-agent.yaml to set features.apm.enabled to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    apm:
      enabled: true

After making your changes, apply the new configuration by using the following command:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

In Helm, APM is enabled by default over a Unix Domain Socket (UDS) or a Windows named pipe.

To verify, ensure that datadog.apm.socketEnabled is set to true in your datadog-values.yaml.

datadog:
  apm:
    socketEnabled: true

For more information, see Kubernetes Trace Collection.

Enable Kubernetes event collection

Use the Datadog Cluster Agent to collect Kubernetes events.

The Datadog Operator enables event collection by default. You can manage it with the features.eventCollection.collectKubernetesEvents setting in your datadog-agent.yaml.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
    site: <DATADOG_SITE>

  features:
    eventCollection:
      collectKubernetesEvents: true

To collect Kubernetes events with the Datadog Cluster Agent, ensure that the clusterAgent.enabled, datadog.collectEvents, and clusterAgent.rbac.create options are set to true in your datadog-values.yaml file.

datadog:
  collectEvents: true
clusterAgent:
  enabled: true
  rbac: 
    create: true

If you don’t want to use the Cluster Agent, you can still have a Node Agent collect Kubernetes events by setting the datadog.leaderElection, datadog.collectEvents, and agents.rbac.create options to true in your datadog-values.yaml file.

datadog:
  leaderElection: true
  collectEvents: true
agents:
  rbac:
    create: true

For DaemonSet configuration, see DaemonSet Cluster Agent event collection.

Enable NPM collection

In your datadog-agent.yaml, set features.npm.enabled to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    npm:
      enabled: true

Then apply the new configuration:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Update your datadog-values.yaml with the following configuration:

datadog:
  # (...)
  networkMonitoring:
    enabled: true

Then upgrade your Helm chart:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

For more information, see Network Performance Monitoring.

Enable log collection

In your datadog-agent.yaml, set features.logCollection.enabled and features.logCollection.containerCollectAll to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    logCollection:
      enabled: true
      containerCollectAll: true

Then apply the new configuration:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Update your datadog-values.yaml with the following configuration:

datadog:
  # (...)
  logs:
    enabled: true
    containerCollectAll: true

Then upgrade your Helm chart:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

For more information, see Kubernetes log collection.

Enable process collection

In your datadog-agent.yaml, set features.liveProcessCollection.enabled to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    liveProcessCollection:
      enabled: true

Then apply the new configuration:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Update your datadog-values.yaml with the following configuration:

datadog:
  # (...)
  processAgent:
    enabled: true
    processCollection: true

Then upgrade your Helm chart:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

For more information, see Live Processes.

Datadog Cluster Agent

The Datadog Cluster Agent provides a streamlined, centralized approach to collecting cluster-level monitoring data. Datadog strongly recommends using the Cluster Agent for monitoring Kubernetes.

The Datadog Operator v1.0.0+ and Helm chart v2.7.0+ enable the Cluster Agent by default. No further configuration is necessary.

The Datadog Operator v1.0.0+ enables the Cluster Agent by default. The Operator creates the necessary RBACs and deploys the Cluster Agent. Both Agents use the same API key.

The Operator automatically generates a random token in a Kubernetes Secret to be shared by the Datadog Agent and Cluster Agent for secure communication.

You can manually specify this token in the global.clusterAgentToken field in your datadog-agent.yaml:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
    clusterAgentToken: <DATADOG_CLUSTER_AGENT_TOKEN>

Alternatively, you can specify this token by referencing the name of an existing Secret and the data key containing this token:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
    clusterAgentTokenSecret:
      secretName: <SECRET_NAME>
      keyName: <KEY_NAME>

Note: When set manually, this token must be 32 alphanumeric characters.
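
The Secret referenced by clusterAgentTokenSecret could look like the following minimal sketch. It reuses the <SECRET_NAME> and <KEY_NAME> placeholders from the example above; creating it in the same namespace as the DatadogAgent resource (for example with kubectl apply -n $DD_NAMESPACE) is an assumption, and the token value stays a placeholder.

```yaml
# Hypothetical Secret holding the Cluster Agent token.
apiVersion: v1
kind: Secret
metadata:
  name: <SECRET_NAME>
type: Opaque
stringData:
  # When set manually, the token must be 32 alphanumeric characters.
  <KEY_NAME>: <DATADOG_CLUSTER_AGENT_TOKEN>
```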

Then apply the new configuration:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Helm chart v2.7.0+ enables the Cluster Agent by default.

For verification, ensure that clusterAgent.enabled is set to true in your datadog-values.yaml:

clusterAgent:
  enabled: true

Helm automatically generates a random token in a Kubernetes Secret to be shared by the Datadog Agent and Cluster Agent for secure communication.

You can manually specify this token in the clusterAgent.token field in your datadog-values.yaml:

clusterAgent:
  enabled: true
  token: <DATADOG_CLUSTER_AGENT_TOKEN>

Alternatively, you can specify this token by referencing the name of an existing Secret, where the token is in a key named token:

clusterAgent:
  enabled: true
  tokenExistingSecret: <SECRET_NAME>
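
For the Helm tokenExistingSecret option, the existing Secret might be shaped like this sketch. The only requirement taken from the text above is the data key named token; creating it in the release namespace is an assumption.

```yaml
# Hypothetical Secret for clusterAgent.tokenExistingSecret.
apiVersion: v1
kind: Secret
metadata:
  name: <SECRET_NAME>
type: Opaque
stringData:
  # The chart reads the token from the key named "token".
  token: <DATADOG_CLUSTER_AGENT_TOKEN>
```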

For more information, see the Datadog Cluster Agent documentation.

Custom metrics server

To use the Cluster Agent’s custom metrics server feature, you must supply a Datadog application key and enable the metrics provider.

In datadog-agent.yaml, supply an application key under spec.global.credentials.appKey and set features.externalMetricsServer.enabled to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>

  features:
    externalMetricsServer:
      enabled: true

Then apply the new configuration:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

In datadog-values.yaml, supply an application key under datadog.appKey and set clusterAgent.metricsProvider.enabled to true.

datadog:
  apiKey: <DATADOG_API_KEY>
  appKey: <DATADOG_APP_KEY>

clusterAgent:
  enabled: true
  metricsProvider:
    enabled: true

Then upgrade your Helm chart:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

Integrations

Once the Agent is up and running in your cluster, use Datadog’s Autodiscovery feature to collect metrics and logs automatically from your pods.

Containers view

To make use of Datadog’s Container Explorer, enable the Process Agent. The Datadog Operator and Helm chart enable the Process Agent by default. No further configuration is necessary.

The Datadog Operator enables the Process Agent by default.

For verification, ensure that features.liveContainerCollection.enabled is set to true in your datadog-agent.yaml:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
  features:
    liveContainerCollection:
      enabled: true

The Helm chart enables the Process Agent by default.

For verification, ensure that datadog.processAgent.enabled is set to true in your datadog-values.yaml:

datadog:
  # (...)
  processAgent:
    enabled: true

In some setups, the Process Agent and Cluster Agent cannot automatically detect a Kubernetes cluster name. If this happens, the feature does not start, and the following warning appears in the Cluster Agent log: "Orchestrator explorer enabled but no cluster name set: disabling". In this case, set datadog.clusterName to your cluster name in datadog-values.yaml.

datadog:
  #(...)
  clusterName: <YOUR_CLUSTER_NAME>
  #(...)
  processAgent:
    enabled: true

See the Containers view documentation for additional information.

Orchestrator Explorer

The Datadog Operator and Helm chart enable Datadog’s Orchestrator Explorer by default. No further configuration is necessary.

The Orchestrator Explorer is enabled in the Datadog Operator by default.

For verification, ensure that the features.orchestratorExplorer.enabled parameter is set to true in your datadog-agent.yaml:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
      appKey: <DATADOG_APP_KEY>
  features:
    orchestratorExplorer:
      enabled: true

The Helm chart enables Orchestrator Explorer by default.

For verification, ensure that the datadog.orchestratorExplorer.enabled parameter is set to true in your datadog-values.yaml file:

datadog:
  # (...)
  processAgent:
    enabled: true
  orchestratorExplorer:
    enabled: true

See the Orchestrator Explorer documentation for additional information.

Environment variables

Use the following parameters and environment variables to configure the Datadog Agent.

Parameter (v2alpha1) | Description
global.credentials.apiKey | Configures your Datadog API key.
global.credentials.apiSecret.secretName | Instead of global.credentials.apiKey, supply the name of a Kubernetes Secret containing your Datadog API key.
global.credentials.apiSecret.keyName | Instead of global.credentials.apiKey, supply the key of the Kubernetes Secret named in global.credentials.apiSecret.secretName.
global.credentials.appKey | Configures your Datadog application key. If you are using the external metrics server, you must set a Datadog application key for read access to your metrics.
global.credentials.appSecret.secretName | Instead of global.credentials.appKey, supply the name of a Kubernetes Secret containing your Datadog app key.
global.credentials.appSecret.keyName | Instead of global.credentials.appKey, supply the key of the Kubernetes Secret named in global.credentials.appSecret.secretName.
global.logLevel | Sets logging verbosity. This can be overridden by the container. Valid log levels are: trace, debug, info, warn, error, critical, and off. Default: info.
global.registry | Image registry to use for all Agent images. Default: gcr.io/datadoghq.
global.site | Sets the Datadog intake site to which Agent data is sent.
global.tags | A list of tags to attach to every metric, event, and service check collected.

For a complete list of environment variables for the Datadog Operator, see the Operator v2alpha1 spec. For older versions, see the Operator v1alpha1 spec.
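
As an illustration of how the Operator parameters in the table combine, the sketch below reads both keys from a Kubernetes Secret instead of placing them inline. The Secret name datadog-secret, its api-key and app-key entries, the debug log level, and the team:platform tag are all hypothetical choices, not values required by the Operator.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    site: <DATADOG_SITE>
    logLevel: debug            # default is info
    tags:
      - "team:platform"        # hypothetical tag
    credentials:
      apiSecret:
        secretName: datadog-secret   # hypothetical Secret name
        keyName: api-key             # hypothetical key in that Secret
      appSecret:
        secretName: datadog-secret
        keyName: app-key
```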

Helm | Description
datadog.apiKey | Configures your Datadog API key.
datadog.apiKeyExistingSecret | Instead of datadog.apiKey, supply the name of an existing Kubernetes Secret containing your Datadog API key, set with the key name api-key.
datadog.appKey | Configures your Datadog application key. If you are using the external metrics server, you must set a Datadog application key for read access to your metrics.
datadog.appKeyExistingSecret | Instead of datadog.appKey, supply the name of an existing Kubernetes Secret containing your Datadog app key, set with the key name app-key.
datadog.logLevel | Sets logging verbosity. This can be overridden by the container. Valid log levels are: trace, debug, info, warn, error, critical, and off. Default: info.
registry | Image registry to use for all Agent images. Default: gcr.io/datadoghq.
datadog.site | Sets the Datadog intake site to which Agent data is sent.
datadog.tags | A list of tags to attach to every metric, event, and service check collected.

For a complete list of environment variables for the Helm chart, see the full list of options for datadog-values.yaml.
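
A datadog-values.yaml that combines the Helm options from the table might look like this sketch. The Secret name datadog-secret is hypothetical; per the table, it would need to hold the keys under the names api-key and app-key. The log level and tag are illustrative only.

```yaml
registry: gcr.io/datadoghq     # the documented default
datadog:
  apiKeyExistingSecret: datadog-secret   # hypothetical Secret with key "api-key"
  appKeyExistingSecret: datadog-secret   # same Secret, key "app-key"
  site: <DATADOG_SITE>
  logLevel: debug              # default is info
  tags:
    - "team:platform"          # hypothetical tag
```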

Env Variable | Description
DD_API_KEY | Your Datadog API key (required).
DD_ENV | Sets the global env tag for all data emitted.
DD_HOSTNAME | Hostname to use for metrics (if autodetection fails).
DD_TAGS | Host tags separated by spaces. For example: simple-tag-0 tag-key-1:tag-value-1
DD_SITE | Destination site for your metrics, traces, and logs. Defaults to datadoghq.com.
DD_DD_URL | Optional setting to override the URL for metric submission.
DD_URL (6.36+/7.36+) | Alias for DD_DD_URL. Ignored if DD_DD_URL is already set.
DD_CHECK_RUNNERS | The Agent runs all checks concurrently by default (default value = 4 runners). To run the checks sequentially, set the value to 1. If you need to run a high number of checks (or slow checks), the collector-queue component might fall behind and fail the healthcheck. You can increase the number of runners to run checks in parallel.
DD_LEADER_ELECTION | If multiple instances of the Agent are running in your cluster, set this variable to true to avoid the duplication of event collection.
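
For a DaemonSet-based install, several of these variables could be set on the Agent container as in the sketch below. It is a fragment of a container spec, not a full manifest. The Secret name datadog-secret and its api-key entry are hypothetical, and the other values are illustrative.

```yaml
# Fragment of the Agent container spec in a DaemonSet.
env:
  - name: DD_API_KEY
    valueFrom:
      secretKeyRef:
        name: datadog-secret   # hypothetical Secret name
        key: api-key           # hypothetical key in that Secret
  - name: DD_SITE
    value: "datadoghq.com"
  - name: DD_ENV
    value: "staging"           # illustrative env tag
  - name: DD_TAGS
    value: "simple-tag-0 tag-key-1:tag-value-1"   # space-separated host tags
  - name: DD_CHECK_RUNNERS
    value: "4"
  - name: DD_LEADER_ELECTION
    value: "true"
```

Environment variable values in a Kubernetes container spec are strings, so numbers and booleans are quoted.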

Configure DogStatsD

DogStatsD can send custom metrics over UDP with the StatsD protocol. DogStatsD is enabled by default by the Datadog Operator and Helm. See the DogStatsD documentation for more information.

You can use the following environment variables to configure DogStatsD with a DaemonSet:

Env Variable | Description
DD_DOGSTATSD_NON_LOCAL_TRAFFIC | Listen to DogStatsD packets from other containers (required to send custom metrics).
DD_HISTOGRAM_PERCENTILES | The histogram percentiles to compute (separated by spaces). The default is 0.95.
DD_HISTOGRAM_AGGREGATES | The histogram aggregates to compute (separated by spaces). The default is "max median avg count".
DD_DOGSTATSD_SOCKET | Path to the Unix socket to listen to. Must be in a rw mounted volume.
DD_DOGSTATSD_ORIGIN_DETECTION | Enable container detection and tagging for Unix socket metrics.
DD_DOGSTATSD_TAGS | Additional tags to append to all metrics, events, and service checks received by this DogStatsD server, for example: "env:golden group:retrievers".
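
A minimal sketch of these DogStatsD settings on the Agent container in a DaemonSet follows. The extra 0.99 percentile is an illustrative choice; the tag string reuses the example from the table.

```yaml
# Fragment of the Agent container spec in a DaemonSet.
env:
  - name: DD_DOGSTATSD_NON_LOCAL_TRAFFIC
    value: "true"                          # accept packets from other containers
  - name: DD_HISTOGRAM_PERCENTILES
    value: "0.95 0.99"                     # space-separated; 0.99 is illustrative
  - name: DD_HISTOGRAM_AGGREGATES
    value: "max median avg count"          # the documented default
  - name: DD_DOGSTATSD_TAGS
    value: "env:golden group:retrievers"
```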

Configure tag mapping

Datadog automatically collects common tags from Kubernetes.

In addition, you can map Kubernetes namespace labels, node labels, pod labels, and pod annotations to Datadog tags. Use the following parameters to configure this mapping:

Parameter (v2alpha1) | Description
global.namespaceLabelsAsTags | Provide a mapping of Kubernetes namespace labels to Datadog tags. <KUBERNETES_NAMESPACE_LABEL>: <DATADOG_TAG_KEY>
global.nodeLabelsAsTags | Provide a mapping of Kubernetes node labels to Datadog tags. <KUBERNETES_NODE_LABEL>: <DATADOG_TAG_KEY>
global.podAnnotationsAsTags | Provide a mapping of Kubernetes pod annotations to Datadog tags. <KUBERNETES_ANNOTATION>: <DATADOG_TAG_KEY>
global.podLabelsAsTags | Provide a mapping of Kubernetes pod labels to Datadog tags. <KUBERNETES_LABEL>: <DATADOG_TAG_KEY>

Examples

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
    namespaceLabelsAsTags:
      env: environment
      # <KUBERNETES_NAMESPACE_LABEL>: <DATADOG_TAG_KEY>
    nodeLabelsAsTags:
      beta.kubernetes.io/instance-type: aws-instance-type
      kubernetes.io/role: kube_role
      # <KUBERNETES_NODE_LABEL>: <DATADOG_TAG_KEY>
    podLabelsAsTags:
      app: kube_app
      release: helm_release
      # <KUBERNETES_LABEL>: <DATADOG_TAG_KEY>
    podAnnotationsAsTags:
      iam.amazonaws.com/role: kube_iamrole
      # <KUBERNETES_ANNOTATION>: <DATADOG_TAG_KEY>

Helm | Description
datadog.namespaceLabelsAsTags | Provide a mapping of Kubernetes namespace labels to Datadog tags. <KUBERNETES_NAMESPACE_LABEL>: <DATADOG_TAG_KEY>
datadog.nodeLabelsAsTags | Provide a mapping of Kubernetes node labels to Datadog tags. <KUBERNETES_NODE_LABEL>: <DATADOG_TAG_KEY>
datadog.podAnnotationsAsTags | Provide a mapping of Kubernetes pod annotations to Datadog tags. <KUBERNETES_ANNOTATION>: <DATADOG_TAG_KEY>
datadog.podLabelsAsTags | Provide a mapping of Kubernetes pod labels to Datadog tags. <KUBERNETES_LABEL>: <DATADOG_TAG_KEY>

Examples

datadog:
  # (...)
  namespaceLabelsAsTags:
    env: environment
    # <KUBERNETES_NAMESPACE_LABEL>: <DATADOG_TAG_KEY>
  nodeLabelsAsTags:
    beta.kubernetes.io/instance-type: aws-instance-type
    kubernetes.io/role: kube_role
    # <KUBERNETES_NODE_LABEL>: <DATADOG_TAG_KEY>
  podLabelsAsTags:
    app: kube_app
    release: helm_release
    # <KUBERNETES_LABEL>: <DATADOG_TAG_KEY>
  podAnnotationsAsTags:
    iam.amazonaws.com/role: kube_iamrole
    # <KUBERNETES_ANNOTATION>: <DATADOG_TAG_KEY>

Using secret files

Integration credentials can be stored in Docker or Kubernetes secrets and used in Autodiscovery templates. For more information, see Secrets Management.

Ignore containers

Exclude containers from log collection, metrics collection, and Autodiscovery. Datadog excludes Kubernetes and OpenShift pause containers by default. These allowlists and blocklists apply to Autodiscovery only; traces and DogStatsD are not affected. The environment variables that define them support regular expressions in their values.

See the Container Discovery Management page for examples.

Note: The kubernetes.containers.running, kubernetes.pods.running, docker.containers.running, .stopped, .running.total and .stopped.total metrics are not affected by these settings. All containers are counted.

Kubernetes API server timeout

By default, the Kubernetes State Metrics Core check waits 10 seconds for a response from the Kubernetes API server. For large clusters, the request may time out, resulting in missing metrics.

You can avoid this by setting the environment variable DD_KUBERNETES_APISERVER_CLIENT_TIMEOUT to a higher value than the default 10 seconds.

Update your datadog-agent.yaml with the following configuration:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  override:
    clusterAgent:
      env:
        - name: DD_KUBERNETES_APISERVER_CLIENT_TIMEOUT
          value: <value_greater_than_10>

Then apply the new configuration:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Update your datadog-values.yaml with the following configuration:

clusterAgent:
  env:
    - name: DD_KUBERNETES_APISERVER_CLIENT_TIMEOUT
      value: <value_greater_than_10>

Then upgrade your Helm chart:

helm upgrade -f datadog-values.yaml <RELEASE_NAME> datadog/datadog

Proxy settings

Starting with Agent v6.4.0 (and v6.5.0 for the Trace Agent), you can override the Agent proxy settings with the following environment variables:

Env Variable | Description
DD_PROXY_HTTP | An HTTP URL to use as a proxy for http requests.
DD_PROXY_HTTPS | An HTTPS URL to use as a proxy for https requests.
DD_PROXY_NO_PROXY | A space-separated list of URLs for which no proxy should be used.
DD_SKIP_SSL_VALIDATION | An option to test if the Agent is having issues connecting to Datadog.
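
As a sketch, the proxy variables could be set on the Agent container in a DaemonSet as follows. The angle-bracket values are placeholders for your own proxy endpoints and exclusions, not defaults.

```yaml
# Fragment of the Agent container spec in a DaemonSet.
env:
  - name: DD_PROXY_HTTP
    value: "<HTTP_PROXY_URL>"      # proxy for http requests
  - name: DD_PROXY_HTTPS
    value: "<HTTPS_PROXY_URL>"     # proxy for https requests
  - name: DD_PROXY_NO_PROXY
    value: "<URL_1> <URL_2>"       # space-separated URLs that bypass the proxy
```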

Autodiscovery

Env Variable | Description
DD_LISTENERS | Autodiscovery listeners to run.
DD_EXTRA_LISTENERS | Additional Autodiscovery listeners to run. They are added to the listeners defined in the listeners section of the datadog.yaml configuration file.
DD_CONFIG_PROVIDERS | The providers the Agent should call to collect check configurations. Available providers are: kubelet (handles templates embedded in pod annotations), docker (handles templates embedded in container labels), clusterchecks (retrieves cluster-level check configurations from the Cluster Agent), and kube_services (watches Kubernetes services for cluster checks).
DD_EXTRA_CONFIG_PROVIDERS | Additional Autodiscovery configuration providers to use. They are added to the providers defined in the config_providers section of the datadog.yaml configuration file.
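
As a rough illustration, an Agent that should also retrieve cluster-level check configurations from the Cluster Agent could add the clusterchecks provider from the list above. This is a fragment of the Agent container spec in a DaemonSet; whether your setup needs this provider is an assumption.

```yaml
# Fragment of the Agent container spec in a DaemonSet.
env:
  - name: DD_EXTRA_CONFIG_PROVIDERS
    value: "clusterchecks"   # added on top of the providers in datadog.yaml
```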

Miscellaneous

Env Variable | Description
DD_PROCESS_AGENT_CONTAINER_SOURCE | Overrides container source auto-detection to force a single source, for example "docker", "ecs_fargate", or "kubelet". This is no longer needed since Agent v7.35.0.
DD_HEALTH_PORT | Set this to 5555 to expose the Agent health check at port 5555.
DD_CLUSTER_NAME | Set a custom Kubernetes cluster identifier to avoid host alias collisions. The cluster name can be up to 40 characters with the following restrictions: lowercase letters, numbers, and hyphens only; must start with a letter; must end with a number or a letter.
DD_COLLECT_KUBERNETES_EVENTS | Enable event collection with the Agent. If you are running multiple instances of the Agent in your cluster, set DD_LEADER_ELECTION to true as well.
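
The sketch below sets these variables on the Agent container in a DaemonSet. The cluster name prod-cluster-1 is a hypothetical value chosen to satisfy the stated restrictions (lowercase letters, numbers, and hyphens; starts with a letter; ends with a number; under 40 characters).

```yaml
# Fragment of the Agent container spec in a DaemonSet.
env:
  - name: DD_CLUSTER_NAME
    value: "prod-cluster-1"    # hypothetical cluster name
  - name: DD_HEALTH_PORT
    value: "5555"              # exposes the Agent health check on port 5555
  - name: DD_COLLECT_KUBERNETES_EVENTS
    value: "true"
  - name: DD_LEADER_ELECTION
    value: "true"              # needed when several Agent instances run in the cluster
```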