Kubernetes

Overview

Run the Datadog Agent in your Kubernetes cluster as a DaemonSet to start collecting cluster and application metrics, traces, and logs. You can deploy it with a Helm chart or directly with a DaemonSet object YAML definition.

Note: Agent versions 6.0 and above support only Kubernetes versions higher than 1.7.6. For prior versions of Kubernetes, consult the Legacy Kubernetes versions section.

Installation

Helm

To install the chart with a custom release name, <RELEASE_NAME> (e.g. datadog-agent):

  1. Install Helm.
  2. Using the Datadog values.yaml configuration file as a reference, create your values.yaml. Datadog recommends that your values.yaml only contain values that need to be overridden, as it allows a smooth experience when upgrading chart versions.
  3. If this is a fresh install, add the Helm Datadog repo:
    helm repo add datadog https://helm.datadoghq.com
    helm repo update
    
  4. Retrieve your Datadog API key from your Agent installation instructions and run:
  • Helm v3+

    helm install <RELEASE_NAME> -f values.yaml --set datadog.apiKey=<DATADOG_API_KEY> datadog/datadog --set targetSystem=<TARGET_SYSTEM>
    

    Replace <TARGET_SYSTEM> with the name of your OS: linux or windows.

  • Helm v1/v2

    helm install -f values.yaml --name <RELEASE_NAME> --set datadog.apiKey=<DATADOG_API_KEY> datadog/datadog
    

This chart adds the Datadog Agent to all nodes in your cluster via a DaemonSet. It also optionally deploys the kube-state-metrics chart and uses it as an additional source of metrics about the cluster. A few minutes after installation, Datadog begins to report hosts and metrics.

Next, enable the Datadog features that you’d like to use: APM and Logs.

Notes:

  • For a full list of the Datadog chart’s configurable parameters and their default values, refer to the Datadog Helm repository README.

  • If Google Container Registry (gcr.io/datadoghq) is not accessible in your deployment region, use the Docker Hub registry with the images datadog/agent and datadog/cluster-agent with the following configuration in the values.yaml file:

    agents:
      image:
        repository: datadog/agent
    
    clusterAgent:
      image:
        repository: datadog/cluster-agent
    
    clusterChecksRunner:
      image:
        repository: datadog/agent
    

Upgrading from chart v1.x

The Datadog chart has been refactored in v2.0 to regroup the values.yaml parameters in a more logical way.

If your currently deployed chart version is earlier than v2.0.0, follow the migration guide to map your previous settings to the new fields.

Unprivileged

(Optional) To run an unprivileged installation, add the following in the values.yaml file:

datadog:
  securityContext:
    runAsUser: <USER_ID>
    supplementalGroups:
      - <DOCKER_GROUP_ID>

where <USER_ID> is the UID used to run the Agent and <DOCKER_GROUP_ID> is the ID of the group that owns the Docker or containerd socket.

DaemonSet

Take advantage of DaemonSets to deploy the Datadog Agent on all your nodes (or on specific nodes by using nodeSelectors).
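
As an illustration of restricting the Agent to specific nodes, the fragment below sketches where a nodeSelector sits in a DaemonSet manifest. The label monitoring: datadog is a hypothetical example, not something the Datadog templates define, and the remaining required fields of the manifest are omitted.

```yaml
# Fragment of a DaemonSet manifest; only the fields relevant to node selection are shown.
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: datadog-agent
spec:
  template:
    spec:
      # Hypothetical label: the Agent is scheduled only on nodes labeled monitoring=datadog.
      nodeSelector:
        monitoring: datadog
```

Without a nodeSelector, the DaemonSet schedules one Agent pod on every eligible node.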

To install the Datadog Agent on your Kubernetes cluster:

  1. Configure Agent permissions: If your Kubernetes cluster has role-based access control (RBAC) enabled, configure RBAC permissions for your Datadog Agent service account. From Kubernetes 1.6 onwards, RBAC is enabled by default. Create the appropriate ClusterRole, ServiceAccount, and ClusterRoleBinding with the following commands:

    kubectl apply -f "https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/rbac/clusterrole.yaml"
    
    kubectl apply -f "https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/rbac/serviceaccount.yaml"
    
    kubectl apply -f "https://raw.githubusercontent.com/DataDog/datadog-agent/master/Dockerfiles/manifests/rbac/clusterrolebinding.yaml"
    

    Note: These RBAC configurations target the default namespace. If you are in a custom namespace, update the namespace parameter before applying them.

  2. Create a secret that contains your Datadog API Key. Replace the <DATADOG_API_KEY> below with the API key for your organization. This secret is used in the manifest to deploy the Datadog Agent.

    kubectl create secret generic datadog-agent --from-literal='api-key=<DATADOG_API_KEY>' --namespace="default"
    

    Note: This creates a secret in the default namespace. If you are in a custom namespace, update the namespace parameter of the command before running it.

  3. Create the Datadog Agent manifest. Create the datadog-agent.yaml manifest from one of the following templates:

    Metrics | Logs | APM | Process | NPM | Linux | Windows
    Manifest template | Manifest template
    Manifest template | Manifest template
    Manifest template | Manifest template
    Manifest template | Manifest template
    Manifest template | no template
    Manifest template | Manifest template

    To fully enable trace collection, extra steps are required in your application Pod configuration. Refer also to the logs, APM, processes, and Network Performance Monitoring documentation pages to learn how to enable each feature individually.

    Note: These manifests target the default namespace. If you are in a custom namespace, update the metadata.namespace parameter before applying them.

  4. Set your Datadog site using the DD_SITE environment variable in the datadog-agent.yaml manifest.

    Note: If the DD_SITE environment variable is not explicitly set, it defaults to the US site datadoghq.com. If you are using one of the other sites (EU, US3, or US1-FED), this results in an invalid API key message. Use the documentation site selector to see documentation appropriate for the site you’re using.

  5. Deploy the DaemonSet with the command:

    kubectl apply -f datadog-agent.yaml
    
  6. Verification: To verify the Datadog Agent is running in your environment as a DaemonSet, execute:

    kubectl get daemonset
    

    If the Agent is deployed, you will see output similar to the text below, where DESIRED and CURRENT are equal to the number of nodes running in your cluster.

    NAME            DESIRED   CURRENT   READY     UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
    datadog-agent   2         2         2         2            2           <none>          10s
    
  7. Optional - Set up Kubernetes State metrics: Download the Kube-State manifests folder and apply it to your Kubernetes cluster to automatically collect kube-state metrics:

    kubectl apply -f <NAME_OF_THE_KUBE_STATE_MANIFESTS_FOLDER>
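
For step 4 above, a minimal sketch of setting DD_SITE on the Agent container in datadog-agent.yaml might look like the following. The container name agent is an assumption, and datadoghq.eu (the EU site) is used only as an example value.

```yaml
# Fragment of the pod template in datadog-agent.yaml; other fields are omitted.
containers:
  - name: agent              # assumed container name
    env:
      - name: DD_SITE
        value: "datadoghq.eu"   # EU site; when unset, the Agent defaults to datadoghq.com
```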
    

Unprivileged

(Optional) To run an unprivileged installation, add the following to your pod template:

  spec:
    securityContext:
      runAsUser: <USER_ID>
      supplementalGroups:
        - <DOCKER_GROUP_ID>

where <USER_ID> is the UID used to run the Agent and <DOCKER_GROUP_ID> is the ID of the group that owns the Docker or containerd socket.

Operator

The Datadog Operator is in public beta. If you have any feedback or questions, contact Datadog support.

The Datadog Operator is a way to deploy the Datadog Agent on Kubernetes and OpenShift. It reports deployment status, health, and errors in its Custom Resource status, and it limits the risk of misconfiguration thanks to higher-level configuration options.

Prerequisites

Using the Datadog Operator requires the following prerequisites:

  • Kubernetes Cluster version >= v1.14.X: Tests were run on versions >= 1.14.0. The Operator should still work on versions >= v1.11.0. For earlier versions, because of limited CRD support, the Operator may not work as expected.
  • Helm for deploying the datadog-operator.
  • Kubectl CLI for installing the datadog-agent.

Deploy an Agent with the Operator

To deploy the Datadog Agent with the Operator in the minimum number of steps, see the datadog-operator Helm chart. The steps are:

  1. Install the Datadog Operator:

    helm repo add datadog https://helm.datadoghq.com
    helm install my-datadog-operator datadog/datadog-operator
    
  2. Create a Kubernetes secret with your API and app keys:

    kubectl create secret generic datadog-secret --from-literal api-key=<DATADOG_API_KEY> --from-literal app-key=<DATADOG_APP_KEY>
    

    Replace <DATADOG_API_KEY> and <DATADOG_APP_KEY> with your Datadog API and application keys.

  3. Create a file with the spec of your Datadog Agent deployment configuration. The simplest configuration is:

    apiVersion: datadoghq.com/v1alpha1
    kind: DatadogAgent
    metadata:
      name: datadog
    spec:
      credentials:
        apiSecret:
          secretName: datadog-secret
          keyName: api-key
        appSecret:
          secretName: datadog-secret
          keyName: app-key
      agent:
        image:
          name: "gcr.io/datadoghq/agent:latest"
      clusterAgent:
        image:
          name: "gcr.io/datadoghq/cluster-agent:latest"
    
  4. Deploy the Datadog Agent with the above configuration file:

    kubectl apply -f /path/to/your/datadog-agent.yaml
    

Cleanup

The following commands delete all the Kubernetes resources created by the above instructions:

kubectl delete datadogagent datadog
helm delete my-datadog-operator

For further details on setting up the Operator, including information about using tolerations, refer to the Datadog Operator advanced setup guide.

Unprivileged

(Optional) To run an unprivileged installation, add the following to the Datadog custom resource (CR):

agent:
  config:
    securityContext:
      runAsUser: <USER_ID>
      supplementalGroups:
        - <DOCKER_GROUP_ID>

where <USER_ID> is the UID to run the agent and <DOCKER_GROUP_ID> is the group ID owning the Docker or containerd socket.

Additional configuration

Kubernetes resources for live containers

The Datadog Agent and Cluster Agent can be configured to retrieve Kubernetes resources for Live Containers. This feature allows you to monitor the state of pods, deployments and other Kubernetes concepts in a specific namespace or availability zone, view resource specifications for failed pods within a deployment, correlate node activity with related logs, and more.

See the Live Containers documentation for configuration instructions and additional information.

Event collection

With Helm, set the datadog.leaderElection, datadog.collectEvents, and agents.rbac.create options to true in your values.yaml file to enable Kubernetes event collection.
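
Written out as nested keys, the three Helm options named above would look roughly like this in values.yaml. This is a sketch derived from the dotted option names; check the chart's own values.yaml for the authoritative layout.

```yaml
# values.yaml overrides for Kubernetes event collection
datadog:
  leaderElection: true
  collectEvents: true
agents:
  rbac:
    create: true
```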

With a DaemonSet, set the environment variables DD_COLLECT_KUBERNETES_EVENTS and DD_LEADER_ELECTION to true in your Agent manifest to collect events from your Kubernetes cluster. Alternatively, use the Datadog Cluster Agent event collection.
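
In a DaemonSet manifest, the two variables could be set on the Agent container as sketched below. Only the env list is shown. Kubernetes requires environment variable values to be strings, so the booleans are quoted.

```yaml
# Fragment of the Agent container spec in the DaemonSet manifest
env:
  - name: DD_COLLECT_KUBERNETES_EVENTS
    value: "true"
  - name: DD_LEADER_ELECTION
    value: "true"
```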

With the Operator, set agent.config.collectEvents to true in your datadog-agent.yaml manifest.

For example:

agent:
  config:
    collectEvents: true

Integrations

Once the Agent is up and running in your cluster, use Datadog’s Autodiscovery feature to collect metrics and logs automatically from your pods.

Environment variables

The following environment variables are available for the Datadog Agent. To set them up with Helm, see the full list of configuration options for the datadog-value.yaml file in the helm/charts GitHub repository.

Global options

Env Variable | Description
DD_API_KEY | Your Datadog API key (required).
DD_ENV | Sets the global env tag for all data emitted.
DD_HOSTNAME | Hostname to use for metrics (if autodetection fails).
DD_TAGS | Host tags separated by spaces. For example: simple-tag-0 tag-key-1:tag-value-1
DD_SITE | Destination site for your metrics, traces, and logs. Defaults to datadoghq.com.
DD_DD_URL | Optional setting to override the URL for metric submission.
DD_CHECK_RUNNERS | The Agent runs all checks concurrently by default (default value = 4 runners). To run the checks sequentially, set the value to 1. If you need to run a high number of checks (or slow checks), the collector-queue component might fall behind and fail the healthcheck. You can increase the number of runners to run checks in parallel.
DD_LEADER_ELECTION | If multiple Agents are running in your cluster, set this variable to true to avoid the duplication of event collection.
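
As a sketch of how some of these global options might appear in a DaemonSet manifest, the fragment below reads the API key from the datadog-agent secret created during the DaemonSet installation and sets a few of the other variables. The staging value for DD_ENV is hypothetical.

```yaml
# Fragment of the Agent container spec; only the env list is shown.
env:
  - name: DD_API_KEY
    valueFrom:
      secretKeyRef:
        name: datadog-agent   # secret created with kubectl create secret
        key: api-key
  - name: DD_ENV
    value: "staging"          # hypothetical environment name
  - name: DD_TAGS
    value: "simple-tag-0 tag-key-1:tag-value-1"   # space-separated host tags
  - name: DD_CHECK_RUNNERS
    value: "1"                # run checks sequentially
```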

Proxy settings

Starting with Agent v6.4.0 (and v6.5.0 for the Trace Agent), you can override the Agent proxy settings with the following environment variables:

Env Variable | Description
DD_PROXY_HTTP | An HTTP URL to use as a proxy for http requests.
DD_PROXY_HTTPS | An HTTPS URL to use as a proxy for https requests.
DD_PROXY_NO_PROXY | A space-separated list of URLs for which no proxy should be used.
DD_SKIP_SSL_VALIDATION | An option to test if the Agent is having issues connecting to Datadog.

For more information about proxy settings, see the Agent v6 Proxy documentation.
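
A minimal sketch of the proxy variables on the Agent container follows. The proxy host names and ports are hypothetical placeholders; substitute the addresses of your own proxy.

```yaml
# Fragment of the Agent container spec; proxy addresses are hypothetical.
env:
  - name: DD_PROXY_HTTP
    value: "http://proxy.example.internal:3128"
  - name: DD_PROXY_HTTPS
    value: "https://proxy.example.internal:3129"
```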

Optional collection Agents

Optional collection Agents are disabled by default for security or performance reasons. Use these environment variables to enable them:

Env Variable | Description
DD_APM_ENABLED | Enable trace collection with the Trace Agent.
DD_LOGS_ENABLED | Enable log collection with the Logs Agent.
DD_PROCESS_AGENT_ENABLED | Enable live process collection with the Process Agent. The live container view is already enabled by default if the Docker socket is available. If set to false, the live process collection and the live container view are disabled.
DD_COLLECT_KUBERNETES_EVENTS | Enable event collection with the Agent. If you are running multiple Agents in your cluster, set DD_LEADER_ELECTION to true as well.

To enable the Live Container view, make sure you are running the process agent in addition to setting DD_PROCESS_AGENT_ENABLED to true.

DogStatsD (custom metrics)

Send custom metrics with the StatsD protocol:

Env Variable | Description
DD_DOGSTATSD_NON_LOCAL_TRAFFIC | Listen to DogStatsD packets from other containers (required to send custom metrics).
DD_HISTOGRAM_PERCENTILES | The histogram percentiles to compute (separated by spaces). The default is 0.95.
DD_HISTOGRAM_AGGREGATES | The histogram aggregates to compute (separated by spaces). The default is “max median avg count”.
DD_DOGSTATSD_SOCKET | Path to the Unix socket to listen to. Must be in a rw mounted volume.
DD_DOGSTATSD_ORIGIN_DETECTION | Enable container detection and tagging for Unix socket metrics.
DD_DOGSTATSD_TAGS | Additional tags to append to all metrics, events, and service checks received by this DogStatsD server, for example: "env:golden group:retrievers".

Learn more about DogStatsD over Unix Domain Sockets.
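
For illustration, the fragment below sketches a DogStatsD configuration that accepts packets from other containers, computes an extra percentile, and appends the example tags from the table. The 0.99 percentile is an arbitrary addition to the default.

```yaml
# Fragment of the Agent container spec; only the env list is shown.
env:
  - name: DD_DOGSTATSD_NON_LOCAL_TRAFFIC
    value: "true"
  - name: DD_HISTOGRAM_PERCENTILES
    value: "0.95 0.99"                      # space-separated percentiles
  - name: DD_DOGSTATSD_TAGS
    value: "env:golden group:retrievers"    # space-separated tags
```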

Tagging

Datadog automatically collects common tags from Kubernetes. To extract even more tags, use the following options:

Env Variable | Description
DD_KUBERNETES_POD_LABELS_AS_TAGS | Extract pod labels
DD_KUBERNETES_POD_ANNOTATIONS_AS_TAGS | Extract pod annotations

See the Kubernetes Tag Extraction documentation to learn more.
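
The sketch below assumes that each variable takes a JSON map from a pod label or annotation name to the tag name to emit; confirm the exact format in the Kubernetes Tag Extraction documentation. The label, annotation, and tag names are hypothetical.

```yaml
# Fragment of the Agent container spec; the mappings are hypothetical examples.
env:
  - name: DD_KUBERNETES_POD_LABELS_AS_TAGS
    value: '{"app":"kube_app"}'        # pod label "app" becomes tag "kube_app"
  - name: DD_KUBERNETES_POD_ANNOTATIONS_AS_TAGS
    value: '{"owner":"team_owner"}'    # pod annotation "owner" becomes tag "team_owner"
```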

Using secret files

Integration credentials can be stored in Docker or Kubernetes secrets and used in Autodiscovery templates. For more information, see the Secrets Management documentation.

Ignore containers

Exclude containers from logs collection, metrics collection, and Autodiscovery. Datadog excludes Kubernetes and OpenShift pause containers by default. These allowlists and blocklists apply to Autodiscovery only; traces and DogStatsD are not affected. The values of these environment variables support regular expressions.

Env Variable | Description
DD_CONTAINER_INCLUDE | Allowlist of containers to include (separated by spaces). Use .* to include all. For example: "image:image_name_1 image:image_name_2", image:.*
DD_CONTAINER_EXCLUDE | Blocklist of containers to exclude (separated by spaces). Use .* to exclude all. For example: "image:image_name_3 image:image_name_4", image:.*
DD_CONTAINER_INCLUDE_METRICS | Allowlist of containers whose metrics you wish to include.
DD_CONTAINER_EXCLUDE_METRICS | Blocklist of containers whose metrics you wish to exclude.
DD_CONTAINER_INCLUDE_LOGS | Allowlist of containers whose logs you wish to include.
DD_CONTAINER_EXCLUDE_LOGS | Blocklist of containers whose logs you wish to exclude.
DD_AC_INCLUDE | Deprecated. Allowlist of containers to include (separated by spaces). Use .* to include all. For example: "image:image_name_1 image:image_name_2", image:.*
DD_AC_EXCLUDE | Deprecated. Blocklist of containers to exclude (separated by spaces). Use .* to exclude all. For example: "image:image_name_3 image:image_name_4", image:.* Note: This variable is only honored for Autodiscovery.

Additional examples are available on the Container Discovery Management page.

Note: The kubernetes.containers.running, kubernetes.pods.running, docker.containers.running, .stopped, .running.total and .stopped.total metrics are not affected by these settings. All containers are counted.
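
One common pattern is to exclude everything and then allow a few images back in. The sketch below reuses the placeholder image names from the table and assumes that include rules take precedence over exclude rules; verify that behavior on the Container Discovery Management page before relying on it.

```yaml
# Fragment of the Agent container spec; image names are placeholders.
env:
  - name: DD_CONTAINER_EXCLUDE
    value: "image:.*"                                 # exclude all containers
  - name: DD_CONTAINER_INCLUDE
    value: "image:image_name_1 image:image_name_2"    # then allow these two images
```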

Misc

Env Variable | Description
DD_PROCESS_AGENT_CONTAINER_SOURCE | Overrides container source auto-detection to force a single source. For example: "docker", "ecs_fargate", "kubelet"
DD_HEALTH_PORT | Set this to 5555 to expose the Agent health check at port 5555.
DD_CLUSTER_NAME | Set a custom Kubernetes cluster identifier to avoid host alias collisions. The cluster name can be up to 40 characters with the following restrictions: Lowercase letters, numbers, and hyphens only. Must start with a letter. Must end with a number or a letter.
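
As a sketch, the health port and cluster name could be set as follows. The name prod-cluster-01 is a hypothetical value chosen to satisfy the restrictions above: lowercase letters, numbers, and hyphens only, starting with a letter, ending with a number, and under 40 characters.

```yaml
# Fragment of the Agent container spec; the cluster name is hypothetical.
env:
  - name: DD_HEALTH_PORT
    value: "5555"
  - name: DD_CLUSTER_NAME
    value: "prod-cluster-01"
```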

You can add extra listeners and config providers using the DD_EXTRA_LISTENERS and DD_EXTRA_CONFIG_PROVIDERS environment variables. They are added in addition to the values defined in the listeners and config_providers sections of the datadog.yaml configuration file.

Commands

See the Agent Commands guides to discover all the Docker Agent commands.

Further Reading