Kubernetes APM - Trace Collection

This page describes how to set up and configure Application Performance Monitoring (APM) for your Kubernetes application.

The APM troubleshooting pipeline: The tracer sends trace and metrics data from the application pod to the Agent pod, which forwards it to the Datadog backend for display in the Datadog UI.

You can send traces over Unix Domain Socket (UDS), TCP (IP:Port), or a Kubernetes service. Datadog recommends UDS, but you can use all three at the same time if necessary.

Setup

  1. If you haven’t already, install the Datadog Agent in your Kubernetes environment.
  2. Configure the Datadog Agent to collect traces.
  3. Configure application pods to submit traces to the Datadog Agent.

Configure the Datadog Agent to collect traces

The instructions in this section configure the Datadog Agent to receive traces over UDS. To use TCP, see the additional configuration section. To use a Kubernetes service, see Setting up APM with Kubernetes Service.

After you use the Operator to install the Datadog Agent, edit your datadog-agent.yaml to set features.apm.enabled to true.

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    apm:
      enabled: true
      unixDomainSocketConfig:
        path: /var/run/datadog/apm.socket # default

When APM is enabled, the default configuration creates a directory on the host and mounts it within the Agent. The Agent then creates and listens on the socket file /var/run/datadog/apm.socket. Application pods can mount the same volume and write to this socket. You can change the socket path with the features.apm.unixDomainSocketConfig.path configuration value.

After making your changes, apply the new configuration by using the following command:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Note: On minikube, you may receive an "Unable to detect the kubelet URL automatically" error. In this case, set global.kubelet.tlsVerify to false.
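
As a minimal sketch of the minikube workaround in the note above, the relevant part of datadog-agent.yaml might look like the following. Only the kubelet block is new compared with the manifest shown earlier in this section.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
    kubelet:
      tlsVerify: false # skip kubelet TLS verification (minikube workaround)
  features:
    apm:
      enabled: true
```

Turning off TLS verification is generally only appropriate for local test clusters such as minikube.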

If you used Helm to install the Datadog Agent, APM is enabled by default over UDS or Windows named pipe.

To verify, ensure that datadog.apm.socketEnabled is set to true in your values.yaml.

datadog:
  apm:
    socketEnabled: true

The default configuration creates a directory on the host and mounts it within the Agent. The Agent then creates and listens on a socket file /var/run/datadog/apm.socket. The application pods can then similarly mount this volume and write to this same socket. You can modify the path and socket with the datadog.apm.hostSocketPath and datadog.apm.socketPath configuration values.

datadog:
  apm:
    # the following values are default:
    socketEnabled: true
    hostSocketPath: /var/run/datadog/
    socketPath: /var/run/datadog/apm.socket

To disable APM, set datadog.apm.socketEnabled to false.
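
For example, a values.yaml fragment that turns off trace collection could look like this sketch:

```yaml
datadog:
  apm:
    socketEnabled: false # stop the Agent from creating the APM socket
```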

After making your changes, upgrade your Datadog Helm chart using the following command:

helm upgrade -f values.yaml <RELEASE_NAME> datadog/datadog

If you did not set your operating system in values.yaml, add --set targetSystem=linux or --set targetSystem=windows to this command.
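
Alternatively, the operating system can be recorded in values.yaml itself so that the flag is not needed on every upgrade. The sketch below assumes a Linux cluster. targetSystem is the same top-level key that the --set flag writes.

```yaml
targetSystem: linux # use "windows" for Windows nodes
datadog:
  apm:
    socketEnabled: true
```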

Note: On minikube, you may receive an "Unable to detect the kubelet URL automatically" error. In this case, set datadog.kubelet.tlsVerify to false.
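
A sketch of the corresponding values.yaml fragment for the minikube workaround:

```yaml
datadog:
  kubelet:
    tlsVerify: false # skip kubelet TLS verification (minikube workaround)
```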

Configure your application pods to submit traces to Datadog Agent

The Datadog Admission Controller is a component of the Datadog Cluster Agent that simplifies your application pod configuration. Learn more by reading the Datadog Admission Controller documentation.

Use the Datadog Admission Controller to inject environment variables and mount the necessary volumes on new application pods, automatically configuring pod and Agent trace communication. Learn how to automatically configure your application to submit traces to Datadog Agent by reading the Injecting Libraries Using Admission Controller documentation.

If you are sending traces to the Agent over UDS, mount the host directory that contains the socket (created by the Agent) into the application container, and specify the path to the socket with DD_TRACE_AGENT_URL:

apiVersion: apps/v1
kind: Deployment
#(...)
    spec:
      containers:
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>:<TAG>"
        env:
        - name: DD_TRACE_AGENT_URL
          value: 'unix:///var/run/datadog/apm.socket'
        volumeMounts:
        - name: apmsocketpath
          mountPath: /var/run/datadog
        #(...)
      volumes:
        - hostPath:
            path: /var/run/datadog/
          name: apmsocketpath

Configure your application tracers to emit traces:

After configuring your Datadog Agent to collect traces and telling your application pods where to send them, install the Datadog tracer into your applications to emit the traces. Once this is done, the tracer sends the traces to the endpoint specified by DD_TRACE_AGENT_URL.

If you are sending traces to the Agent over TCP (<IP_ADDRESS>:8126), supply this IP address to your application pods, either automatically with the Datadog Admission Controller or manually by using the downward API to pull the host IP. The application container needs the DD_AGENT_HOST environment variable set to status.hostIP:

apiVersion: apps/v1
kind: Deployment
#(...)
    spec:
      containers:
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>:<TAG>"
        env:
          - name: DD_AGENT_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP

Note: This configuration requires the Agent to be configured to accept traces over TCP.

Configure your application tracers to emit traces:

After configuring your Datadog Agent to collect traces and telling your application pods where to send them, install the Datadog tracer into your applications to emit the traces. Once this is done, the tracer automatically sends the traces to the Agent at the address given by DD_AGENT_HOST.
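
Some setups prefer to give the tracer a full URL rather than only a host. As a hedged sketch, the Kubernetes $(VAR) syntax for dependent environment variables can build DD_TRACE_AGENT_URL from the host IP. This assumes that the Agent listens on the default port 8126 and that your tracer honors DD_TRACE_AGENT_URL with an http:// scheme, so check the language-specific documentation before relying on it. The deployment name and app label placeholders are hypothetical.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "<DEPLOYMENT_NAME>" # hypothetical placeholder
spec:
  selector:
    matchLabels:
      app: "<APP_LABEL>" # hypothetical placeholder
  template:
    metadata:
      labels:
        app: "<APP_LABEL>"
    spec:
      containers:
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>"
        env:
          - name: DD_AGENT_HOST
            valueFrom:
              fieldRef:
                fieldPath: status.hostIP
          # Defined after DD_AGENT_HOST so that $(DD_AGENT_HOST) can be expanded.
          - name: DD_TRACE_AGENT_URL
            value: "http://$(DD_AGENT_HOST):8126"
```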

Refer to the language-specific APM instrumentation docs for more examples.

Additional configuration

Configure the Datadog Agent to accept traces over TCP

If you installed the Datadog Agent with the Operator, update your datadog-agent.yaml with the following:

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>

  features:
    apm:
      enabled: true
      hostPortConfig:
        enabled: true
        hostPort: 8126 # default

After making your changes, apply the new configuration by using the following command:

kubectl apply -n $DD_NAMESPACE -f datadog-agent.yaml

Warning: The hostPort parameter opens a port on your host. Make sure your firewall only allows access from your applications or trusted sources. If your network plugin doesn’t support hostPorts, add hostNetwork: true in your Agent pod specifications. This shares the network namespace of your host with the Datadog Agent. This also means that all ports opened on the container are opened on the host. If a port is used both on the host and in your container, they conflict (since they share the same network namespace) and the pod does not start. Some Kubernetes installations do not allow this.
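
If you need the hostNetwork fallback described in the warning, one way to express it with the Operator is through a node Agent override. This sketch assumes that your Operator version exposes spec.override.nodeAgent.hostNetwork in the v2alpha1 DatadogAgent resource. Confirm the field against the CRD installed in your cluster before applying it.

```yaml
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    credentials:
      apiKey: <DATADOG_API_KEY>
  features:
    apm:
      enabled: true
      hostPortConfig:
        enabled: true
        hostPort: 8126
  override:
    nodeAgent:
      hostNetwork: true # assumption: Agent pods share the host network namespace
```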

If you installed the Datadog Agent with Helm, update your values.yaml file with the following APM configuration:

datadog:
  apm:
    portEnabled: true
    port: 8126 # default

After making your changes, upgrade your Datadog Helm chart using the following command:

helm upgrade -f values.yaml <RELEASE_NAME> datadog/datadog

If you did not set your operating system in values.yaml, add --set targetSystem=linux or --set targetSystem=windows to this command.

Warning: The datadog.apm.portEnabled parameter opens a port on your host. Make sure your firewall only allows access from your applications or trusted sources. If your network plugin doesn’t support hostPorts, add hostNetwork: true in your Agent pod specifications. This shares the network namespace of your host with the Datadog Agent. This also means that all ports opened on the container are opened on the host. If a port is used both on the host and in your container, they conflict (since they share the same network namespace) and the pod does not start. Some Kubernetes installations do not allow this.
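
If you need the hostNetwork fallback described in the warning, a values.yaml sketch might look like the following. It assumes that your chart version provides the agents.useHostNetwork key. Verify this against the values reference for the chart version you deploy.

```yaml
datadog:
  apm:
    portEnabled: true
    port: 8126
agents:
  useHostNetwork: true # assumption: chart key that sets hostNetwork on the Agent pods
```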

Agent environment variables

Note: As a best practice, Datadog recommends using unified service tagging when assigning tags. Unified service tagging ties Datadog telemetry together through the use of three standard tags: env, service, and version. To learn how to configure your environment with unified tagging, refer to the dedicated unified service tagging documentation.
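
As an illustration of unified service tagging on an application Deployment, the sketch below sets the three standard tags as labels and exposes them to the tracer through DD_ENV, DD_SERVICE, and DD_VERSION by using the downward API. The placeholder values and the app label are hypothetical. Refer to the unified service tagging documentation for the authoritative setup.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: "<DEPLOYMENT_NAME>"
  labels:
    tags.datadoghq.com/env: "<ENV>"
    tags.datadoghq.com/service: "<SERVICE>"
    tags.datadoghq.com/version: "<VERSION>"
spec:
  selector:
    matchLabels:
      app: "<APP_LABEL>"
  template:
    metadata:
      labels:
        app: "<APP_LABEL>"
        tags.datadoghq.com/env: "<ENV>"
        tags.datadoghq.com/service: "<SERVICE>"
        tags.datadoghq.com/version: "<VERSION>"
    spec:
      containers:
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>"
        env:
          # Each variable reads its value from the matching pod label.
          - name: DD_ENV
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['tags.datadoghq.com/env']
          - name: DD_SERVICE
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['tags.datadoghq.com/service']
          - name: DD_VERSION
            valueFrom:
              fieldRef:
                fieldPath: metadata.labels['tags.datadoghq.com/version']
```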

List of all environment variables available for tracing within the Agent running in Kubernetes:

Environment variable      Description
DD_API_KEY                Datadog API key.
DD_PROXY_HTTPS            Set up the URL for the proxy to use.
DD_APM_REPLACE_TAGS       Scrub sensitive data from your span’s tags.
DD_HOSTNAME               Manually set the hostname to use for metrics if autodetection fails, or when running the Datadog Cluster Agent.
DD_DOGSTATSD_PORT         Set the DogStatsD port.
DD_APM_RECEIVER_SOCKET    Collect your traces through a Unix Domain Socket. Takes priority over hostname and port configuration if set. Off by default; when set, it must point to a valid socket file.
DD_BIND_HOST              Set the StatsD and receiver hostname.
DD_LOG_LEVEL              Set the logging level (trace/debug/info/warn/error/critical/off).
DD_APM_ENABLED            When set to true, the Datadog Agent accepts trace metrics. Default value is true (Agent 7.18+).
DD_APM_CONNECTION_LIMIT   Sets the maximum connection limit for a 30-second time window.
DD_APM_DD_URL             Set the Datadog API endpoint where your traces are sent. Defaults to https://trace.agent.datadoghq.com.
DD_APM_RECEIVER_PORT      Port that the Datadog Agent’s trace receiver listens on. Default value is 8126.
DD_APM_NON_LOCAL_TRAFFIC  Allow non-local traffic when tracing from other containers. Default value is true (Agent 7.18+).
DD_APM_IGNORE_RESOURCES   Configure resources for the Agent to ignore. The format is a comma-separated list of regular expressions, for example: GET /ignore-me,(GET|POST) /and-also-me
DD_ENV                    Sets the global env for all data emitted by the Agent. If env is not present in your trace data, this variable is used. See APM environment setup for more details.
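
As a sketch of how one of these variables might be passed to the Agent in a Helm installation, the fragment below uses the chart's datadog.env list to set DD_APM_IGNORE_RESOURCES. It assumes that your chart version supports datadog.env and applies it to the trace Agent container. The value reuses the example expressions from the table.

```yaml
datadog:
  apm:
    socketEnabled: true
  env:
    # Comma-separated regular expressions of resources to drop.
    - name: DD_APM_IGNORE_RESOURCES
      value: "GET /ignore-me,(GET|POST) /and-also-me"
```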

Operator environment variables

Environment variable          Description
agent.apm.enabled             Enables APM and tracing on port 8126. See the Datadog Docker documentation.
agent.apm.env                 The Datadog Agent supports many environment variables.
agent.apm.hostPort            Number of the port to expose on the host. If specified, this must be a valid port number, 0 < x < 65536. If HostNetwork is specified, this must match ContainerPort. Most containers do not need this.
agent.apm.resources.limits    Limits describes the maximum amount of compute resources allowed. For more info, see the Kubernetes documentation.
agent.apm.resources.requests  Requests describes the minimum amount of compute resources required. If requests is omitted for a container, it defaults to limits if that is explicitly specified, otherwise to an implementation-defined value. For more info, see the Kubernetes documentation.
