Tracing a Proxy

Tracing a Proxy

You can set up tracing to include collecting trace information about proxies.

Datadog APM is included in Envoy v1.9.0 and newer.

Enabling Datadog APM

Note: The example configuration below is for Envoy v1.19. Example configurations for other versions can be found in the dd-opentracing-cpp GitHub repo.

The following settings are required to enable Datadog APM in Envoy:

  • a cluster for submitting traces to the Datadog Agent
  • http_connection_manager configuration to activate tracing
  1. Add a cluster for submitting traces to the Datadog Agent:

     clusters:
     ... existing cluster configs ...
     - name: datadog_agent
       connect_timeout: 1s
       type: strict_dns
       lb_policy: round_robin
       load_assignment:
         cluster_name: datadog_agent
         endpoints:
         - lb_endpoints:
           - endpoint:
               address:
                 socket_address:
                   address: localhost
                   port_value: 8126
    

    Change the address value if Envoy is running in a container or orchestrated environment.

  2. Include the following additional configuration in the http_connection_manager sections to enable tracing:

     - name: envoy.filters.network.http_connection_manager
       typed_config:
         "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
         generate_request_id: true
         request_id_extension:
           typed_config:
             "@type": type.googleapis.com/envoy.extensions.request_id.uuid.v3.UuidRequestIdConfig
             use_request_id_for_trace_sampling: false
         tracing: 
           provider:
             name: envoy.tracers.datadog
             typed_config:
               "@type": type.googleapis.com/envoy.config.trace.v3.DatadogConfig
               collector_cluster: datadog_agent
               service_name: envoy-v1.19 
    

    The collector_cluster value must match the name provided for the Datadog Agent cluster. The service_name can be changed to a meaningful value for your usage of Envoy.

With this configuration, HTTP requests to Envoy initiate and propagate Datadog traces, and appear in the APM UI.

Example Envoy v1.19 configuration

The following example configuration demonstrates the placement of items required to enable tracing using Datadog APM.

static_resources:
  listeners:
  - address:
      socket_address:
        address: 0.0.0.0
        port_value: 80
    traffic_direction: OUTBOUND
    filter_chains:
    - filters:
      - name: envoy.filters.network.http_connection_manager
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
          generate_request_id: true
          request_id_extension:
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.request_id.uuid.v3.UuidRequestIdConfig
              use_request_id_for_trace_sampling: false
          tracing:
          # Use the datadog tracer
            provider:
              name: envoy.tracers.datadog
              typed_config:
                "@type": type.googleapis.com/envoy.config.trace.v3.DatadogConfig
                collector_cluster: datadog_agent   # matched against the named cluster
                service_name: envoy-v1.19          # user-defined service name
          codec_type: auto
          stat_prefix: ingress_http
          route_config:
            name: local_route
            virtual_hosts:
            - name: backend
              domains:
              - "*"
              routes:
              - match:
                  prefix: "/"
                route:
                  cluster: service1
          # Traces for healthcheck requests should not be sampled.
          http_filters:
          - name: envoy.filters.http.health_check
            typed_config:
              "@type": type.googleapis.com/envoy.extensions.filters.http.health_check.v3.HealthCheck
              pass_through_mode: false
              headers:
                - exact_match: /healthcheck
                  name: :path
          - name: envoy.filters.http.router
            typed_config: {}
          use_remote_address: true
  clusters:
  - name: service1
    connect_timeout: 0.250s
    type: strict_dns
    lb_policy: round_robin
    load_assignment:
      cluster_name: service1
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: service1
                port_value: 80
  # Configure this cluster with the address of the datadog Agent
  # for sending traces.
  - name: datadog_agent
    connect_timeout: 1s
    type: strict_dns
    lb_policy: round_robin
    load_assignment:
      cluster_name: datadog_agent
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: localhost
                port_value: 8126

admin:
  access_log_path: "/dev/null"
  address:
    socket_address:
      address: 0.0.0.0
      port_value: 8001

Excluding metrics

If you are using Envoy’s dog_statsd configuration to report metrics, you can exclude activity from the datadog_agent cluster with this additional configuration.

stats_config:
  stats_matcher:
    exclusion_list:
      patterns:
      - prefix: "cluster.datadog_agent."

Environment variables

The available environment variables depend on the version of the C++ tracer embedded in Envoy.

Note: The variables DD_AGENT_HOST, DD_TRACE_AGENT_PORT and DD_TRACE_AGENT_URL do not apply to Envoy, as the address of the Datadog Agent is configured using the cluster settings.

Envoy Version C++ Tracer Version
v1.19 v1.2.1
v1.18 v1.2.1
v1.17 v1.1.5
v1.16 v1.1.5
v1.15 v1.1.5
v1.14 v1.1.3
v1.13 v1.1.1
v1.12 v1.1.1
v1.11 v0.4.2
v1.10 v0.4.2
v1.9 v0.3.6

Support for Datadog APM is available for NGINX using a combination of plugins and configurations. The instructions below use NGINX from the official Linux repositories and pre-built binaries for the plugins.

NGINX open source

Plugin installation

Note: this plugin does not work on Linux distributions that use older versions of libstdc++. This includes RHEL/Centos 7 and AmazonLinux 1. A workaround for this is to run NGINX from a Docker container. An example Dockerfile is available here.

The following plugins must be installed:

Commands to download and install these modules:

# Gets the latest release version number from Github.
get_latest_release() {
  wget -qO- "https://api.github.com/repos/$1/releases/latest" |
    grep '"tag_name":' |
    sed -E 's/.*"([^"]+)".*/\1/';
}
NGINX_VERSION=1.17.3
OPENTRACING_NGINX_VERSION="$(get_latest_release opentracing-contrib/nginx-opentracing)"
DD_OPENTRACING_CPP_VERSION="$(get_latest_release DataDog/dd-opentracing-cpp)"
# Install NGINX plugin for OpenTracing
wget https://github.com/opentracing-contrib/nginx-opentracing/releases/download/${OPENTRACING_NGINX_VERSION}/linux-amd64-nginx-${NGINX_VERSION}-ot16-ngx_http_module.so.tgz
tar zxf linux-amd64-nginx-${NGINX_VERSION}-ot16-ngx_http_module.so.tgz -C /usr/lib/nginx/modules
# Install Datadog Opentracing C++ Plugin
wget https://github.com/DataDog/dd-opentracing-cpp/releases/download/${DD_OPENTRACING_CPP_VERSION}/linux-amd64-libdd_opentracing_plugin.so.gz
gunzip linux-amd64-libdd_opentracing_plugin.so.gz -c > /usr/local/lib/libdd_opentracing_plugin.so

NGINX configuration

The NGINX configuration must load the OpenTracing module.

# Load OpenTracing module
load_module modules/ngx_http_opentracing_module.so;

The http block enables the OpenTracing module and loads the Datadog tracer:

    opentracing on; # Enable OpenTracing
    opentracing_tag http_user_agent $http_user_agent; # Add a tag to each trace!
    opentracing_trace_locations off; # Emit only one span per request.

    # Load the Datadog tracing implementation, and the given config file.
    opentracing_load_tracer /usr/local/lib/libdd_opentracing_plugin.so /etc/nginx/dd-config.json;

The log_format with_trace_id block is for correlating logs and traces. See the example NGINX config file for the complete format. The $opentracing_context_x_datadog_trace_id value captures the trace ID, and $opentracing_context_x_datadog_parent_id captures the span ID.

The location block within the server where tracing is desired should add the following:

            opentracing_operation_name "$request_method $uri";
            opentracing_propagate_context;

A config file for the Datadog tracing implementation is also required:

{
  "environment": "prod",
  "service": "nginx",
  "operation_name_override": "nginx.handle",
  "agent_host": "localhost",
  "agent_port": 8126
}

The service value can be modified to a meaningful value for your usage of NGINX. The agent_host value may need to be changed if NGINX is running in a container or orchestrated environment.

Complete examples:

After completing this configuration, HTTP requests to NGINX will initiate and propagate Datadog traces, and will appear in the APM UI.

NGINX and FastCGI

When the location is serving a FastCGI backend instead of HTTP, the location block should use opentracing_fastcgi_propagate_context instead of opentracing_propagate_context.

NGINX Ingress Controller for Kubernetes

The Kubernetes ingress-nginx controller versions 0.23.0+ include the NGINX plugin for OpenTracing.

To enable this plugin, create or edit a ConfigMap to set enable-opentracing: "true" and the datadog-collector-host to which traces should be sent. The name of the ConfigMap will be cited explicitly by the nginx-ingress controller container’s command line argument, defaulting to --configmap=$(POD_NAMESPACE)/nginx-configuration. If ingress-nginx was installed via helm chart, this ConfigMap will be named like Release-Name-nginx-ingress-controller.

The ingress controller manages both the nginx.conf and /etc/nginx/opentracing.json files. Tracing is enabled for all location blocks.

kind: ConfigMap
apiVersion: v1
metadata:
  name: nginx-configuration
  namespace: ingress-nginx
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
data:
  enable-opentracing: "true"
  datadog-collector-host: $HOST_IP
  # Defaults
  # datadog-service-name: "nginx"
  # datadog-collector-port: "8126"
  # datadog-operation-name-override: "nginx.handle"

Additionally, ensure that your nginx-ingress controller’s pod spec has the HOST_IP environment variable set. Add this entry to the env: block that contains the environment variables POD_NAME and POD_NAMESPACE.

- name: HOST_IP
  valueFrom:
    fieldRef:
      fieldPath: status.hostIP

To set a different service name per Ingress using annotations:

  nginx.ingress.kubernetes.io/configuration-snippet: |
            opentracing_tag "service.name" "custom-service-name";

The above overrides the default nginx-ingress-controller.ingress-nginx service name.

Datadog monitors every aspect of your Istio environment, so you can:

  • Drill into distributed traces for applications transacting over the mesh with APM (see below).
  • Assess the health of Envoy and the Istio control plane with logs.
  • Break down the performance of your service mesh with request, bandwidth, and resource consumption metrics.
  • Map network communication between containers, pods, and services over the mesh with Network Performance Monitoring.

To learn more about monitoring your Istio environment with Datadog, see the Istio blog.

Configuration

Datadog APM is available for Istio v1.1.3+ on Kubernetes clusters.

Datadog Agent installation

  1. Install the Agent
  2. Make sure APM is enabled for your Agent.
  3. Uncomment the hostPort setting so that Istio sidecars can connect to the Agent and submit traces.

Istio configuration and installation

To enable Datadog APM, a custom Istio installation is required to set two extra options when installing Istio.

  • --set values.global.proxy.tracer=datadog
  • --set values.pilot.traceSampling=100.0
istioctl manifest apply --set values.global.proxy.tracer=datadog --set values.pilot.traceSampling=100.0

Traces are generated when the namespace for the pod has sidecar injection enabled. This is done by adding the istio-injection=enabled label.

kubectl label namespace example-ns istio-injection=enabled

Traces are generated when Istio is able to determine the traffic is using an HTTP-based protocol. By default, Istio tries to automatically detect this. It can be manually configured by naming the ports in your application’s deployment and service. More information can be found in Istio’s documentation for Protocol Selection

By default, the service name used when creating traces is generated from the deployment name and namespace. This can be set manually by adding an app label to the deployment’s pod template:

template:
  metadata:
    labels:
      app: <SERVICE_NAME>

For CronJobs, the app label should be added to the job template, as the generated name comes from the Job instead of the higher-level CronJob.

Environment variables

Environment variables for Istio sidecars can be set on a per-deployment basis using the apm.datadoghq.com/env annotation.

    metadata:
      annotations:
        apm.datadoghq.com/env: '{ "DD_ENV": "prod", "DD_TRACE_ANALYTICS_ENABLED": "true" }'

The available environment variables depend on the version of the C++ tracer embedded in the Istio sidecar’s proxy.

Istio Version C++ Tracer Version
v1.7.x v1.1.5
v1.6.x v1.1.3
v1.5.x v1.1.1
v1.4.x v1.1.1
v1.3.x v1.1.1
v1.2.x v0.4.2
v1.1.3 v0.4.2

Deployment and service

If the Agents on your cluster are running as a deployment and service instead of the default DaemonSet, then an additional option is required to specify the DNS address and port of the Agent. For a service named datadog-agent in the default namespace, that address would be datadog-agent.default.svc.cluster.local:8126.

  • --set values.global.tracer.datadog.address=datadog-agent.default:8126

If Mutual TLS is enabled for the cluster, then the Agent’s deployment should disable sidecar injection, and you should add a traffic policy that disables TLS.

This annotation is added to the Agent’s Deployment template.

  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"

For Istio v1.4.x, the traffic policy can be configured using a DestinationRule. Istio v1.5.x and higher do not need an additional traffic policy.

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: datadog-agent
  namespace: istio-system
spec:
  host: datadog-agent.default.svc.cluster.local
  trafficPolicy:
    tls:
      mode: DISABLE

Automatic Protocol Selection may determine that traffic between the sidecar and Agent is HTTP, and enable tracing. This can be disabled using manual protocol selection for this specific service. The port name in the datadog-agent Service can be changed to tcp-traceport. If using Kubernetes 1.18+, appProtocol: tcp can be added to the port specification.

Further Reading