Amazon EKS on AWS Fargate

Supported OS: Linux, macOS, Windows

Integration version: 6.1.0

Overview

This page describes the EKS Fargate integration. For ECS Fargate, see the documentation for Datadog's ECS Fargate integration.

Amazon EKS Fargate is a managed Kubernetes service that automates certain aspects of deployment and maintenance for any standard Kubernetes environment. The EKS Fargate nodes are managed by AWS Fargate and abstracted away from the user.

How Datadog monitors EKS Fargate pods

EKS Fargate pods do not run on traditional EKS nodes backed by EC2 instances. The Agent does report system checks, such as system.cpu.* and system.memory.*, but these reflect only the Agent container. To collect data from your EKS Fargate pods, run the Agent as a sidecar within each of your desired application pods. Each pod needs a custom RBAC configuration that grants the Agent permission to query the kubelet for the required information.

The Agent sidecar is responsible for monitoring the other containers in the same pod as itself, in addition to communicating with the Cluster Agent for portions of its reporting. The Agent can:

  • Report Kubernetes metrics from the pod running your application containers and the Agent
  • Run Autodiscovery-based Agent integrations against the containers in the same pod
  • Collect APM and DogStatsD metrics for containers in the same pod

If you have a mixed cluster of traditional EKS nodes and Fargate pods, you can manage the EKS nodes with the standard Datadog Kubernetes installation (Helm chart or Datadog Operator) - and manage the Fargate pods separately.

Note: Cloud Network Monitoring (CNM) is not supported for EKS Fargate.

Setup

Prerequisites

AWS Fargate profile

Create and specify an AWS Fargate profile for your EKS Fargate pods.

If you do not specify an AWS Fargate profile, your pods run on classic EC2 instances. To monitor these pods, use the standard Datadog Kubernetes installation with the Datadog-Amazon EKS integration.
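
For example, if you use eksctl, you can create a Fargate profile that selects the namespace your Fargate pods run in. The profile and namespace names below are placeholders:

eksctl create fargateprofile \
  --cluster <CLUSTER_NAME> \
  --name <FARGATE_PROFILE_NAME> \
  --namespace <FARGATE_NAMESPACE>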

Secret for keys and tokens

Create a Kubernetes Secret named datadog-secret that contains:

  • Your Datadog API key
  • A 32-character alphanumeric token for the Cluster Agent. The Agent and Cluster Agent use this token to communicate. Creating it in advance ensures that both the traditional setup and the Fargate pods use the same token value (as opposed to letting the Datadog Operator or Helm create a random token for you). For more information on how this token is used, see the Cluster Agent Setup.
kubectl create secret generic datadog-secret -n <NAMESPACE> \
  --from-literal api-key=<DATADOG_API_KEY> \
  --from-literal token=<CLUSTER_AGENT_TOKEN>

If you are deploying your traditional Datadog installation in one namespace and the Fargate pods in a different namespace, create a secret in both namespaces:

# Create the secret in the datadog-agent namespace
kubectl create secret generic datadog-secret -n datadog-agent \
  --from-literal api-key=<DATADOG_API_KEY> \
  --from-literal token=<CLUSTER_AGENT_TOKEN>

# Create the secret in the fargate namespace
kubectl create secret generic datadog-secret -n fargate \
  --from-literal api-key=<DATADOG_API_KEY> \
  --from-literal token=<CLUSTER_AGENT_TOKEN>

Note: To use the Admission Controller to run the Datadog Agent in Fargate, the name of this Kubernetes Secret must be datadog-secret.
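
You can confirm that the Secret exists and contains both keys:

kubectl describe secret datadog-secret -n <NAMESPACE>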

AWS Fargate RBAC

Create a ClusterRole for the necessary permissions and bind it to the ServiceAccount your pods are using:

  1. Create a ClusterRole using the following manifest:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRole
    metadata:
      name: datadog-agent-fargate
    rules:
      - apiGroups:
          - ""
        resources:
          - nodes
          - namespaces
          - endpoints
        verbs:
          - get
          - list
      - apiGroups:
          - ""
        resources:
          - nodes/metrics
          - nodes/spec
          - nodes/stats
          - nodes/proxy
          - nodes/pods
          - nodes/healthz
        verbs:
          - get
    
  2. Create a ClusterRoleBinding to attach this ClusterRole to the namespaced ServiceAccount that your pods use. The ClusterRoleBindings below reference the previously created ClusterRole.

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: datadog-agent-fargate
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: datadog-agent-fargate
    subjects:
      - kind: ServiceAccount
        name: <SERVICE_ACCOUNT>
        namespace: <NAMESPACE>
    

    If your pods do not use a ServiceAccount

    If your pods do not use a ServiceAccount, use the following:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: datadog-agent-fargate
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: datadog-agent-fargate
    subjects:
      - kind: ServiceAccount
        name: datadog-agent
        namespace: fargate
    ---
    apiVersion: v1
    kind: ServiceAccount
    metadata:
      name: datadog-agent
      namespace: fargate
    

    This creates a ServiceAccount named datadog-agent in the fargate namespace that is referenced in the ClusterRoleBinding. Adjust this for your Fargate pods’ namespace and set this as the serviceAccountName in your pod spec.

    If you are using multiple ServiceAccounts across namespaces

    If you are using multiple ServiceAccounts across namespaces for your Fargate pods, use the following:

    apiVersion: rbac.authorization.k8s.io/v1
    kind: ClusterRoleBinding
    metadata:
      name: datadog-agent-fargate
    roleRef:
      apiGroup: rbac.authorization.k8s.io
      kind: ClusterRole
      name: datadog-agent-fargate
    subjects:
      - kind: ServiceAccount
        name: <SERVICE_ACCOUNT_1>
        namespace: <NAMESPACE_1>
      - kind: ServiceAccount
        name: <SERVICE_ACCOUNT_2>
        namespace: <NAMESPACE_2>
      - kind: ServiceAccount
        name: <SERVICE_ACCOUNT_3>
        namespace: <NAMESPACE_3>
    

To validate your RBAC, see Troubleshooting: ServiceAccount Permissions.

Installation

After you complete all prerequisites, run the Datadog Agent as a sidecar container within each of your pods. You can do this with the Datadog Admission Controller’s automatic injection feature, or manually.

The Admission Controller is a Datadog component that can automatically add the Agent sidecar to every pod that has the label agent.datadoghq.com/sidecar: fargate.

Manual configuration requires that you modify every workload manifest when adding or changing the Agent sidecar. Datadog recommends that you use the Admission Controller instead.

Note: If you have a mixed cluster of traditional EKS nodes and Fargate pods, set up monitoring for your traditional nodes with the standard Datadog Kubernetes installation (with the Kubernetes Secret from the prerequisites) and install the Datadog-AWS integration and Datadog-EKS integration. Then, to monitor your Fargate pods, continue with this section.

Admission Controller - Datadog Operator

  1. If you haven’t already, install Helm on your machine.

  2. Install the Datadog Operator:

    helm repo add datadog https://helm.datadoghq.com
    helm install datadog-operator datadog/datadog-operator
    
  3. Create a datadog-agent.yaml file to define a DatadogAgent custom resource, with Admission Controller and Fargate injection enabled:

    apiVersion: datadoghq.com/v2alpha1
    kind: DatadogAgent
    metadata:
      name: datadog
    spec:
      global:
        clusterName: <CLUSTER_NAME>
        clusterAgentTokenSecret:
          secretName: datadog-secret
          keyName: token
        credentials:
          apiSecret:
            secretName: datadog-secret
            keyName: api-key
      features:
        admissionController:
          agentSidecarInjection:
            enabled: true
            provider: fargate
    
  4. Apply this configuration:

    kubectl apply -n <NAMESPACE> -f datadog-agent.yaml
    
  5. After the Cluster Agent reaches a running state and registers Admission Controller’s mutating webhooks, add the label agent.datadoghq.com/sidecar: fargate to your desired pods (not the parent workload) to trigger the injection of the Datadog Agent sidecar container.

Note: The Admission Controller only mutates new pods, not pods that are already created. It does not adjust your serviceAccountName. If you have not set the RBAC for this pod, the Agent cannot connect to Kubernetes.
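
For example, one way to add the label to the pod template of an existing Deployment is with kubectl patch; the resulting rollout creates new pods, which the webhook then mutates. The Deployment name and namespace are placeholders:

kubectl patch deployment <DEPLOYMENT_NAME> -n <NAMESPACE> --patch \
  '{"spec": {"template": {"metadata": {"labels": {"agent.datadoghq.com/sidecar": "fargate"}}}}}'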

Example

The following is output from a sample Redis deployment’s pod where the Admission Controller injected an Agent sidecar. The environment variables and resource settings are automatically applied based on the Datadog Fargate profile’s internal default values.

The sidecar uses the image repository and tags set in datadog-agent.yaml.

metadata:
  labels:
    app: redis
    eks.amazonaws.com/fargate-profile: fp-fargate
    agent.datadoghq.com/sidecar: fargate
spec:
  serviceAccountName: datadog-agent
  containers:
  - name: my-redis
    image: redis:latest
    args:
      - redis-server
    # (...)

  - name: datadog-agent-injected
    image: gcr.io/datadoghq/agent:7.64.0
    env:
      - name: DD_API_KEY
        valueFrom:
          secretKeyRef:
            key: api-key
            name: datadog-secret
      - name: DD_EKS_FARGATE
        value: "true"
      - name: DD_CLUSTER_AGENT_AUTH_TOKEN
        valueFrom:
          secretKeyRef:
            key: token
            name: datadog-secret
      # (...)
    resources:
      limits:
        cpu: 200m
        memory: 256Mi
      requests:
        cpu: 200m
        memory: 256Mi
Custom configuration with sidecar profiles and custom selectors - Datadog Operator

To further configure the Agent or its container resources, use the following properties in your DatadogAgent resource:

  • spec.features.admissionController.agentSidecarInjection.profiles, to add environment variable definitions and resource settings
  • spec.features.admissionController.agentSidecarInjection.selectors, to configure a custom selector to target your desired workload pods (instead of pods with the agent.datadoghq.com/sidecar: fargate label)

You can adjust the profile of the injected Agent container without updating the label selector if desired.

For example, the following datadog-agent.yaml uses a selector to target all pods with the label app: redis. The sidecar profile configures a DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED environment variable and new resource settings.

#(...)
spec:
  #(...)
  features:
    admissionController:
      agentSidecarInjection:
        enabled: true
        provider: fargate
        selectors:
          - objectSelector:
              matchLabels:
                app: redis
        profiles:
          - env:
            - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
              value: "true"
            resources:
              requests:
                cpu: "400m"
                memory: "256Mi"
              limits:
                cpu: "800m"
                memory: "512Mi"

Apply this configuration and wait for the Cluster Agent to reach a running state and register Admission Controller mutating webhooks. Then, an Agent sidecar is automatically injected into any new pod created with the label app: redis.

Note: The Admission Controller does not mutate pods that are already created.

The following is output from a Redis deployment’s pod where the Admission Controller injected an Agent sidecar based on the pod label app: redis instead of the label agent.datadoghq.com/sidecar: fargate:

metadata: 
  labels:
    app: redis
    eks.amazonaws.com/fargate-profile: fp-fargate
spec:
  serviceAccountName: datadog-agent
  containers:
  - name: my-redis
    image: redis:latest
    args:
    - redis-server
    # (...)

  - name: datadog-agent-injected
    image: gcr.io/datadoghq/agent:7.64.0
    env:
      - name: DD_API_KEY
        valueFrom:
          secretKeyRef:
            key: api-key
            name: datadog-secret
      - name: DD_EKS_FARGATE
        value: "true"
      - name: DD_CLUSTER_AGENT_AUTH_TOKEN
        valueFrom:
          secretKeyRef:
            key: token
            name: datadog-secret
      - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
        value: "true"
      #(...)
    resources:
      requests:
        cpu: "400m"
        memory: "256Mi"
      limits:
        cpu: "800m"
        memory: "512Mi"

The environment variables and resource settings are automatically applied based on the new Fargate profile configured in the DatadogAgent.

Admission Controller - Helm

  1. If you haven’t already, install Helm on your machine.

  2. Add the Datadog Helm repository:

    helm repo add datadog https://helm.datadoghq.com
    helm repo update
    
  3. Create a datadog-values.yaml with Admission Controller and Fargate injection enabled:

    datadog:
      apiKeyExistingSecret: datadog-secret
      clusterName: <CLUSTER_NAME>
    clusterAgent:
      tokenExistingSecret: datadog-secret
      admissionController:
        agentSidecarInjection:
          enabled: true
          provider: fargate
    
  4. Deploy the chart in your desired namespace:

    helm install datadog-agent -f datadog-values.yaml datadog/datadog
    
  5. After the Cluster Agent reaches a running state and registers Admission Controller’s mutating webhooks, add the label agent.datadoghq.com/sidecar: fargate to your desired pods (not the parent workload) to trigger the injection of the Datadog Agent sidecar container.

Note: The Admission Controller only mutates new pods, not pods that are already created. It does not adjust your serviceAccountName. If you have not set the RBAC for this pod, the Agent cannot connect to Kubernetes.

On a Fargate-only cluster, you can set agents.enabled=false to skip creating the traditional DaemonSet for monitoring workloads on EC2 instances.
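
For example, add the following to datadog-values.yaml for a Fargate-only cluster:

agents:
  enabled: false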

Example

The following is output from a sample Redis Deployment’s pod where the Admission Controller injected an Agent sidecar. The environment variables and resource settings are automatically applied based on the Datadog Fargate profile’s internal default values.

The sidecar uses the image repository and tags set in datadog-values.yaml.

metadata:
  labels:
    app: redis
    eks.amazonaws.com/fargate-profile: fp-fargate
    agent.datadoghq.com/sidecar: fargate
spec:
  serviceAccountName: datadog-agent
  containers:
  - name: my-redis
    image: redis:latest
    args:
      - redis-server
    # (...)

  - name: datadog-agent-injected
    image: gcr.io/datadoghq/agent:7.64.0
    env:
      - name: DD_API_KEY
        valueFrom:
          secretKeyRef:
            key: api-key
            name: datadog-secret
      - name: DD_EKS_FARGATE
        value: "true"
      - name: DD_CLUSTER_AGENT_AUTH_TOKEN
        valueFrom:
          secretKeyRef:
            key: token
            name: datadog-secret
      # (...)
    resources:
      limits:
        cpu: 200m
        memory: 256Mi
      requests:
        cpu: 200m
        memory: 256Mi
 
Custom configuration with sidecar profiles and custom selectors - Helm

To further configure the Agent or its container resources, use the following properties in your Helm configuration:

  • clusterAgent.admissionController.agentSidecarInjection.profiles, to add environment variable definitions and resource settings
  • clusterAgent.admissionController.agentSidecarInjection.selectors, to configure a custom selector to target your desired workload pods (instead of pods with the agent.datadoghq.com/sidecar: fargate label)

You can adjust the profile of the injected Agent container without updating the label selector if desired.

For example, the following datadog-values.yaml uses a selector to target all pods with the label app: redis. The sidecar profile configures a DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED environment variable and new resource settings.

#(...)
clusterAgent:
  admissionController:
    agentSidecarInjection:
      enabled: true
      provider: fargate
      selectors:
        - objectSelector:
            matchLabels:
              app: redis
      profiles:
        - env:
          - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
            value: "true"
          resources:
            requests:
              cpu: "400m"
              memory: "256Mi"
            limits:
              cpu: "800m"
              memory: "512Mi"

Apply this configuration and wait for the Cluster Agent to reach a running state and register Admission Controller mutating webhooks. Then, an Agent sidecar is automatically injected into any new pod created with the label app: redis.

Note: The Admission Controller does not mutate pods that are already created.

The following is output from a Redis deployment’s pod where the Admission Controller injected an Agent sidecar based on the pod label app: redis instead of the label agent.datadoghq.com/sidecar: fargate:

metadata:
  labels:
    app: redis
    eks.amazonaws.com/fargate-profile: fp-fargate
spec:
  serviceAccountName: datadog-agent
  containers:
  - name: my-redis
    image: redis:latest
    args:
    - redis-server
    # (...)

  - name: datadog-agent-injected
    image: gcr.io/datadoghq/agent:7.64.0
    env:
      - name: DD_API_KEY
        valueFrom:
          secretKeyRef:
            key: api-key
            name: datadog-secret
      - name: DD_EKS_FARGATE
        value: "true"
      - name: DD_CLUSTER_AGENT_AUTH_TOKEN
        valueFrom:
          secretKeyRef:
            key: token
            name: datadog-secret
      - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
        value: "true"
      #(...)
    resources:
      requests:
        cpu: "400m"
        memory: "256Mi"
      limits:
        cpu: "800m"
        memory: "512Mi"

The environment variables and resource settings are automatically applied based on the new Fargate profile configured in the Helm configuration.

Manual

To start collecting data from your Fargate pods, deploy the Datadog Agent (v7.17+) as a sidecar container within your application's pod using the following manifest:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "<APPLICATION_NAME>"
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: "<APPLICATION_NAME>"
  template:
    metadata:
      labels:
        app: "<APPLICATION_NAME>"
    spec:
      serviceAccountName: datadog-agent
      containers:
        # Your original container
        - name: "<CONTAINER_NAME>"
          image: "<CONTAINER_IMAGE>"

        # Running the Agent as a side-car
        - name: datadog-agent
          image: gcr.io/datadoghq/agent:7
          env:
            - name: DD_API_KEY
              valueFrom:
                secretKeyRef:
                  key: api-key
                  name: datadog-secret
            - name: DD_SITE
              value: "<DATADOG_SITE>"
            - name: DD_EKS_FARGATE
              value: "true"
            - name: DD_CLUSTER_NAME
              value: "<CLUSTER_NAME>"
            - name: DD_KUBERNETES_KUBELET_NODENAME
              valueFrom:
                fieldRef:
                  apiVersion: v1
                  fieldPath: spec.nodeName
          resources:
            requests:
              memory: "256Mi"
              cpu: "200m"
            limits:
              memory: "256Mi"
              cpu: "200m"
  • Replace <DATADOG_SITE> with your Datadog site. Defaults to datadoghq.com.
  • Ensure you are using a serviceAccountName with the prerequisite permissions.
  • Add DD_TAGS to append additional space-separated <KEY>:<VALUE> tags. The DD_CLUSTER_NAME environment variable sets your kube_cluster_name tag.

This manifest uses the Secret datadog-secret created in the prerequisite steps.
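
After you fill in the placeholders, apply the manifest and confirm that both containers run in the pod. A minimal check, where <APPLICATION_MANIFEST>.yaml is the file containing the Deployment above and the app label matches it:

kubectl apply -f <APPLICATION_MANIFEST>.yaml
kubectl get pods -n default -l app=<APPLICATION_NAME> \
  -o jsonpath='{.items[0].spec.containers[*].name}'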

Running the Cluster Agent or the Cluster Checks Runner

Datadog recommends you run the Cluster Agent to access features such as events collection, Kubernetes resources view, and cluster checks.

When using EKS Fargate, there are two possible scenarios depending on whether or not the EKS cluster is running mixed workloads (Fargate/non-Fargate).

If the EKS cluster runs Fargate and non-Fargate workloads, and you want to monitor the non-Fargate workloads through the Node Agent DaemonSet, add the Cluster Agent/Cluster Checks Runner to this deployment. For more information, see the Cluster Agent Setup.

When deploying your Cluster Agent, use the Secret and token created in the prerequisite steps.

Helm
datadog:
  apiKeyExistingSecret: datadog-secret
  clusterName: <CLUSTER_NAME>
clusterAgent:
  tokenExistingSecret: datadog-secret

Set agents.enabled=false to disable the standard node Agent if you are using only Fargate workloads.

Operator
apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  global:
    clusterAgentTokenSecret:
      secretName: datadog-secret
      keyName: token
    credentials:
      apiSecret:
        secretName: datadog-secret
        keyName: api-key
Configuring sidecar for Cluster Agent

In both cases, you need to change the Datadog Agent sidecar manifest in order to allow communication with the Cluster Agent:

    containers:
    #(...)
    - name: datadog-agent
      image: gcr.io/datadoghq/agent:7
      env:
        #(...)
        - name: DD_CLUSTER_NAME
          value: <CLUSTER_NAME>
        - name: DD_CLUSTER_AGENT_ENABLED
          value: "true"
        - name: DD_CLUSTER_AGENT_AUTH_TOKEN
          valueFrom:
            secretKeyRef:
              name: datadog-secret
              key: token
        - name: DD_CLUSTER_AGENT_URL
          value: https://<CLUSTER_AGENT_SERVICE_NAME>.<CLUSTER_AGENT_SERVICE_NAMESPACE>.svc.cluster.local:5005

Set the DD_CLUSTER_AGENT_URL based on the Service name and namespace created for your Datadog Cluster Agent.
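
For example, assuming the Cluster Agent was deployed with the Helm release name datadog-agent in the datadog-agent namespace, the chart typically creates a Service named datadog-agent-cluster-agent, so the value would look like:

        - name: DD_CLUSTER_AGENT_URL
          value: https://datadog-agent-cluster-agent.datadog-agent.svc.cluster.local:5005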

Metrics collection

Integration metrics

Use Autodiscovery annotations with your application container to start collecting its metrics for the supported Agent integrations.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "<APPLICATION_NAME>"
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: "<APPLICATION_NAME>"
  template:
    metadata:
      labels:
        app: "<APPLICATION_NAME>"
      annotations:
        ad.datadoghq.com/<CONTAINER_NAME>.checks: |
          {
            "<INTEGRATION_NAME>": {
              "init_config": <INIT_CONFIG>,
              "instances": [<INSTANCES_CONFIG>]
            }
          }          
    spec:
      serviceAccountName: datadog-agent
      containers:
      # Your original container
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>"

      # Running the Agent as a side-car
      - name: datadog-agent
        image: gcr.io/datadoghq/agent:7
        env:
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                key: api-key
                name: datadog-secret
          - name: DD_SITE
            value: "<DATADOG_SITE>"
          - name: DD_EKS_FARGATE
            value: "true"
          - name: DD_KUBERNETES_KUBELET_NODENAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.nodeName
          # (...)
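
For example, if the application container is named my-redis and runs Redis, the annotation for the Agent's Redis integration might look like the following sketch; adjust the instance options, such as authentication, to your setup:

      annotations:
        ad.datadoghq.com/my-redis.checks: |
          {
            "redisdb": {
              "init_config": {},
              "instances": [
                {
                  "host": "%%host%%",
                  "port": "6379"
                }
              ]
            }
          }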

DogStatsD

In EKS Fargate, your application container sends DogStatsD metrics to the Datadog Agent sidecar container. By default, the Agent accepts these metrics over port 8125.

You do not need to set the DD_AGENT_HOST address in your application container when sending these metrics; let it default to localhost.
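
For example, a minimal sketch from inside the application container, using bash to send a raw DogStatsD packet over UDP (the metric name and tag are placeholders):

# Send a custom counter to the Agent sidecar listening on localhost:8125 (UDP)
echo -n "example_app.requests:1|c|#env:dev" > /dev/udp/localhost/8125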

Live containers

Datadog Agent v6.19+ supports live containers in the EKS Fargate integration. Live containers appear on the Containers page.

Kubernetes resources view

To collect Kubernetes resource views, you need a Cluster Agent setup and a valid connection between the sidecar Agent and the Cluster Agent. When you use the Admission Controller's sidecar injection, this connection is configured automatically. When you configure the sidecar manually, ensure the sidecar Agent is connected to the Cluster Agent as described in Configuring sidecar for Cluster Agent.

Process collection

To collect all processes running on your Fargate pod:

  1. Set shareProcessNamespace: true on your pod spec. For example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: "<APPLICATION_NAME>"
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: "<APPLICATION_NAME>"
      template:
        metadata:
          labels:
            app: "<APPLICATION_NAME>"
            agent.datadoghq.com/sidecar: fargate
        spec:
          serviceAccountName: datadog-agent
          shareProcessNamespace: true
          containers:
          # Your original container
          - name: "<CONTAINER_NAME>"
            image: "<CONTAINER_IMAGE>"
    
  2. Set the Agent environment variable DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED=true by adding a custom sidecar profile in your Operator’s DatadogAgent configuration:

    #(...)
    spec:
      #(...)
      features:
        admissionController:
          agentSidecarInjection:
            enabled: true
            provider: fargate
            profiles:
              - env:
                - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
                  value: "true"
    

To collect all processes running on your Fargate pod:

  1. Set shareProcessNamespace: true on your pod spec. For example:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: "<APPLICATION_NAME>"
      namespace: default
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: "<APPLICATION_NAME>"
      template:
        metadata:
          labels:
            app: "<APPLICATION_NAME>"
            agent.datadoghq.com/sidecar: fargate
        spec:
          serviceAccountName: datadog-agent
          shareProcessNamespace: true
          containers:
          # Your original container
          - name: "<CONTAINER_NAME>"
            image: "<CONTAINER_IMAGE>"
    
  2. Set the Agent environment variable DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED=true by adding a custom sidecar profile in your Helm configuration:

    clusterAgent:
      admissionController:
        agentSidecarInjection:
          enabled: true
          provider: fargate
          profiles:
            - env:
              - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
                value: "true"
    

To collect all processes running on your Fargate pod, set the Agent environment variable DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED=true and set shareProcessNamespace: true on your pod spec.

For example:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "<APPLICATION_NAME>"
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: "<APPLICATION_NAME>"
  template:
    metadata:
      labels:
        app: "<APPLICATION_NAME>"
    spec:
      serviceAccountName: datadog-agent
      shareProcessNamespace: true
      containers:
      # Your original container
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>"

      # Running the Agent as a side-car
      - name: datadog-agent
        image: gcr.io/datadoghq/agent:7
        env:
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                key: api-key
                name: datadog-secret
          - name: DD_SITE
            value: "<DATADOG_SITE>"
          - name: DD_EKS_FARGATE
            value: "true"
          - name: DD_KUBERNETES_KUBELET_NODENAME
            valueFrom:
              fieldRef:
                apiVersion: v1
                fieldPath: spec.nodeName
          - name: DD_PROCESS_CONFIG_PROCESS_COLLECTION_ENABLED
            value: "true"
          # (...)

Log collection

Collecting logs from EKS on Fargate with Fluent Bit

Monitor EKS Fargate logs by using Fluent Bit to route EKS logs to CloudWatch Logs and the Datadog Forwarder to route logs to Datadog.

  1. To configure Fluent Bit to send logs to CloudWatch, create a Kubernetes ConfigMap that specifies CloudWatch Logs as its output. The ConfigMap specifies the log group, region, prefix string, and whether to automatically create the log group.

     kind: ConfigMap
     apiVersion: v1
     metadata:
       name: aws-logging
       namespace: aws-observability
     data:
       output.conf: |
         [OUTPUT]
             Name cloudwatch_logs
             Match   *
             region us-east-1
             log_group_name awslogs-https
             log_stream_prefix awslogs-firelens-example
             auto_create_group true     
    
  2. Use the Datadog Forwarder to collect logs from CloudWatch and send them to Datadog, for example by subscribing the Fluent Bit log group to the Forwarder Lambda function, as sketched below.
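
The following is a minimal sketch, assuming the Datadog Forwarder Lambda function is already deployed and its ARN is known; it subscribes the log group written by Fluent Bit to the Forwarder:

# The Forwarder Lambda must also permit invocation by CloudWatch Logs
aws logs put-subscription-filter \
  --log-group-name awslogs-https \
  --filter-name datadog-forwarder \
  --filter-pattern "" \
  --destination-arn "arn:aws:lambda:us-east-1:<ACCOUNT_ID>:function:<FORWARDER_FUNCTION_NAME>"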

Trace collection

In EKS Fargate, your application container sends its traces to the Datadog Agent sidecar container. The Agent accepts these traces over port 8126 by default.

You do not need to set the DD_AGENT_HOST address in your application container when sending these traces; let it default to localhost.

Set shareProcessNamespace: true in the pod spec to assist the Agent with origin detection.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: "<APPLICATION_NAME>"
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: "<APPLICATION_NAME>"
  template:
    metadata:
      labels:
        app: "<APPLICATION_NAME>"
    spec:
      serviceAccountName: datadog-agent
      shareProcessNamespace: true
      containers:
      # Your original container
      - name: "<CONTAINER_NAME>"
        image: "<CONTAINER_IMAGE>"

      # Running the Agent as a side-car
      - name: datadog-agent
        image: gcr.io/datadoghq/agent:7
        # (...)

Read more about how to set up tracing.

Events collection

To collect events from your Amazon EKS Fargate API server, run the Datadog Cluster Agent within your EKS cluster. The Cluster Agent collects Kubernetes events, including those for the EKS Fargate pods, by default.

Note: You can also collect events if you run the Datadog Cluster Agent in a pod in Fargate.

Data Collected

Metrics

The eks_fargate check submits a heartbeat metric eks.fargate.pods.running that is tagged by pod_name and virtual_node so you can keep track of how many pods are running.

Service Checks

The eks_fargate check does not include any service checks.

Events

The eks_fargate check does not include any events.

Troubleshooting

ServiceAccount Kubelet permissions

Ensure you have the right permissions on the ServiceAccount associated with your pod. If your pod does not have a ServiceAccount associated with it or isn’t bound to the correct ClusterRole, it does not have access to the Kubelet.

To validate your access, run:

kubectl auth can-i get nodes/pods --as system:serviceaccount:<NAMESPACE>:<SERVICEACCOUNT>

For example, if your Fargate pod is in the fargate namespace with the ServiceAccount datadog-agent:

kubectl auth can-i get nodes/pods --as system:serviceaccount:fargate:datadog-agent

This returns yes or no based on the access.

Datadog Agent container security context

The Datadog Agent container is designed to run as the dd-agent user (UID: 100). If you override the default security context by setting, for example, runAsUser: 1000 in your pod spec, the container fails to start due to insufficient permissions. You may see errors such as:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/50-ecs.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/50-eks.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/60-network-check.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/59-defaults.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/60-sysprobe-check.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/50-ci.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/89-copy-customfiles.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/01-check-apikey.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/51-docker.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/50-kubernetes.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/50-mesos.sh: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/trace/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/security/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/sysprobe/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/agent/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/process/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/security/finish: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/trace/finish: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/sysprobe/finish: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/agent/finish: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/process/finish: Operation not permitted
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01-check-apikey.sh: executing... 
[cont-init.d] 01-check-apikey.sh: exited 0.
[cont-init.d] 50-ci.sh: executing... 
[cont-init.d] 50-ci.sh: exited 0.
[cont-init.d] 50-ecs.sh: executing... 
[cont-init.d] 50-ecs.sh: exited 0.
[cont-init.d] 50-eks.sh: executing... 
ln: failed to create symbolic link '/etc/datadog-agent/datadog.yaml': Permission denied
[cont-init.d] 50-eks.sh: exited 0.
[cont-init.d] 50-kubernetes.sh: executing... 
[cont-init.d] 50-kubernetes.sh: exited 0.
[cont-init.d] 50-mesos.sh: executing... 
[cont-init.d] 50-mesos.sh: exited 0.
[cont-init.d] 51-docker.sh: executing... 
[cont-init.d] 51-docker.sh: exited 0.
[cont-init.d] 59-defaults.sh: executing... 
touch: cannot touch '/etc/datadog-agent/datadog.yaml': Permission denied
[cont-init.d] 59-defaults.sh: exited 1.

Since Datadog Cluster Agent v7.62, you can override the security context for the Datadog Agent sidecar to maintain consistent security standards across your Kubernetes deployments. Whether you use the DatadogAgent custom resource or Helm values, you can ensure that the Agent container runs as the appropriate user, dd-agent (UID 100), as required by your environment.

By following the examples, you can deploy the Agent sidecar in environments where the default Pod security context must be overridden.

Datadog Operator

apiVersion: datadoghq.com/v2alpha1
kind: DatadogAgent
metadata:
  name: datadog
spec:
  features:
    admissionController:
      agentSidecarInjection:
        enabled: true
        provider: fargate
        profiles:
          - securityContext:
              runAsUser: 100

Helm

clusterAgent:
  admissionController:
    agentSidecarInjection:
      profiles:
        - securityContext:
            runAsUser: 100

Need help? Contact Datadog support.

Further Reading

Additional helpful documentation, links, and articles: