Amazon ECS on AWS Fargate

Supported OS Linux Windows Mac OS

Integration version 7.4.0

Overview

This page describes the ECS Fargate integration. For EKS Fargate, see the documentation for Datadog's EKS Fargate integration.

Get metrics from all your containers running in ECS Fargate:

CPU/Memory usage & limit metrics
Monitor your applications running on Fargate using Datadog integrations or custom metrics.

The Datadog Agent retrieves metrics for the task definition’s containers with the ECS task metadata endpoint. According to the ECS Documentation on that endpoint:

This endpoint returns Docker stats JSON for all of the containers associated with the task. For more information about each of the returned stats, see ContainerStats in the Docker API documentation.

The Task Metadata endpoint is only available from within the task definition itself, which is why the Datadog Agent needs to be run as an additional container within each task definition to be monitored.
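
As a rough illustration of what the Agent reads, the sketch below queries the task-level stats route and extracts two fields from the Docker stats JSON. It assumes the version 4 endpoint, whose base URI Fargate exposes in the ECS_CONTAINER_METADATA_URI_V4 environment variable (platform version 1.4.0 or later). The helper names are hypothetical, and this is not the Agent's actual implementation.

```python
import json
import os
import urllib.request


def task_stats_url(env=os.environ):
    """Build the task-level stats URL from the metadata base URI.

    Assumption: ECS_CONTAINER_METADATA_URI_V4 is injected into every
    container of the task (task metadata endpoint version 4).
    """
    base = env["ECS_CONTAINER_METADATA_URI_V4"]
    return base.rstrip("/") + "/task/stats"


def fetch_task_stats(url, timeout=2.0):
    """Fetch the Docker stats JSON; only works from inside the task."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize(stats):
    """Map container ID -> (memory usage in bytes, total CPU time in ns)."""
    summary = {}
    for container_id, container_stats in stats.items():
        if not container_stats:
            # A container that has not started yet can report no stats.
            continue
        memory = container_stats.get("memory_stats", {}).get("usage")
        cpu = (
            container_stats.get("cpu_stats", {})
            .get("cpu_usage", {})
            .get("total_usage")
        )
        summary[container_id] = (memory, cpu)
    return summary
```

Inside a running task, `summarize(fetch_task_stats(task_stats_url()))` would return one entry per started container. Outside the task, the endpoint is unreachable, which is why the Agent has to run as a container in the same task definition.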

To enable metric collection, set the environment variable ECS_FARGATE to "true" in the Datadog container definition.

Setup

The following steps cover setup of the Datadog Container Agent within Amazon ECS Fargate. Note: Datadog Agent version 6.1.1 or higher is needed to take full advantage of the Fargate integration.

Tasks that do not have the Datadog Agent still report metrics with CloudWatch. However, the Agent is needed for Autodiscovery, detailed container metrics, tracing, and more. Additionally, CloudWatch metrics are less granular and have more reporting latency than metrics shipped directly through the Datadog Agent.

Installation

You can also monitor AWS Batch jobs on ECS Fargate. See Installation for AWS Batch.

To monitor your ECS Fargate tasks with Datadog, run the Agent as a container in the same task definition as your application containers. Each task definition that you want to collect metrics from must include a Datadog Agent container in addition to the application containers. Follow these setup steps:

Create an ECS Fargate task
Create or Modify your IAM Policy
Run the task as a replica service

Create an ECS Fargate task

The primary unit of work in Fargate is the task, which is configured in the task definition. A task definition is comparable to a pod in Kubernetes. A task definition must contain one or more containers. In order to run the Datadog Agent, create your task definition to run your application container(s), as well as the Datadog Agent container.

The instructions below show you how to configure the task using the Amazon Web Console, AWS CLI tools, or AWS CloudFormation.

Web UI Task Definition

AWS CLI Task Definition

Download datadog-agent-ecs-fargate.json. Note: If you are using Internet Explorer, this may download as a gzip file, which contains the JSON file mentioned below.

Add your other application containers to the task definition. For details on collecting integration metrics, see Integration Setup for ECS Fargate.

Optionally - Add an Agent health check.

Add the following to your ECS task definition to create an Agent health check:

"healthCheck": {
  "retries": 3,
  "command": ["CMD-SHELL","agent health"],
  "timeout": 5,
  "interval": 30,
  "startPeriod": 15
}

Execute the following command to register the ECS task definition:

aws ecs register-task-definition --cli-input-json file://<PATH_TO_FILE>/datadog-agent-ecs-fargate.json
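
If you prefer to script the health check step rather than edit the JSON by hand, a minimal sketch could look like the following. It assumes the Agent container is named datadog-agent in your task definition file (check the downloaded file, since the name may differ). The function name is hypothetical.

```python
import copy

# Health check settings copied from the snippet above.
AGENT_HEALTH_CHECK = {
    "retries": 3,
    "command": ["CMD-SHELL", "agent health"],
    "timeout": 5,
    "interval": 30,
    "startPeriod": 15,
}


def with_agent_health_check(task_definition, agent_name="datadog-agent"):
    """Return a copy of the task definition with the Agent health check set.

    Assumption: the Agent container is identified by its "name" field.
    """
    updated = copy.deepcopy(task_definition)
    for container in updated.get("containerDefinitions", []):
        if container.get("name") == agent_name:
            container["healthCheck"] = copy.deepcopy(AGENT_HEALTH_CHECK)
            return updated
    raise ValueError(f"no container named {agent_name!r} in task definition")
```

Load the file with `json.load`, pass the resulting dictionary through `with_agent_health_check`, write it back with `json.dump`, and then run the register-task-definition command shown above.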

AWS CloudFormation Task Definition

You can use AWS CloudFormation templating to configure your Fargate containers. Use the AWS::ECS::TaskDefinition resource within your CloudFormation template to set the Amazon ECS task and specify FARGATE as the required launch type for that task.

Resources:
  ECSTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
        - FARGATE
      Cpu: 256
      Memory: 512
      ContainerDefinitions:
        - Name: datadog-agent
          Image: 'public.ecr.aws/datadog/agent:latest'
          Environment:
            - Name: DD_API_KEY
              Value: <DATADOG_API_KEY>
            - Name: ECS_FARGATE
              Value: true

Lastly, include your other application containers within the ContainerDefinitions and deploy through CloudFormation.

For more information on CloudFormation templating and syntax, see the AWS CloudFormation task definition documentation.

Datadog CDK Task Definition

You can use the Datadog CDK Constructs to configure your ECS Fargate task definition. Use the DatadogECSFargate construct to instrument your containers for desired Datadog features. This is supported in TypeScript, JavaScript, Python, and Go.

const ecsDatadog = new DatadogECSFargate({
  apiKey: <DATADOG_API_KEY>,
  site: <DATADOG_SITE>
});

Then, define your task definition using FargateTaskDefinitionProps.

const fargateTaskDefinition = ecsDatadog.fargateTaskDefinition(
  this,
  <TASK_ID>,
  <FARGATE_TASK_DEFINITION_PROPS>
);

Lastly, include your other application containers by adding your ContainerDefinitionOptions.

fargateTaskDefinition.addContainer(<CONTAINER_ID>, <CONTAINER_DEFINITION_OPTIONS>);

For more information on the DatadogECSFargate construct instrumentation and syntax, see the Datadog ECS Fargate CDK documentation.

Datadog Terraform Task Definition

You can use the Datadog ECS Fargate Terraform module to configure your containers for Datadog. This Terraform module wraps the aws_ecs_task_definition resource and automatically instruments your task definition for Datadog. Pass your input arguments into the Datadog ECS Fargate Terraform module in the same way as you would to the aws_ecs_task_definition resource. Make sure to include your task family and container_definitions.

module "ecs_fargate_task" {
  source  = "DataDog/ecs-datadog/aws//modules/ecs_fargate"
  version = "1.0.0"

  # Configure Datadog
  dd_api_key = <DATADOG_API_KEY>
  dd_site    = <DATADOG_SITE>
  dd_dogstatsd = {
    enabled = true,
  }
  dd_apm = {
    enabled = true,
  }

  # Configure Task Definition
  family                   = <TASK_FAMILY>
  container_definitions    = <CONTAINER_DEFINITIONS>
  cpu                      = 256
  memory                   = 512
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
}

Lastly, include your other application containers within container_definitions and deploy through Terraform.

For more information on the Terraform module, see the Datadog ECS Fargate Terraform documentation.

Run the task as a replica service

The only option in ECS Fargate is to run the task as a Replica Service. The Datadog Agent runs in the same task definition as your application and integration containers.

Web UI Replica Service

Log in to your AWS Web Console and navigate to the ECS section. If needed, create a cluster with the Networking only cluster template.
Choose the cluster to run the Datadog Agent on.
On the Services tab, click the Create button.
For Launch type, choose FARGATE.
For Task Definition, select the task created in the previous steps.
Enter a Service name.
For Number of tasks enter 1, then click the Next step button.
Select the Cluster VPC, Subnets, and Security Groups.
Load balancing and Service discovery are optional based on your preference.
Click the Next step button.
Auto Scaling is optional based on your preference.
Click the Next step button, then click the Create service button.

AWS CLI Replica Service

Run the following commands using the AWS CLI tools.

Note: Fargate version 1.1.0 or greater is required, so the command below specifies the platform version.

If needed, create a cluster:

aws ecs create-cluster --cluster-name "<CLUSTER_NAME>"

Run the task as a service for your cluster:

aws ecs run-task --cluster <CLUSTER_NAME> \
--network-configuration 'awsvpcConfiguration={subnets=["<PRIVATE_SUBNET>"],securityGroups=["<SECURITY_GROUP>"]}' \
--task-definition arn:aws:ecs:<AWS_REGION>:<AWS_ACCOUNT_NUMBER>:task-definition/<TASK_NAME>:1 \
--region <AWS_REGION> --launch-type FARGATE --platform-version 1.4.0
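
Quoting the network configuration inline is easy to get wrong in a shell. As a sketch of one way around that, the helper below builds the same run-task invocation as an argument list and passes the network configuration as JSON, which the AWS CLI accepts for structure parameters. The function name and the example values are hypothetical.

```python
import json


def run_task_argv(cluster, task_definition, subnets, security_groups,
                  region, platform_version="1.4.0"):
    """Build the argument list for the `aws ecs run-task` call shown above."""
    network = {
        "awsvpcConfiguration": {
            "subnets": list(subnets),
            "securityGroups": list(security_groups),
        }
    }
    return [
        "aws", "ecs", "run-task",
        "--cluster", cluster,
        # JSON form of the structure parameter; no shell quoting needed
        # because the list is passed to the process without a shell.
        "--network-configuration", json.dumps(network),
        "--task-definition", task_definition,
        "--region", region,
        "--launch-type", "FARGATE",
        "--platform-version", platform_version,
    ]
```

The resulting list can be handed to `subprocess.run(argv, check=True)` on a machine where the AWS CLI is installed and configured.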

AWS CloudFormation Replica Service

In the CloudFormation template, you can reference the ECSTaskDefinition resource created in the previous example from the AWS::ECS::Service resource being created. Then, specify your Cluster, DesiredCount, and any other parameters necessary for your application in your replica service.

Resources:
  ECSTaskDefinition:
    #(...)
  ECSService:
    Type: 'AWS::ECS::Service'
    Properties:
      Cluster: <CLUSTER_NAME>
      TaskDefinition:
        Ref: "ECSTaskDefinition"
      DesiredCount: 1
      #(...)

For more information on CloudFormation templating and syntax, see the AWS CloudFormation ECS service documentation.

AWS CDK Replica Service

In the CDK code, you can pass the fargateTaskDefinition resource created in the previous example to the FargateService resource being created. Then, specify your cluster, desiredCount, and any other parameters necessary for your application in your replica service.

const service = new ecs.FargateService(this, <SERVICE_ID>, {
  cluster: <CLUSTER>,
  taskDefinition: fargateTaskDefinition,
  desiredCount: 1
});

For more information on the CDK ECS service construct and syntax, see the AWS CDK ECS Service documentation.

AWS Terraform Replica Service

In the Terraform code, you can reference the aws_ecs_task_definition resource created by the module in the previous example within the aws_ecs_service resource being created. Then, specify your cluster, desired_count, and any other parameters necessary for your application in your replica service.

resource "aws_ecs_service" <SERVICE_ID> {
  name            = <SERVICE_NAME>
  cluster         = <CLUSTER_ID>
  task_definition = module.ecs_fargate_task.arn
  desired_count   = 1
}

For more information on the Terraform ECS service module and syntax, see the AWS Terraform ECS service documentation.

To provide your Datadog API key as a secret, see Using secrets.

Installation for AWS Batch

To monitor your AWS Batch jobs with Datadog, see AWS Batch with ECS Fargate and the Datadog Agent.

Create or modify your IAM policy

Add the following permissions to your Datadog IAM policy to collect ECS Fargate metrics. For more information, see the ECS policies on the AWS website.

AWS Permission	Description
`ecs:ListClusters`	List available clusters.
`ecs:ListContainerInstances`	List instances of a cluster.
`ecs:DescribeContainerInstances`	Describe instances to add metrics on resources and tasks running.

Using secrets

As an alternative to populating the DD_API_KEY environment variable with your API key in plaintext, you can instead reference the ARN of a plaintext secret stored in AWS Secrets Manager. Place the DD_API_KEY environment variable under the containerDefinitions.secrets section of the task or job definition file. Ensure that the task/job execution role has the necessary permission to fetch secrets from AWS Secrets Manager.
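
As a minimal sketch of what that looks like in a task definition, the helper below builds an Agent container definition that takes DD_API_KEY from the secrets section instead of environment. The container name and image are taken from the CloudFormation example earlier on this page. The function name and the ARN in the usage note are hypothetical.

```python
def agent_container_with_secret(api_key_secret_arn,
                                image="public.ecr.aws/datadog/agent:latest"):
    """Build an Agent container definition that reads DD_API_KEY from a secret."""
    return {
        "name": "datadog-agent",
        "image": image,
        # Non-sensitive settings stay in plain environment variables.
        "environment": [{"name": "ECS_FARGATE", "value": "true"}],
        # ECS resolves the secret at task start and injects it as DD_API_KEY.
        "secrets": [{"name": "DD_API_KEY", "valueFrom": api_key_secret_arn}],
    }
```

For example, `agent_container_with_secret("arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog-api-key")` yields a definition with no plaintext key. For Secrets Manager, the permission the execution role typically needs is `secretsmanager:GetSecretValue` on that secret.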

Metric collection

After the Datadog Agent is set up as described above, the ecs_fargate check collects metrics with Autodiscovery enabled. Add Docker labels to your other containers in the same task to collect additional metrics.

Although the integration works on Linux and Windows, some metrics are OS dependent. All metrics exposed when running on Windows are also exposed on Linux, but there are some metrics that are only available on Linux. See Data Collected for the list of metrics provided by this integration. The list also specifies which metrics are Linux-only.

For details on collecting integration metrics, see Integration Setup for ECS Fargate.

DogStatsD

Metrics are collected with DogStatsD through UDP port 8125.
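
To make the transport concrete, here is a small sketch that formats a metric in the DogStatsD datagram format and sends it over UDP using only the standard library. In practice you would normally use an official DogStatsD client library. The function names and the metric name in the usage note are hypothetical, and the sketch assumes the Agent container is reachable on the loopback address because containers in a Fargate task share a network namespace.

```python
import socket


def format_metric(name, value, metric_type="g", tags=None):
    """Format one datagram as <name>:<value>|<type>[|#tag1,tag2]."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload.encode("utf-8")


def send_metric(payload, host="127.0.0.1", port=8125):
    """Send one datagram to the Agent's DogStatsD port (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

For example, `send_metric(format_metric("app.queue.depth", 12, "g", ["env:dev"]))` submits a gauge. UDP gives no delivery confirmation, so a missing Agent does not raise an error in the application.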

Other environment variables

For environment variables available with the Docker Agent container, see the Docker Agent page. Note: Some variables are not available for Fargate.

Environment Variable	Description
`DD_TAGS`	Add tags. For example: `key1:value1 key2:value2`.
`DD_DOCKER_LABELS_AS_TAGS`	Extract Docker container labels as tags
`DD_CHECKS_TAG_CARDINALITY`	Add tags to check metrics
`DD_DOGSTATSD_TAG_CARDINALITY`	Add tags to custom metrics

For global tagging, it is recommended to use DD_DOCKER_LABELS_AS_TAGS. With this method, the Agent pulls in tags from your container labels. This requires you to add the appropriate labels to your other containers. Labels can be added directly in the task definition.

Format for the Agent container:

{
  "name": "DD_DOCKER_LABELS_AS_TAGS",
  "value": "{\"<LABEL_NAME_TO_COLLECT>\":\"<TAG_KEY_FOR_DATADOG>\"}"
}

Example for the Agent container:

{
  "name": "DD_DOCKER_LABELS_AS_TAGS",
  "value": "{\"com.docker.compose.service\":\"service_name\"}"
}

CloudFormation example (YAML):

      ContainerDefinitions:
        - #(...)
          Environment:
            - Name: DD_DOCKER_LABELS_AS_TAGS
              Value: "{\"com.docker.compose.service\":\"service_name\"}"
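
The escaped JSON string in the examples above is easy to mistype. As a sketch, the helper below generates the environment variable entry from a plain mapping of label names to tag keys, so the escaping is handled by the JSON serializer. The function name is hypothetical.

```python
import json


def labels_as_tags_env(label_to_tag):
    """Build the DD_DOCKER_LABELS_AS_TAGS entry for the Agent container.

    label_to_tag maps a container label name to the Datadog tag key.
    """
    return {
        "name": "DD_DOCKER_LABELS_AS_TAGS",
        # The value is itself a JSON document, stored as a string.
        "value": json.dumps(label_to_tag, separators=(",", ":")),
    }
```

Serializing the surrounding task definition with `json.dump` then produces the backslash-escaped form shown in the Agent container example.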

Note: Do not use DD_HOSTNAME, since there is no concept of a host to the user in Fargate. Setting this variable can cause your tasks to appear as APM Hosts in the Infrastructure list, potentially impacting your billing. Instead, DD_TAGS is traditionally used to assign host tags. As of Datadog Agent version 6.13.0, you can also use the DD_TAGS environment variable to set global tags on your integration metrics.

Crawler-based metrics

In addition to the metrics collected by the Datadog Agent, Datadog has a CloudWatch based ECS integration. This integration collects the Amazon ECS CloudWatch Metrics.

As noted there, Fargate tasks also report metrics in this way:

The metrics made available will depend on the launch type of the tasks and services in your clusters or batch jobs. If you are using the Fargate launch type for your services then CPU and memory utilization metrics are provided to assist in the monitoring of your services.

Since this method does not use the Datadog Agent, you need to configure the AWS integration by checking ECS on the integration tile. Then, Datadog pulls these CloudWatch metrics (namespaced aws.ecs.* in Datadog) on your behalf. See the Data Collected section of the documentation.

If these are the only metrics you need, you could rely on this integration for collection using CloudWatch metrics. Note: CloudWatch data is less granular (1-5 min depending on the type of monitoring you have enabled) and delayed in reporting to Datadog. This is because the data collection from CloudWatch must adhere to AWS API limits, instead of pushing it to Datadog with the Agent.

Datadog’s default CloudWatch crawler polls metrics once every 10 minutes. If you need a faster crawl schedule, contact Datadog support for availability. Note: There are cost increases involved on the AWS side as CloudWatch bills for API calls.

Log collection

You can monitor Fargate logs by using either:

The AWS FireLens integration, built on Datadog’s Fluent Bit output plugin, to send logs directly to Datadog
The awslogs log driver to store the logs in a CloudWatch Log Group, and then a Lambda function to route logs to Datadog

Datadog recommends using AWS FireLens for the following reasons:

You can configure Fluent Bit directly in your Fargate tasks.
The Datadog Fluent Bit output plugin provides additional tagging on logs. The ECS Explorer uses the tags to correlate logs with ECS resources.

Fluent Bit and FireLens

Configure the AWS FireLens integration built on Datadog’s Fluent Bit output plugin to connect your FireLens monitored log data to Datadog Logs. You can find a full sample task definition for this configuration here.

Add the Fluent Bit FireLens log router container to your existing Fargate task. For more information about enabling FireLens, see the dedicated AWS FireLens docs. For more information about Fargate container definitions, see the AWS docs on Container Definitions. AWS recommends that you use the regional Docker image. Here is an example snippet of a task definition where the Fluent Bit image is configured:
```
{
  "essential": true,
  "image": "amazon/aws-for-fluent-bit:stable",
  "name": "log_router",
  "firelensConfiguration": {
    "type": "fluentbit",
    "options": { "enable-ecs-log-metadata": "true" }
  }
}
```
If your containers are publishing serialized JSON logs over stdout, you should use this extra FireLens configuration to get them correctly parsed within Datadog:
```
{
  "essential": true,
  "image": "amazon/aws-for-fluent-bit:stable",
  "name": "log_router",
  "firelensConfiguration": {
    "type": "fluentbit",
    "options": {
      "enable-ecs-log-metadata": "true",
      "config-file-type": "file",
      "config-file-value": "/fluent-bit/configs/parse-json.conf"
    }
  }
}
```
This converts serialized JSON from the log: field into top-level fields. See the AWS sample Parsing container stdout logs that are serialized JSON for more details.
Next, in the same Fargate task, define a log configuration for each container that should ship logs. Set AWS FireLens as the log driver so that the container output is routed to Fluent Bit. The following task definition snippets show this configuration. They differ only in the Host value, one for each Datadog site:

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "apikey": "<DATADOG_API_KEY>",
      "Host": "http-intake.logs.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    }
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "apikey": "<DATADOG_API_KEY>",
      "Host": "http-intake.logs.us3.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    }
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "apikey": "<DATADOG_API_KEY>",
      "Host": "http-intake.logs.us5.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    }
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "apikey": "<DATADOG_API_KEY>",
      "Host": "http-intake.logs.datadoghq.eu",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    }
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "apikey": "<DATADOG_API_KEY>",
      "Host": "http-intake.logs.ap1.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    }
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "apikey": "<DATADOG_API_KEY>",
      "Host": "http-intake.logs.ddog-gov.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    }
  }
}

Note: Separate tags with commas in the dd_tags field.

Example using secretOptions to avoid exposing the API key in plain text:

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    },
    "secretOptions": [
      {
        "name": "apikey",
        "valueFrom": "<API_SECRET_ARN>"
      }
    ]
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.us3.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    },
    "secretOptions": [
    {
      "name": "apikey",
      "valueFrom": "<API_SECRET_ARN>"
    }
  ]
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.us5.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    },
    "secretOptions": [
    {
      "name": "apikey",
      "valueFrom": "<API_SECRET_ARN>"
    }
  ]
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.datadoghq.eu",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    },
    "secretOptions": [
    {
      "name": "apikey",
      "valueFrom": "<API_SECRET_ARN>"
    }
  ]
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.ap1.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    },
    "secretOptions": [
    {
      "name": "apikey",
      "valueFrom": "<API_SECRET_ARN>"
    }
  ]
  }
}

{
  "logConfiguration": {
    "logDriver": "awsfirelens",
    "options": {
      "Name": "datadog",
      "Host": "http-intake.logs.ddog-gov.datadoghq.com",
      "dd_service": "firelens-test",
      "dd_source": "redis",
      "dd_message_key": "log",
      "dd_tags": "region:us-west-2,project:fluentbit",
      "TLS": "on",
      "provider": "ecs"
    },
    "secretOptions": [
    {
      "name": "apikey",
      "valueFrom": "<API_SECRET_ARN>"
    }
  ]
  }
}

To provide your Datadog API key as a secret, see Using secrets.

The dd_service, dd_source, and dd_tags can be adjusted for your desired tags.
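
Because the per-site snippets above differ only in the Host value, they can be generated from one template. The sketch below does that, joins tags with commas as required by dd_tags, and keeps the API key in secretOptions. The host list is copied from the examples on this page. The function name is hypothetical.

```python
# Log intake hosts that appear in the examples above.
LOG_INTAKE_HOSTS = {
    "http-intake.logs.datadoghq.com",
    "http-intake.logs.us3.datadoghq.com",
    "http-intake.logs.us5.datadoghq.com",
    "http-intake.logs.datadoghq.eu",
    "http-intake.logs.ap1.datadoghq.com",
    "http-intake.logs.ddog-gov.datadoghq.com",
}


def firelens_log_configuration(host, service, source, tags, api_key_secret_arn):
    """Build a logConfiguration block for the awsfirelens log driver."""
    if host not in LOG_INTAKE_HOSTS:
        raise ValueError(f"unknown log intake host: {host}")
    if any("," in tag for tag in tags):
        raise ValueError("a tag must not contain a comma")
    return {
        "logConfiguration": {
            "logDriver": "awsfirelens",
            "options": {
                "Name": "datadog",
                "Host": host,
                "dd_service": service,
                "dd_source": source,
                "dd_message_key": "log",
                "dd_tags": ",".join(tags),  # comma-separated, no spaces
                "TLS": "on",
                "provider": "ecs",
            },
            # The API key is resolved from the secret at task start.
            "secretOptions": [
                {"name": "apikey", "valueFrom": api_key_secret_arn}
            ],
        }
    }
```

Merge the returned dictionary into the application container definition that should ship logs.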

Whenever a Fargate task runs, Fluent Bit sends the container logs to Datadog with information about all of the containers managed by your Fargate tasks. You can see the raw logs on the Log Explorer page, build monitors for the logs, and use the Live Container view.

Web UI

To add the Fluent Bit container to your existing Task Definition, check the Enable FireLens integration checkbox under Log router integration. This automatically creates the log_router container for you and pulls the regional image. Datadog recommends using the stable image tag instead of latest. Click Apply to create the base container. To further customize the firelensConfiguration, click the Configure via JSON button at the bottom and edit it manually.

After the container has been added, edit the application container in your Task Definition that you want to submit logs from. Change the Log driver to awsfirelens and fill in the Log options with the keys shown in the example above.

AWS CLI

Edit your existing JSON task definition file to include the log_router container and the updated logConfiguration for your application container, as described in the previous section. After this is done, create a new revision of your task definition with the following command:

aws ecs register-task-definition --cli-input-json file://<PATH_TO_FILE>/datadog-agent-ecs-fargate.json

AWS CloudFormation

To use AWS CloudFormation templating, use the AWS::ECS::TaskDefinition resource and set the Datadog option to configure log management.

For example, to configure Fluent Bit to send logs to Datadog (the templates below differ only in the Host value, one for each Datadog site):

Resources:
  ECSTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
          - FARGATE
      Cpu: 256
      Memory: 1GB
      ContainerDefinitions:
        - Name: tomcat-test
          Image: 'tomcat:jdk8-adoptopenjdk-openj9'
          LogConfiguration:
            LogDriver: awsfirelens
            Options:
              Name: datadog
              apikey: <DATADOG_API_KEY>
              Host: http-intake.logs.datadoghq.com
              dd_service: test-service
              dd_source: test-source
              TLS: 'on'
              provider: ecs
          MemoryReservation: 500
        - Name: log_router
          Image: 'amazon/aws-for-fluent-bit:stable'
          Essential: true
          FirelensConfiguration:
            Type: fluentbit
            Options:
              enable-ecs-log-metadata: true
          MemoryReservation: 50

Resources:
  ECSTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
          - FARGATE
      Cpu: 256
      Memory: 1GB
      ContainerDefinitions:
        - Name: tomcat-test
          Image: 'tomcat:jdk8-adoptopenjdk-openj9'
          LogConfiguration:
            LogDriver: awsfirelens
            Options:
              Name: datadog
              apikey: <DATADOG_API_KEY>
              Host: http-intake.logs.us3.datadoghq.com
              dd_service: test-service
              dd_source: test-source
              TLS: 'on'
              provider: ecs
          MemoryReservation: 500
        - Name: log_router
          Image: 'amazon/aws-for-fluent-bit:stable'
          Essential: true
          FirelensConfiguration:
            Type: fluentbit
            Options:
              enable-ecs-log-metadata: true
          MemoryReservation: 50

Resources:
  ECSTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
          - FARGATE
      Cpu: 256
      Memory: 1GB
      ContainerDefinitions:
        - Name: tomcat-test
          Image: 'tomcat:jdk8-adoptopenjdk-openj9'
          LogConfiguration:
            LogDriver: awsfirelens
            Options:
              Name: datadog
              apikey: <DATADOG_API_KEY>
              Host: http-intake.logs.us5.datadoghq.com
              dd_service: test-service
              dd_source: test-source
              TLS: 'on'
              provider: ecs
          MemoryReservation: 500
        - Name: log_router
          Image: 'amazon/aws-for-fluent-bit:stable'
          Essential: true
          FirelensConfiguration:
            Type: fluentbit
            Options:
              enable-ecs-log-metadata: true
          MemoryReservation: 50

Resources:
  ECSTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
          - FARGATE
      Cpu: 256
      Memory: 1GB
      ContainerDefinitions:
        - Name: tomcat-test
          Image: 'tomcat:jdk8-adoptopenjdk-openj9'
          LogConfiguration:
            LogDriver: awsfirelens
            Options:
              Name: datadog
              apikey: <DATADOG_API_KEY>
              Host: http-intake.logs.datadoghq.eu
              dd_service: test-service
              dd_source: test-source
              TLS: 'on'
              provider: ecs
          MemoryReservation: 500
        - Name: log_router
          Image: 'amazon/aws-for-fluent-bit:stable'
          Essential: true
          FirelensConfiguration:
            Type: fluentbit
            Options:
              enable-ecs-log-metadata: true
          MemoryReservation: 50

Resources:
  ECSTaskDefinition:
    Type: 'AWS::ECS::TaskDefinition'
    Properties:
      NetworkMode: awsvpc
      RequiresCompatibilities:
          - FARGATE
      Cpu: 256
      Memory: 1GB
      ContainerDefinitions:
        - Name: tomcat-test
          Image: 'tomcat:jdk8-adoptopenjdk-openj9'
          LogConfiguration:
            LogDriver: awsfirelens
            Options:
              Name: datadog
              apikey: <DATADOG_API_KEY>
              Host: http-intake.logs.ddog-gov.datadoghq.com
              dd_service: test-service
              dd_source: test-source
              TLS: 'on'
              provider: ecs
          MemoryReservation: 500
        - Name: log_router
          Image: 'amazon/aws-for-fluent-bit:stable'
          Essential: true
          FirelensConfiguration:
            Type: fluentbit
            Options:
              enable-ecs-log-metadata: true
          MemoryReservation: 50

For more information on CloudFormation templating and syntax, see the AWS CloudFormation documentation.

Datadog ECS Fargate CDK Construct

To enable logging through the Datadog ECS Fargate CDK construct, configure the logCollection property as seen below:

const ecsDatadog = new DatadogECSFargate({
  apiKey: <DATADOG_API_KEY>,
  site: <DATADOG_SITE>,
  logCollection: {
    isEnabled: true,
  }
});

Datadog ECS Fargate Terraform Module

To enable logging through the Datadog ECS Fargate Terraform module, configure the dd_log_collection input argument as seen below:

module "ecs_fargate_task" {
  source  = "DataDog/ecs-datadog/aws//modules/ecs_fargate"
  version = "1.0.0"

  # Configure Datadog
  dd_api_key = <DATADOG_API_KEY>
  dd_site    = <DATADOG_SITE>
  dd_log_collection = {
    enabled = true,
  }

  # Configure Task Definition
  family                   = <TASK_FAMILY>
  container_definitions    = <CONTAINER_DEFINITIONS>
  cpu                      = 256
  memory                   = 512
  network_mode             = "awsvpc"
  requires_compatibilities = ["FARGATE"]
}

Note: Use a TaskDefinition secret to avoid exposing the apikey in plain text.

AWS log driver

Monitor Fargate logs by using the awslogs log driver and a Lambda function to route logs to Datadog.

Define the log driver as awslogs in the application container in the task or job you want to collect logs from. Consult the AWS Fargate developer guide for instructions.
This configures your Fargate tasks or jobs to send log information to Amazon CloudWatch Logs. The following shows a snippet of a task/job definition where the awslogs log driver is configured:
```
{
  "logConfiguration": {
    "logDriver": "awslogs",
    "options": {
      "awslogs-group": "/ecs/fargate-task|job-definition",
      "awslogs-region": "us-east-1",
      "awslogs-stream-prefix": "ecs"
    }
  }
}
```
For more information about using the awslogs log driver in your task or job definitions to send container logs to CloudWatch Logs, see Using the awslogs Log Driver. This driver collects logs generated by the container and sends them to CloudWatch directly.
Finally, use the Datadog Lambda Log Forwarder function to collect logs from CloudWatch and send them to Datadog. To automatically enrich logs with ECS tags (task_arn, service_arn, cluster_arn, …), ensure the following configuration:
1. The CloudWatch Log Group must be named /ecs/<ECS_CLUSTER_NAME>.
2. The Log Stream must follow the default naming format: <awslogs-stream-prefix>/<container_name>/<task_id>.
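
The two naming rules above can be checked before deploying. The following sketch derives the expected log group name for a cluster and splits a log stream name into its container name and task ID. The function names and sample values are hypothetical.

```python
import re


def expected_log_group(cluster_name):
    """Log group name required for automatic ECS tag enrichment."""
    return f"/ecs/{cluster_name}"


def parse_log_stream(stream_name, stream_prefix):
    """Split '<awslogs-stream-prefix>/<container_name>/<task_id>'.

    Returns (container_name, task_id), or None when the stream name
    does not follow the default format.
    """
    pattern = re.compile(
        rf"^{re.escape(stream_prefix)}/(?P<container>[^/]+)/(?P<task_id>[^/]+)$"
    )
    match = pattern.match(stream_name)
    if match is None:
        return None
    return match["container"], match["task_id"]
```

A `None` result from `parse_log_stream` indicates a stream name that does not follow the default format, so logs from it may not receive the ECS tags.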

Trace collection

Instrument your application based on your setup:
Note: With Fargate APM applications, do not set DD_AGENT_HOST - the default of localhost works.
Language
Java
Python
Ruby
Go
Node.js
PHP
C++
.NET Core
.NET Framework
See more general information about Sending Traces to Datadog.
Ensure your application is running in the same task or job definition as the Datadog Agent container.

Process collection

You can view your ECS Fargate processes in Datadog. To see their relationship to ECS Fargate containers, use the Datadog Agent v7.50.0 or later.

You can monitor processes in ECS Fargate in Datadog by using the Live Processes page. To enable process collection, add the PidMode parameter in the Task Definition and set it to task as follows:

"pidMode": "task"

To filter processes by ECS, use the AWS Fargate Containers facet or enter fargate:ecs in the search query on the Live Processes page.
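If you manage task definitions as JSON files, the pidMode setting can be added programmatically. The sketch below assumes the task definition has already been loaded into a dictionary; the function name `enable_process_collection` and the sample `family` value are hypothetical.

```python
import copy
import json


def enable_process_collection(task_def: dict) -> dict:
    """Return a copy of the task definition with pidMode set to "task"."""
    updated = copy.deepcopy(task_def)  # leave the caller's dictionary untouched
    updated["pidMode"] = "task"        # pidMode is a top-level task definition parameter
    return updated


if __name__ == "__main__":
    original = {"family": "example-task", "containerDefinitions": []}
    print(json.dumps(enable_process_collection(original), indent=2))
```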

`Out-of-the-box tags`

The Agent can autodiscover and attach tags to all data emitted by the entire task or an individual container within this task or job. The list of tags automatically attached depends on the Agent’s cardinality configuration.

Note: Set the env and service tags in your task definition to get the full benefits of Datadog’s unified service tagging. See the full configuration section of the unified service tagging documentation for instructions.

Tag	Cardinality	Source
`container_name`	High	ECS API
`container_id`	High	ECS API
`docker_image`	Low	ECS API
`image_name`	Low	ECS API
`short_image`	Low	ECS API
`image_tag`	Low	ECS API
`ecs_cluster_name`	Low	ECS API
`ecs_container_name`	Low	ECS API
`task_arn`	Orchestrator	ECS API
`task_family`	Low	ECS API
`task_name`	Low	ECS API
`task_version`	Low	ECS API
`availability-zone`	Low	ECS API
`region`	Low	ECS API
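To see which tags from the table are attached at a given cardinality setting, the table can be modeled as a lookup. This sketch assumes that cardinality levels are cumulative in the order low, orchestrator, high (a higher setting also includes the tags of the lower levels); the names `TAG_CARDINALITY`, `LEVELS`, and `tags_at` are hypothetical.

```python
# Tag -> cardinality, copied from the table above.
TAG_CARDINALITY = {
    "container_name": "high",
    "container_id": "high",
    "docker_image": "low",
    "image_name": "low",
    "short_image": "low",
    "image_tag": "low",
    "ecs_cluster_name": "low",
    "ecs_container_name": "low",
    "task_arn": "orchestrator",
    "task_family": "low",
    "task_name": "low",
    "task_version": "low",
    "availability-zone": "low",
    "region": "low",
}

# Assumed ordering: each level also includes the tags of the levels before it.
LEVELS = ["low", "orchestrator", "high"]


def tags_at(cardinality: str) -> list[str]:
    """Return the sorted tags attached when the Agent runs at the given cardinality."""
    max_rank = LEVELS.index(cardinality)
    return sorted(
        tag for tag, level in TAG_CARDINALITY.items() if LEVELS.index(level) <= max_rank
    )
```

Under this assumption, `tags_at("low")` omits `task_arn`, `container_name`, and `container_id`, while `tags_at("high")` returns every tag in the table.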

`Data Collected`

`Metrics`


ecs.fargate.cpu.limit (gauge)	Soft limit (CPU Shares) in CPU Units.
ecs.fargate.cpu.percent (gauge)	Percentage of CPU used per container (Linux only). Shown as percent
ecs.fargate.cpu.system (gauge)	System CPU time. Shown as nanocore
ecs.fargate.cpu.task.limit (gauge)	Task CPU Limit (shared by all containers). Shown as nanocore
ecs.fargate.cpu.usage (gauge)	Total CPU Usage. Shown as nanocore
ecs.fargate.cpu.user (gauge)	User CPU time. Shown as nanocore
ecs.fargate.ephemeral_storage.reserved (gauge)	The reserved ephemeral storage of this task. (Fargate 1.4.0+ required). Shown as mebibyte
ecs.fargate.ephemeral_storage.utilized (gauge)	The current ephemeral storage usage of this task. (Fargate 1.4.0+ required). Shown as mebibyte
ecs.fargate.io.bytes.read (gauge)	Number of bytes read on the disk. Shown as byte
ecs.fargate.io.bytes.write (gauge)	Number of bytes written to the disk. Shown as byte
ecs.fargate.io.ops.read (gauge)	Number of read operations on the disk.
ecs.fargate.io.ops.write (gauge)	Number of write operations to the disk.
ecs.fargate.mem.active_anon (gauge)	Number of bytes of anonymous and swap cache memory on active LRU list (Linux only). Shown as byte
ecs.fargate.mem.active_file (gauge)	Number of bytes of file-backed memory on active LRU list (Linux only). Shown as byte
ecs.fargate.mem.cache (gauge)	Number of bytes of page cache memory (Linux only). Shown as byte
ecs.fargate.mem.hierarchical_memory_limit (gauge)	Number of bytes of memory limit with regard to hierarchy under which the memory cgroup is (Linux only). Shown as byte
ecs.fargate.mem.hierarchical_memsw_limit (gauge)	Number of bytes of memory+swap limit with regard to hierarchy under which memory cgroup is (Linux only). Shown as byte
ecs.fargate.mem.inactive_file (gauge)	Number of bytes of file-backed memory on inactive LRU list (Linux only). Shown as byte
ecs.fargate.mem.limit (gauge)	Number of bytes memory limit (Linux only). Shown as byte
ecs.fargate.mem.mapped_file (gauge)	Number of bytes of mapped file (includes tmpfs/shmem) (Linux only). Shown as byte
ecs.fargate.mem.max_usage (gauge)	Maximum memory usage recorded. Shown as byte
ecs.fargate.mem.pgfault (gauge)	Number of page faults per second (Linux only).
ecs.fargate.mem.pgmajfault (gauge)	Number of major page faults per second (Linux only).
ecs.fargate.mem.pgpgin (gauge)	Number of charging events to the memory cgroup. The charging event happens each time a page is accounted as either mapped anon page(RSS) or cache page(Page Cache) to the cgroup (Linux only).
ecs.fargate.mem.pgpgout (gauge)	Number of uncharging events to the memory cgroup. The uncharging event happens each time a page is unaccounted from the cgroup (Linux only).
ecs.fargate.mem.rss (gauge)	Number of bytes of anonymous and swap cache memory (includes transparent hugepages) (Linux only). Shown as byte
ecs.fargate.mem.task.limit (gauge)	Task Memory Limit (shared by all containers). Shown as byte
ecs.fargate.mem.usage (gauge)	Number of bytes of memory used. Shown as byte
ecs.fargate.net.bytes_rcvd (gauge)	Number of bytes received (Fargate 1.4.0+ required). Shown as byte
ecs.fargate.net.bytes_sent (gauge)	Number of bytes sent (Fargate 1.4.0+ required). Shown as byte
ecs.fargate.net.packet.in_dropped (gauge)	Number of incoming packets dropped (Fargate 1.4.0+ required). Shown as packet
ecs.fargate.net.packet.out_dropped (gauge)	Number of outgoing packets dropped (Fargate 1.4.0+ required). Shown as packet
ecs.fargate.net.rcvd_errors (gauge)	Number of received errors (Fargate 1.4.0+ required). Shown as error
ecs.fargate.net.sent_errors (gauge)	Number of sent errors (Fargate 1.4.0+ required). Shown as error
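Several of these metrics are reported in nanocores or bytes, so a ratio against the matching limit is often easier to read than the raw value. The sketch below shows the arithmetic only, assuming you have already retrieved a usage value and its limit (for example, ecs.fargate.cpu.usage and ecs.fargate.cpu.task.limit, or ecs.fargate.mem.usage and ecs.fargate.mem.task.limit). The function names are hypothetical, and the numbers in the usage note are made up.

```python
NANOCORES_PER_CORE = 1_000_000_000  # 1 core = 10^9 nanocores


def nanocores_to_cores(nanocores: float) -> float:
    """Convert a nanocore value to whole CPU cores."""
    return nanocores / NANOCORES_PER_CORE


def utilization_percent(used: float, limit: float) -> float:
    """Return used/limit as a percentage; both values must share the same unit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return 100.0 * used / limit
```

For instance, a usage of 250,000,000 nanocores corresponds to 0.25 cores, and against a task limit of 1,000,000,000 nanocores it is 25 percent utilization.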

`Events`

The ECS Fargate check does not include any events.

`Service Checks`

fargate_check

Returns CRITICAL if the Agent is unable to connect to Fargate, otherwise returns OK.

Statuses: ok, critical

`Troubleshooting`

`Agent does not start on a read-only filesystem`

If you experience issues starting the Agent on a filesystem with the setting "readonlyRootFilesystem": true, follow either of the approaches below to remediate this:

Use a Dockerfile like the example below to add the volume at the necessary path, and copy over the existing datadog.yaml file. The datadog.yaml file can have any content or be empty, but it must be present.

FROM gcr.io/datadoghq/agent:latest
VOLUME /etc/datadog-agent
ADD datadog.yaml /etc/datadog-agent/datadog.yaml

Build the container image. Datadog recommends tagging it with the version and type; for example, docker.io/example/agent:7.62.2-rofs (read only file system).
Reference the image in your task definition, as shown in the example below.
Set "readonlyRootFilesystem": true on the Agent container, as shown in the example below.

    "containerDefinitions": [
        {
            "name": "datadog-agent",
            "image": "docker.io/example/agent:7.62.2-rofs",
            ...
            "environment": [
                {
                    "name": "ECS_FARGATE",
                    "value": "true"
                },
                {
                    "name": "DD_API_KEY",
                    "value": "<API_KEY>"
                }
            ],
            "readonlyRootFilesystem": true
        },
        {
            "name": "example-app-container",
            "image": "example-image",
            ...
        }
    ]

If you cannot build a custom Agent image, you can follow the steps below to add an empty volume dynamically to the Agent.

This configuration deletes all the preexisting files in the /etc/datadog-agent folder, including:
- All the Autodiscovery config files (/auto_conf.yaml)
- JMX metrics.yaml files
- The main ECS Fargate /etc/datadog-agent/conf.d/ecs_fargate.d/conf.yaml.default file

As such, you must set up the integration with Autodiscovery Docker labels on the Datadog Agent container. This requires setting the ignore_autodiscovery_tags: true flag in the check configuration. Otherwise, metrics from the app container are double-tagged with the Agent container's tags.

Create an empty volume for the Agent container to use. In the example below, this is named agent_conf.
Add this volume to the Agent’s task definition.
Set "readonlyRootFilesystem": true on the Agent container.
Add dockerLabels to have the Agent start the ecs_fargate check manually.

The example below displays this configuration:

    "containerDefinitions": [
        {
            "name": "datadog-agent",
            "image": "public.ecr.aws/datadog/agent:latest",
            ...
            "environment": [
                {
                    "name": "ECS_FARGATE",
                    "value": "true"
                },
                {
                    "name": "DD_API_KEY",
                    "value": "<API_KEY>"
                }
            ],
            "mountPoints": [
                {
                    "sourceVolume": "agent_conf",
                    "containerPath": "/etc/datadog-agent",
                    "readOnly": false
                }
            ],
            "readonlyRootFilesystem": true,
            "dockerLabels": {
                "com.datadoghq.ad.checks": "{\"ecs_fargate\":{\"ignore_autodiscovery_tags\":true,\"instances\":[{}]}}"
            }
        },
        {
            "name": "example-app-container",
            "image": "example-image",
            ...
        }
    ],
    "volumes": [
        {
            "name": "agent_conf",
            "host": {}
        }
    ]
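The com.datadoghq.ad.checks label is a JSON document embedded as a string inside the task definition, which makes the escaping easy to get wrong by hand. The following sketch generates the Agent container settings from the example above and lets `json.dumps` handle the escaping. The function name `agent_container_overrides` is hypothetical; the keys and values mirror the example and are not an exhaustive Agent configuration.

```python
import json


def agent_container_overrides(volume_name: str = "agent_conf") -> dict:
    """Build the read-only filesystem settings for the Agent container."""
    # Check configuration that starts ecs_fargate through Autodiscovery labels.
    check_config = {
        "ecs_fargate": {"ignore_autodiscovery_tags": True, "instances": [{}]}
    }
    return {
        "mountPoints": [
            {
                "sourceVolume": volume_name,
                "containerPath": "/etc/datadog-agent",
                "readOnly": False,
            }
        ],
        "readonlyRootFilesystem": True,
        # Serialize compactly; the outer json.dumps call escapes the inner quotes.
        "dockerLabels": {
            "com.datadoghq.ad.checks": json.dumps(check_config, separators=(",", ":"))
        },
    }


if __name__ == "__main__":
    print(json.dumps(agent_container_overrides(), indent=4))
```

Merge the returned dictionary into the Agent's container definition, and declare a volume with the same name in the task definition's volumes list.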

Need help? Contact Datadog support.

`Further Reading`

Additional helpful documentation, links, and articles:

- Monitor ECS applications on AWS Fargate with Datadog (Blog)
- Integration Setup for ECS Fargate (Documentation)
- Monitor your Fargate container logs with FireLens and Datadog (Blog)
- Key metrics for monitoring AWS Fargate (Blog)
- How to collect metrics and logs from AWS Fargate workloads (Blog)
- AWS Fargate monitoring with Datadog (Blog)