Amazon ECS

Overview

Amazon ECS is a scalable, high-performance container orchestration service that supports Docker containers. With the Datadog Agent, you can monitor ECS containers and tasks on every EC2 instance in your cluster.

This page covers Amazon ECS setup with the Datadog Container Agent.

Note: To set up ECS on Fargate, see the Amazon ECS on AWS Fargate instructions. The Datadog Agent container deployed on EC2 instances cannot monitor Fargate tasks. Additionally, AWS Batch is not supported.

Setup

Deploy the Datadog Agent as a container once on every EC2 instance in your ECS cluster. To do this, create a Task Definition for the Datadog Agent container and deploy it as a Daemon service. Each Datadog Agent container then monitors the other containers on its EC2 instance.

If you don’t have a working EC2 Container Service cluster configured, review the Getting Started section in the ECS documentation to set up and configure a cluster. Once configured, follow the setup instructions below.

  1. Create and add an ECS Task Definition
  2. Schedule the Datadog Agent as a Daemon Service
  3. Optional: Set up additional Datadog Agent features

Note: Datadog’s Autodiscovery can be used in conjunction with ECS and Docker to automatically discover and monitor running tasks in your environment.

Create an ECS task

The Task Definition launches the Datadog Agent container with the necessary configuration. When you need to modify the Agent configuration, update this Task Definition and redeploy the Daemon Service. You can configure the Task Definition using either the AWS CLI or the AWS console.

The following sample is a minimal configuration for core infrastructure monitoring. Additional Task Definition samples with various features enabled are provided in the Setup additional Agent features section.

Managing the task definition file

  1. For Linux containers, download datadog-agent-ecs.json

    1. If you are using an original Amazon Linux 1 AMI, use datadog-agent-ecs1.json
    2. If you are using Windows, use datadog-agent-ecs-win.json
  2. Edit your base Task Definition file

    1. Replace <YOUR_DATADOG_API_KEY> with the Datadog API key for your account.

    2. Set the DD_SITE environment variable to your Datadog site.

      Note: If the DD_SITE environment variable is not explicitly set, it defaults to the US site datadoghq.com. If you are using one of the other sites (EU, US3, or US1-FED) and do not set this, it results in an invalid API key message. Use the documentation site selector to see documentation appropriate for the site you’re using.

  3. Optional: Add the following to your ECS task definition to deploy on an ECS Anywhere cluster:

    "requiresCompatibilities": ["EXTERNAL"]
    
  4. Optional: Add an Agent health check to your ECS Task Definition:

    "healthCheck": {
      "retries": 3,
      "command": ["CMD-SHELL","agent health"],
      "timeout": 5,
      "interval": 30,
      "startPeriod": 15
    }
    

For all of these examples, the DD_API_KEY environment variable can alternatively be populated by referencing the ARN of a “Plaintext” secret stored in AWS Secrets Manager. Additional tags can be added with the DD_TAGS environment variable.
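
As a rough illustration of the secret-based alternative, the Python sketch below rewrites a container definition so that DD_API_KEY comes from the container's `secrets` list (a `name`/`valueFrom` pair) instead of a plain-text environment entry, and sets DD_TAGS as a space-separated string. The helper name, the ARN, and the tag values are placeholders invented for this example. It also assumes the task execution role is allowed to read the secret.

```python
import copy
import json


def use_secret_api_key(container_def, secret_arn, tags=()):
    """Return a copy of container_def with DD_API_KEY sourced from a secret.

    container_def: one entry of "containerDefinitions" (a dict).
    secret_arn: ARN of the Secrets Manager secret (placeholder in this sketch).
    tags: iterable of "key:value" strings to expose through DD_TAGS.
    """
    updated = copy.deepcopy(container_def)

    # Drop any plain-text DD_API_KEY / DD_TAGS entries so values are not duplicated.
    env = [e for e in updated.get("environment", [])
           if e["name"] not in ("DD_API_KEY", "DD_TAGS")]
    if tags:
        env.append({"name": "DD_TAGS", "value": " ".join(tags)})
    updated["environment"] = env

    # Reference the secret by ARN; ECS injects its value as DD_API_KEY at start-up.
    secrets = [s for s in updated.get("secrets", []) if s["name"] != "DD_API_KEY"]
    secrets.append({"name": "DD_API_KEY", "valueFrom": secret_arn})
    updated["secrets"] = secrets
    return updated


example = use_secret_api_key(
    {"name": "datadog-agent",
     "environment": [{"name": "DD_API_KEY", "value": "<YOUR_DATADOG_API_KEY>"},
                     {"name": "DD_SITE", "value": "datadoghq.com"}]},
    # Hypothetical ARN, for illustration only.
    "arn:aws:secretsmanager:us-east-1:123456789012:secret:datadog-api-key",
    tags=["env:staging", "team:platform"],
)
print(json.dumps(example, indent=2))
```

The original dictionary is left untouched, so the result can be compared against the downloaded file before it is written back.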

Registering the task definition

Once you have created your Task Definition file, you can register it in AWS with the following AWS CLI command:

aws ecs register-task-definition --cli-input-json file://<path to datadog-agent-ecs.json>

Alternatively, once you have created your Task Definition file, you can register it from the AWS console:

  1. Log in to your AWS Console and navigate to the Elastic Container Service section.
  2. Click on Task Definitions on the left side and click the button Create new Task Definition.
  3. Choose “EC2” as the launch type. Alternatively, choose “External” if you plan to deploy the Agent task on an ECS Anywhere cluster.
  4. On the “Configure task and container definitions” page, scroll to the bottom and select Configure via JSON. From here you can copy and paste the configuration from your file.
  5. Click Save on the JSON tab.
  6. Make any additional changes on this page or by repeating the Configure via JSON process.
  7. Click Create at the bottom to register this Task Definition.

Run the Agent as a daemon service

Ideally, you want one running Datadog Agent container on each EC2 instance. The easiest way to achieve this is to run the Datadog Agent Task Definition as a Daemon Service.

Schedule a daemon service in AWS using Datadog’s ECS task

  1. Log in to the AWS console and navigate to the ECS Clusters section. Click the cluster you want to run the Agent on.
  2. Create a new service by clicking the Create button under Services.
  3. For launch type, select EC2, then select the task definition created previously.
  4. For service type, select DAEMON, and enter a Service name. Click Next.
  5. Since the service runs once on each instance, you don’t need a load balancer. Select None. Click Next.
  6. Daemon services don’t need Auto Scaling, so click Next Step, and then Create Service.
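
The console steps above can also be approximated from the AWS CLI. The Python sketch below only assembles the argument list for `aws ecs create-service` with the DAEMON scheduling strategy and the EC2 launch type. The cluster and service names are placeholders, and `datadog-agent-task` is the family name used in the sample task definition on this page; substitute your own values.

```python
import shlex


def daemon_service_command(cluster, service_name, task_definition):
    """Build the argv for creating a daemon service on EC2 container instances."""
    return [
        "aws", "ecs", "create-service",
        "--cluster", cluster,
        "--service-name", service_name,
        "--task-definition", task_definition,
        # DAEMON places one task on each active container instance,
        # so no desired count is passed.
        "--scheduling-strategy", "DAEMON",
        "--launch-type", "EC2",
    ]


argv = daemon_service_command("my-cluster", "datadog-agent", "datadog-agent-task")
print(shlex.join(argv))
```

Passing the list to `subprocess.run(argv, check=True)` would execute it, provided the AWS CLI is installed and credentials are configured.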

Setup Additional Agent Features

The initial Task Definition provided above is fairly minimal. It deploys an Agent container with a base configuration to collect core metrics about the containers in your ECS cluster. This Agent can also run Agent integrations based on Docker Autodiscovery labels found on your containers.

DogStatsD

If you’re using DogStatsD, add a host port mapping for 8125/udp to your Datadog Agent’s container definition:

"portMappings": [
  {
    "hostPort": 8125,
    "protocol": "udp",
    "containerPort": 8125
  }
]

Also set the environment variable DD_DOGSTATSD_NON_LOCAL_TRAFFIC to true.

For APM and DogStatsD, double-check the security group settings on your EC2 instances. Make sure these ports are not open to the public. Datadog recommends using the host’s private IP to route data from the application containers to the Datadog Agent container.
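
To illustrate the application side, here is a minimal sketch of a container sending a counter to the Agent over UDP port 8125 using only the Python standard library. The datagram follows the DogStatsD `name:value|type|#tags` text format. The metric name and tag are placeholders, and 127.0.0.1 is used only so the snippet runs anywhere; in an ECS task the address would be the host’s private IP.

```python
import socket


def format_metric(name, value, metric_type="c", tags=()):
    """Render one DogStatsD datagram, e.g. 'page.views:1|c|#env:staging'."""
    payload = f"{name}:{value}|{metric_type}"
    if tags:
        payload += "|#" + ",".join(tags)
    return payload


def send_metric(agent_host, payload, port=8125):
    """Fire-and-forget a datagram to the Agent's DogStatsD port over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload.encode("utf-8"), (agent_host, port))


datagram = format_metric("page.views", 1, tags=["env:staging"])
# Stand-in address; replace with the EC2 host's private IP in a real task.
send_metric("127.0.0.1", datagram)
```

UDP gives no delivery confirmation, so a closed port or a blocked security group fails silently; checking for the metric in Datadog is the practical way to confirm the path works.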

Process collection

Live Container data is automatically collected by the Datadog Agent container. To collect Live Process information for all your containers and send it to Datadog, update your Task Definition with the environment variable:

{
  "name": "DD_PROCESS_AGENT_ENABLED",
  "value": "true"
}

Network Performance Monitoring collection

This feature is available for Linux only.

  1. Follow the above instructions to install the Datadog Agent.
  2. If you already have a task definition, update your datadog-agent-ecs.json file (datadog-agent-ecs1.json if you are using an original Amazon Linux AMI) with the following configuration:
{
  "containerDefinitions": [
    (...)
      "mountPoints": [
        (...)
        {
          "containerPath": "/sys/kernel/debug",
          "sourceVolume": "debug"
        },
        (...)
      ],
      "environment": [
        (...)
        {
          "name": "DD_SYSTEM_PROBE_ENABLED",
          "value": "true"
        }
      ],
      "linuxParameters": {
        "capabilities": {
          "add": [
            "SYS_ADMIN",
            "SYS_RESOURCE",
            "SYS_PTRACE",
            "NET_ADMIN",
            "NET_BROADCAST",
            "NET_RAW",
            "IPC_LOCK",
            "CHOWN"
          ]
        }
      }
  ],
  "requiresCompatibilities": [
    "EC2"
  ],
  "volumes": [
    (...)
    {
      "host": {
        "sourcePath": "/sys/kernel/debug"
      },
      "name": "debug"
    },
    (...)
  ],
  "family": "datadog-agent-task"
}
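
Because this change touches several parts of the task definition, a quick sanity check before registering can help. The sketch below is a hypothetical helper, not part of any Datadog or AWS tooling, that reports which of the settings shown above are missing from a parsed task definition. It assumes the Agent is the first entry in `containerDefinitions` unless another index is given.

```python
REQUIRED_CAPABILITIES = {
    "SYS_ADMIN", "SYS_RESOURCE", "SYS_PTRACE", "NET_ADMIN",
    "NET_BROADCAST", "NET_RAW", "IPC_LOCK", "CHOWN",
}
DEBUG_PATH = "/sys/kernel/debug"


def missing_npm_settings(task_def, container_index=0):
    """Return a list of human-readable problems; an empty list means all present."""
    problems = []
    container = task_def["containerDefinitions"][container_index]

    env = {e["name"]: e["value"] for e in container.get("environment", [])}
    if env.get("DD_SYSTEM_PROBE_ENABLED") != "true":
        problems.append("DD_SYSTEM_PROBE_ENABLED is not set to true")

    # The debug filesystem must be declared as a host volume and mounted at the same path.
    volumes = {v["name"] for v in task_def.get("volumes", [])
               if v.get("host", {}).get("sourcePath") == DEBUG_PATH}
    mounted = any(m.get("containerPath") == DEBUG_PATH and m.get("sourceVolume") in volumes
                  for m in container.get("mountPoints", []))
    if not mounted:
        problems.append(f"{DEBUG_PATH} is not mounted from a host volume")

    added = set(container.get("linuxParameters", {}).get("capabilities", {}).get("add", []))
    for cap in sorted(REQUIRED_CAPABILITIES - added):
        problems.append(f"missing capability {cap}")
    return problems


# Example input that deliberately omits the CHOWN capability.
incomplete = {
    "containerDefinitions": [{
        "name": "datadog-agent",
        "environment": [{"name": "DD_SYSTEM_PROBE_ENABLED", "value": "true"}],
        "mountPoints": [{"containerPath": DEBUG_PATH, "sourceVolume": "debug"}],
        "linuxParameters": {"capabilities": {"add": sorted(REQUIRED_CAPABILITIES - {"CHOWN"})}},
    }],
    "volumes": [{"name": "debug", "host": {"sourcePath": DEBUG_PATH}}],
}
print(missing_npm_settings(incomplete))
```

A real file can be checked by loading it with `json.load` and passing the resulting dictionary to the same function.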

AWSVPC mode

For Agent v6.10+, awsvpc mode is supported for application containers, provided that security groups are set to allow the host instance’s security group to reach the application containers on the relevant ports.

While it’s possible to run the Agent in awsvpc mode, it’s not the recommended setup, because it may be difficult to retrieve the ENI IP to reach the Agent for DogStatsD metrics and APM traces.

Instead, run the Agent in bridge mode with port mapping to allow easier retrieval of host IP through the metadata server.
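
As a sketch of that lookup, an application container could ask the EC2 instance metadata service for the host’s private IP and use it as the Agent address. The example assumes the metadata endpoint at 169.254.169.254 is reachable from the container and that unauthenticated (IMDSv1-style) requests are allowed; instances that enforce IMDSv2 need a session token first, which is omitted here. The `fetch` parameter is an illustrative hook so the function can be exercised without network access, and port 8126 is the Agent’s default trace port, assumed to be mapped the same way as 8125.

```python
import urllib.request

METADATA_URL = "http://169.254.169.254/latest/meta-data/local-ipv4"


def _http_get(url, timeout=2):
    """Plain GET against the metadata endpoint; returns the body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def agent_endpoints(fetch=_http_get):
    """Resolve the host's private IP and derive the Agent's DogStatsD and APM addresses."""
    host_ip = fetch(METADATA_URL).strip()
    return {
        "dogstatsd": (host_ip, 8125),  # UDP port mapped in the Agent task definition
        "apm": (host_ip, 8126),        # default trace port; assumes it is mapped as well
    }


# Offline demonstration with a stubbed metadata response (placeholder IP).
print(agent_endpoints(fetch=lambda url: "10.0.0.12\n"))
```

Calling `agent_endpoints()` with no arguments performs the real HTTP request, which only succeeds on an EC2 instance.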

Troubleshooting

Need help? Contact Datadog support.