Amazon EC2

Overview

Amazon Elastic Compute Cloud (Amazon EC2) is a web service that provides resizable compute capacity in the cloud. It is designed to make web-scale cloud computing easier for developers.

Enable this integration to see all your EC2 metrics in Datadog, along with additional events such as scheduled maintenance.

Setup

Installation

If you haven’t already, set up the Amazon Web Services integration first.

Configuration

  1. In the AWS integration page, ensure that EC2 is enabled under the Metric Collection tab.

  2. Add the following required permissions to your Datadog IAM policy in order to collect Amazon EC2 metrics. For more information, see the EC2 policies on the AWS website.

    AWS Permission              Description
    ec2:DescribeInstanceStatus  Used by the ELB integration to assert the health of an instance. Used by the EC2 integration to describe the health of all instances.
    ec2:DescribeSecurityGroups  Adds SecurityGroup names and custom tags to EC2 instances.
    ec2:DescribeInstances       Adds tags to EC2 instances and EC2 CloudWatch metrics.

  3. Install the Datadog - Amazon EC2 integration.
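
The permissions from step 2 can be expressed as a single IAM policy statement. The following is a minimal sketch that builds such a statement as JSON. The helper name build_ec2_policy and the Sid value are illustrative assumptions, and Resource is set to "*" on the assumption that these describe actions are not scoped to individual resources. Merge the statement into your existing Datadog IAM policy rather than replacing the policy with it.

```python
import json

# Permissions listed in step 2 of the configuration.
EC2_ACTIONS = [
    "ec2:DescribeInstanceStatus",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeInstances",
]


def build_ec2_policy(actions=EC2_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DatadogEC2Read",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: describe actions are not resource-scoped
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy so it can be pasted into the IAM console.
    print(json.dumps(build_ec2_policy(), indent=2))
```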

Note: If you want to monitor a subset of your EC2 instances with Datadog, assign an AWS tag, such as datadog:true, to those EC2 instances. Then specify that tag in the Limit metric collection to specific resources textbox under the Metric Collection tab in your Datadog AWS integration page.
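
Before entering a tag in the textbox, it can help to preview which instances actually carry it. The sketch below does not reproduce Datadog's filtering logic; it only performs an exact key/value match over records shaped like the output of ec2:DescribeInstances. The function name and the sample instance IDs are hypothetical.

```python
def instances_with_tag(instances, key, value):
    """Return the IDs of instances whose Tags contain an exact key/value match."""
    matched = []
    for instance in instances:
        # DescribeInstances returns tags as a list of {"Key": ..., "Value": ...} dicts.
        tags = {tag["Key"]: tag["Value"] for tag in instance.get("Tags", [])}
        if tags.get(key) == value:
            matched.append(instance["InstanceId"])
    return matched


# Hypothetical sample records for illustration only.
sample = [
    {"InstanceId": "i-0aaa", "Tags": [{"Key": "datadog", "Value": "true"}]},
    {"InstanceId": "i-0bbb", "Tags": [{"Key": "env", "Value": "dev"}]},
    {"InstanceId": "i-0ccc"},  # untagged instance
]

if __name__ == "__main__":
    print(instances_with_tag(sample, "datadog", "true"))
```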

EC2 automuting

Based on host statuses from the CloudWatch API, Datadog can proactively mute monitors related to the manual shutdown of EC2 instances and to instance termination triggered by AWS autoscaling. Automuted EC2 instances are listed on the Monitor Downtime page by checking Show automatically muted hosts.

Note: The EC2 integration must be installed for automuting to take effect. If metrics collection is limited to hosts with tags, only instances matching the tags are automuted.

To silence monitors for expected EC2 instance shutdowns, check the EC2 automuting box in the AWS integration page.

Install the Agent

Datadog provides two approaches for setting up the Datadog Agent on EC2 instances. See Why should I install the Datadog Agent on my cloud instances? to learn the benefit of installing the Agent on your Amazon EC2 instances.

Follow the steps below to install the Datadog Agent on EC2 instances with AWS Systems Manager.

  1. Configure the IAM role on your EC2 instances so that the AmazonSSMManagedInstanceCore permission is enabled.

  2. Navigate to the Documents tab in AWS Systems Manager.

  3. Search for datadog. Note: You may need to find the correct document for your region by switching regions in the top navigation bar of the AWS Management Console.

  4. Choose either the Linux or Windows document depending on your needs.

    • Linux: datadog-agent-installation-linux
    • Windows: datadog-agent-installation-windows

  5. Fill in the command parameters.

  6. Select the target instances to install the Agent on.

  7. Click Run.

  8. Wait for the confirmation status to finish, then check the Infrastructure list in Datadog.
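
The console steps can also be scripted. The following is a rough sketch using the boto3 SSM client's send_command call, assuming the document can be referenced by the name shown above in your region. The helper names are hypothetical, and the parameter names accepted by the document are not listed on this page, so check the document's parameter list in the console before filling them in.

```python
def build_send_command_request(instance_ids, document_name, parameters):
    """Build keyword arguments for the SSM SendCommand API."""
    return {
        "DocumentName": document_name,
        "Targets": [{"Key": "InstanceIds", "Values": list(instance_ids)}],
        # SSM expects each parameter value as a list of strings.
        "Parameters": {name: [str(value)] for name, value in parameters.items()},
    }


def run_install(instance_ids, parameters,
                document_name="datadog-agent-installation-linux"):
    """Send the command and return its ID. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    ssm = boto3.client("ssm")
    response = ssm.send_command(
        **build_send_command_request(instance_ids, document_name, parameters)
    )
    return response["Command"]["CommandId"]
```

Keeping the request builder separate from the API call makes it possible to review the request before anything is sent to AWS.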

Alternative custom Agent installation

Parameter store

In the Parameter store, create a parameter with:

  • Name: dd-api-key-for-ssm
  • Description: (Optional)
  • Type: SecureString
  • KMS key source: My current account
  • KMS Key ID: Use the default value selected
  • Value: Your Datadog API key
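
The same parameter can be created programmatically. This is a minimal sketch using the boto3 SSM client's put_parameter call; the helper names are hypothetical. No KMS key ID is passed, on the assumption that the account's default key is wanted, which matches the default selection described above.

```python
def build_put_parameter_request(api_key, name="dd-api-key-for-ssm", description=None):
    """Build keyword arguments for the SSM PutParameter API."""
    request = {
        "Name": name,
        "Type": "SecureString",  # stored encrypted with the account's default KMS key
        "Value": api_key,
    }
    if description:
        request["Description"] = description  # optional, as in the console form
    return request


def store_api_key(api_key):
    """Create the parameter. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    ssm = boto3.client("ssm")
    return ssm.put_parameter(**build_put_parameter_request(api_key))
```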

Documents

In Systems Manager, create a new document:

  • Name: dd-agent-install
  • Target type: (Optional)
  • Document type: Command document
  • Content: JSON

If you are on the Datadog US site, use the file dd-agent-install-us-site.json, updated with your <AWS_REGION> (such as us-east-1) under runCommand. If you are on the Datadog EU site, use dd-agent-install-eu-site.json instead.

Run command

Under Run Command, click the Run command button and follow the steps below:

  • Command document:
    • Click on the search box and select Owner -> Owned by me.
    • Click the radio button next to your document.
    • If necessary, choose the Document version.
  • Targets:
    • Select the EC2 instance to target.
  • Output options (optional):
    • Select the CloudWatch output checkbox to log any issues.
  • Other sections (optional):
    • Modify other sections as needed for your setup.

Click the Run button and a confirmation page displays showing the status. Wait for it to finish, then check the Infrastructure list in Datadog.

Agent installation through EC2 Image Builder

Datadog publishes an EC2 Image Builder component for the Datadog Agent through the AWS Marketplace. Users can subscribe to the product and use the Image Builder component to build a custom AMI.

Follow these steps to create a custom Amazon Machine Image with the Datadog Agent and provision EC2 instances with a pre-installed Datadog Agent.

For the initial release, the component was tested with Amazon Linux 2023. It should work with any Linux distribution that supports the Datadog Agent.

Create a subscription
  1. Navigate to the EC2 Image Builder console and go to ‘Discover products’.
  2. Select the Components tab and search for Datadog Agent.
  3. Click View subscription options and follow the prompts to create a subscription.

See Managing AWS Marketplace Subscriptions for more details.

Create an image recipe
  1. Navigate to the Image recipes in the EC2 Image Builder console.
  2. Create a new recipe using the following settings:
    • Base image - arn:aws:imagebuilder:us-east-1:aws:image/amazon-linux-2023-x86/x.x.x.
    • Component - arn:aws:imagebuilder:us-east-1:aws-marketplace:component/datadog-agent-for-linux-prod-wwo2b4p7dgrkk/0.1.0/1
    • Optionally, configure component parameters. This document assumes defaults are used.
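
For teams that script recipe creation, the settings above map onto the boto3 Image Builder client's create_image_recipe call. The sketch below only assembles the request; the recipe name and version in the usage are hypothetical, and the ARNs are copied verbatim from the settings above (including the x.x.x version placeholder).

```python
BASE_IMAGE_ARN = (
    "arn:aws:imagebuilder:us-east-1:aws:image/amazon-linux-2023-x86/x.x.x"
)
DATADOG_COMPONENT_ARN = (
    "arn:aws:imagebuilder:us-east-1:aws-marketplace:component/"
    "datadog-agent-for-linux-prod-wwo2b4p7dgrkk/0.1.0/1"
)


def build_recipe_request(name, semantic_version, parameters=None):
    """Build keyword arguments for the Image Builder CreateImageRecipe API."""
    component = {"componentArn": DATADOG_COMPONENT_ARN}
    if parameters:
        # Component parameters are passed as name/value pairs; each value is a list.
        component["parameters"] = [
            {"name": key, "value": [value]} for key, value in parameters.items()
        ]
    return {
        "name": name,
        "semanticVersion": semantic_version,
        "parentImage": BASE_IMAGE_ARN,
        "components": [component],
    }


def create_recipe(name, semantic_version, parameters=None):
    """Create the recipe. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the request builder works without it

    client = boto3.client("imagebuilder")
    return client.create_image_recipe(
        **build_recipe_request(name, semantic_version, parameters)
    )
```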

See EC2 Image Builder Recipes for more details.

Create an image pipeline and build the image

Prerequisites:

  • The default role EC2InstanceProfileForImageBuilder requires the following additional permissions:
    • imagebuilder:GetMarketplaceResource to get the Datadog Agent component from Marketplace.
    • secretsmanager:GetSecretValue to retrieve API and application keys stored in the secrets store.
  • Create a secret named mp-ib-datadog-agent-secret that stores Datadog API and application keys mapped to dd-api-key and dd-app-key respectively.
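
The two prerequisites can be sketched as data. The snippet below builds the JSON string to store in the mp-ib-datadog-agent-secret secret and an IAM policy document with the two additional permissions. The helper names are hypothetical, and the default Resource of "*" is an assumption made for brevity; where possible, pass the ARN of your secret instead.

```python
import json


def build_secret_string(api_key, app_key):
    """Return the secret value: a JSON object keyed by dd-api-key and dd-app-key."""
    return json.dumps({"dd-api-key": api_key, "dd-app-key": app_key})


def build_extra_policy(secret_arn="*"):
    """Return a policy document with the two additional permissions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": "imagebuilder:GetMarketplaceResource",
                "Resource": "*",  # assumption: not scoped to a specific resource
            },
            {
                "Effect": "Allow",
                "Action": "secretsmanager:GetSecretValue",
                "Resource": secret_arn,  # narrow this to your secret's ARN
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_extra_policy(), indent=2))
```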

Proceed to pipeline creation and image build:

  1. Navigate to the Image pipelines in the EC2 Image Builder console.
  2. Create a pipeline for the recipe. This is a multi-step wizard; the following covers the simplest scenario:
    • Step 1 - Provide pipeline name and set build schedule to manual.
    • Step 2 - Choose the recipe created in the previous section.
    • Step 3 - Leave defaults.
    • Step 4 - Leave the default option to use the role EC2InstanceProfileForImageBuilder with additional policies attached.
    • Step 5 - Leave defaults.
    • Step 6 - Review and create.
  3. Navigate to the newly created pipeline and run it.
  4. After the pipeline finishes, a summary shows the new image ARN.
  5. If you have set up your mp-ib-datadog-agent-secret secret correctly, the Datadog Agent starts reporting metrics shortly after the EC2 instance starts with the image.

See EC2 Image Builder Pipelines for more details.

Component parameters

The Agent can be customized using the following parameters in the recipe:

  • DD_SITE - site to send telemetry data to. Default: datadoghq.com.
  • HOST_TAGS - host tags. Default: installer:ec2_image_builder.
  • SM_SECRET_NAME - name of the secret for storing API and application keys. Default: mp-ib-datadog-agent-secret.
  • SM_API_KEY - key to look up API key in the secret. Default: dd-api-key.
  • SM_APP_KEY - key to look up application key in the secret. Default: dd-app-key.
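
As an illustration of how the defaults combine with overrides, the sketch below merges user-supplied values over the documented defaults and rejects unknown names. The function name is hypothetical. The name SM_APP_KEY for the application key lookup is an assumption inferred from its description and default value; confirm the exact name in the component's parameter list.

```python
# Defaults as documented in the parameter list.
DEFAULTS = {
    "DD_SITE": "datadoghq.com",
    "HOST_TAGS": "installer:ec2_image_builder",
    "SM_SECRET_NAME": "mp-ib-datadog-agent-secret",
    "SM_API_KEY": "dd-api-key",
    "SM_APP_KEY": "dd-app-key",  # assumed parameter name, see the note above
}


def resolve_parameters(overrides=None):
    """Return the effective parameters: defaults with overrides applied."""
    overrides = overrides or {}
    unknown = sorted(set(overrides) - set(DEFAULTS))
    if unknown:
        raise ValueError(f"Unknown component parameters: {unknown}")
    return {**DEFAULTS, **overrides}


if __name__ == "__main__":
    # Example: send telemetry to the EU site and keep every other default.
    print(resolve_parameters({"DD_SITE": "datadoghq.eu"}))
```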

Log collection

Use the Datadog Agent or another log shipper to send your logs to Datadog.

Data Collected

Metrics

aws.ec2.cpucredit_balance
(gauge)
Number of CPU credits that an instance has accumulated.
Shown as unit
aws.ec2.cpucredit_usage
(gauge)
Number of CPU credits consumed.
Shown as unit
aws.ec2.cpusurplus_credit_balance
(gauge)
The number of surplus credits that have been spent by an unlimited instance when its CPUCreditBalance value is zero.
Shown as unit
aws.ec2.cpusurplus_credits_charged
(gauge)
The number of spent surplus credits that are not paid down by earned CPU credits, and which thus incur an additional charge.
Shown as unit
aws.ec2.cpuutilization
(gauge)
Average percentage of allocated EC2 compute units that are currently in use on the instance.
Shown as percent
aws.ec2.cpuutilization.maximum
(gauge)
Maximum percentage of allocated EC2 compute units that are currently in use on the instance.
Shown as percent
aws.ec2.disk_read_bytes
(gauge)
Bytes read from all ephemeral disks available to the instance.
Shown as byte
aws.ec2.disk_read_ops
(gauge)
Completed read operations from all ephemeral disks available to the instance.
Shown as operation
aws.ec2.disk_write_bytes
(gauge)
Bytes written to all ephemeral disks available to the instance.
Shown as byte
aws.ec2.disk_write_ops
(gauge)
Completed write operations to all ephemeral disks available to the instance.
Shown as operation
aws.ec2.ebsbyte_balance
(gauge)
Percentage of throughput credits remaining in the burst bucket for Nitro-based instances.
Shown as percent
aws.ec2.ebsiobalance
(gauge)
Percentage of I/O credits remaining in the burst bucket for Nitro-based instances.
Shown as percent
aws.ec2.ebsread_bytes
(gauge)
Average bytes read from all EBS volumes attached to the instance for Nitro-based instances.
Shown as byte
aws.ec2.ebsread_bytes.sum
(gauge)
Total bytes read from all EBS volumes attached to the instance for Nitro-based instances.
Shown as byte
aws.ec2.ebsread_ops
(count)
Average completed read operations from all Amazon EBS volumes attached to the instance for Nitro-based instances.
Shown as operation
aws.ec2.ebsread_ops.sum
(count)
Total completed read operations from all Amazon EBS volumes attached to the instance for Nitro-based instances.
Shown as operation
aws.ec2.ebswrite_bytes
(gauge)
Average bytes written to all EBS volumes attached to the instance for Nitro-based instances.
Shown as byte
aws.ec2.ebswrite_bytes.sum
(gauge)
Total bytes written to all EBS volumes attached to the instance for Nitro-based instances.
Shown as byte
aws.ec2.ebswrite_ops
(gauge)
Average completed write operations to all EBS volumes attached to the instance for Nitro-based instances.
Shown as operation
aws.ec2.ebswrite_ops.sum
(gauge)
Total completed write operations to all EBS volumes attached to the instance for Nitro-based instances.
Shown as operation
aws.ec2.host_ok
(gauge)
1 if the instance's system status is ok.
aws.ec2.instance_age
(gauge)
Time since instance launch.
Shown as second
aws.ec2.network_address_usage
(gauge)
The maximum number of NAU units for a VPC.
Shown as unit
aws.ec2.network_address_usage_peered
(gauge)
The maximum number of NAU units for a VPC and all of its peered VPCs.
Shown as unit
aws.ec2.network_in
(gauge)
Average number of bytes received on all network interfaces by the instance.
Shown as byte
aws.ec2.network_in.maximum
(gauge)
Maximum number of bytes received on all network interfaces by the instance.
Shown as byte
aws.ec2.network_out
(gauge)
Average number of bytes sent out on all network interfaces by the instance.
Shown as byte
aws.ec2.network_out.maximum
(gauge)
Maximum number of bytes sent out on all network interfaces by the instance.
Shown as byte
aws.ec2.network_packets_in
(gauge)
Number of packets received on all network interfaces by the instance.
Shown as packet
aws.ec2.network_packets_out
(gauge)
Number of packets sent out on all network interfaces by the instance.
Shown as packet
aws.ec2.status_check_failed
(gauge)
1 if one of the status checks failed.
aws.ec2.status_check_failed_instance
(gauge)
0 if the instance has passed the EC2 instance status check.
aws.ec2.status_check_failed_system
(gauge)
0 if the instance has passed the EC2 system status check.

Each of the metrics retrieved from AWS is assigned the same tags that appear in the AWS console, including but not limited to host name and security-groups.

Notes:

  • aws.ec2.instance_age is not collected by default with the Datadog - EC2 integration. Contact Datadog support to enable this metric collection.
  • aws.ec2.host_ok is collected by default, even if you disable metric collection for the Amazon EC2 integration, and can lead to unexpected hosts appearing in the infrastructure list. To ensure only desired hosts are monitored, assign an AWS tag, such as datadog:true, to those EC2 instances. Then specify that tag in the Limit metric collection to specific resources textbox under the Metric Collection tab in your Datadog AWS integration page.

Service Checks

aws.ec2.host_status
Returns your EC2 instance statuses as reported by the AWS console. Returns CRITICAL when there is a problem with your instance. Returns UNKNOWN when AWS does not have sufficient data to run a status check. Returns OK when your instance is running or is shut down properly.
Statuses: ok, critical, unknown

Out-of-the-box monitoring

The Amazon EC2 integration provides ready-to-use monitoring capabilities to monitor and optimize performance.

Troubleshooting

Need help? Contact Datadog support.
