Datadog-AWS Lambda Integration

Overview

AWS Lambda is a compute service that runs code in response to events and automatically manages the compute resources required by that code.

Enable this integration to begin collecting CloudWatch and custom metrics from your Lambda functions.

Setup

Installation

If you haven’t already, set up the Amazon Web Services integration.

Configuration

In the Amazon Web Services integration tile, ensure that Lambda is checked under metric collection.

The following permissions are also required to use the Lambda integration:

  • lambda:List*: List Lambda functions, metadata, and tags.
  • logs:DescribeLogGroups: List available log groups.
  • logs:DescribeLogStreams: List available log streams for a group.
  • logs:FilterLogEvents: Fetch specific log events for a stream to generate metrics.
  • tag:GetResources: Get custom tags applied to Lambda functions.
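
As a rough illustration, the permissions listed above could be gathered into an IAM policy document. The sketch below builds one as a Python dictionary and prints it as JSON. The helper name build_policy_document and the wildcard "Resource" are assumptions made for this example, not part of the integration; narrow the resource scope to suit your account.

```python
import json

# Actions taken from the permission list above.
LAMBDA_INTEGRATION_ACTIONS = [
    "lambda:List*",
    "logs:DescribeLogGroups",
    "logs:DescribeLogStreams",
    "logs:FilterLogEvents",
    "tag:GetResources",
]


def build_policy_document(actions=LAMBDA_INTEGRATION_ACTIONS):
    """Return an IAM policy document (as a dict) that allows the given actions.

    Hypothetical helper for illustration; "Resource": "*" is an assumption.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": "*"}
        ],
    }


if __name__ == "__main__":
    # Print the policy as JSON so it can be pasted into the IAM console.
    print(json.dumps(build_policy_document(), indent=2))
```

The printed JSON can be attached to the role or user that the Amazon Web Services integration uses.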

To send custom metrics to Datadog, print a log line from your Lambda function using the following format:

MONITORING|<unix_epoch_timestamp>|<value>|<metric_type>|<metric_name>|#<tag_list>

Where:

  • MONITORING signals to the Datadog integration that it should collect this log entry

  • <unix_epoch_timestamp> is in seconds, not milliseconds

  • <value> MUST be a number (i.e. integer or float)

  • <metric_type> is count, gauge, histogram, or check

  • <metric_name> uniquely identifies your metric and adheres to the metric naming policy

  • <tag_list> is optional, comma-separated, and must be preceded by #.
    The tag function_name:<name_of_the_function> is automatically applied to custom metrics

Sample snippets (in Python):

Count/Gauge
import time

unix_epoch_timestamp = int(time.time())  # seconds, not milliseconds
value = 42
metric_type = 'count'
metric_name = 'my.metric.name'
tags = ['tag1:value', 'tag2']

print('MONITORING|{0}|{1}|{2}|{3}|#{4}'.format(
    unix_epoch_timestamp, value, metric_type, metric_name, ','.join(tags)
))

Histogram
import time

unix_epoch_timestamp = int(time.time())
metric_type = 'histogram'
metric_name = 'my.metric.name.hist'
tags = ['tag1:value', 'tag2']

# Emit one log line per sample value (0 through 9).
for i in range(10):
    print('MONITORING|{0}|{1}|{2}|{3}|#{4}'.format(
        unix_epoch_timestamp, i, metric_type, metric_name, ','.join(tags)
    ))

Using the histogram metric type provides avg, count, max, min, 95p, and median values. These values are calculated at one-second granularity.
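
To make these aggregates concrete, the sketch below computes them locally for the ten sample values (0 through 9) emitted by the histogram snippet. It only illustrates what the names mean. The summarize function is hypothetical, and the nearest-rank method used for the 95th percentile is an assumption; Datadog's own calculation may differ.

```python
import math
import statistics


def summarize(values):
    """Compute avg, count, max, min, 95p, and median for a batch of samples."""
    ordered = sorted(values)
    # Nearest-rank 95th percentile: an assumption made for illustration only.
    rank = max(1, math.ceil(0.95 * len(ordered)))
    return {
        "avg": sum(ordered) / len(ordered),
        "count": len(ordered),
        "max": ordered[-1],
        "min": ordered[0],
        "95p": ordered[rank - 1],
        "median": statistics.median(ordered),
    }


# The same ten values the histogram snippet logs within a single second.
summary = summarize(range(10))
print(summary)
```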

Service Check
import time

unix_epoch_timestamp = int(time.time())
value = 1 # WARNING
metric_type = 'check'
metric_name = 'my.metric.name.check'

print('MONITORING|{0}|{1}|{2}|{3}'.format(
    unix_epoch_timestamp, value, metric_type, metric_name
))
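
The snippets above repeat the same formatting step, so one possible consolidation is a small helper that validates its inputs against the rules listed earlier and returns the log line. This is a minimal sketch: format_metric_line and lambda_handler are hypothetical names chosen for this example and are not part of any Datadog library.

```python
import time

VALID_METRIC_TYPES = ('count', 'gauge', 'histogram', 'check')


def format_metric_line(value, metric_type, metric_name, tags=None, timestamp=None):
    """Build a MONITORING log line in the format described above."""
    if metric_type not in VALID_METRIC_TYPES:
        raise ValueError('unsupported metric type: {0}'.format(metric_type))
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError('value must be an integer or a float')
    if timestamp is None:
        timestamp = int(time.time())  # seconds, not milliseconds
    line = 'MONITORING|{0}|{1}|{2}|{3}'.format(
        timestamp, value, metric_type, metric_name
    )
    # The tag list is optional; when present it must be preceded by '#'.
    if tags:
        line += '|#' + ','.join(tags)
    return line


def lambda_handler(event, context):
    # Hypothetical handler: printing the line writes it to the function's logs.
    print(format_metric_line(42, 'count', 'my.metric.name', ['tag1:value', 'tag2']))
    return {'status': 'ok'}
```

Passing an explicit timestamp keeps the output deterministic, which is convenient when unit testing the helper outside of Lambda.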

Data Collected

Metrics

aws.lambda.duration
(gauge)
Measures the average elapsed wall clock time from when the function code starts executing as a result of an invocation to when it stops executing.
shown as millisecond
aws.lambda.duration.maximum
(gauge)
Measures the maximum elapsed wall clock time from when the function code starts executing as a result of an invocation to when it stops executing.
shown as millisecond
aws.lambda.duration.minimum
(gauge)
Measures the minimum elapsed wall clock time from when the function code starts executing as a result of an invocation to when it stops executing.
shown as millisecond
aws.lambda.duration.sum
(gauge)
Measures the total execution time of the Lambda function.
shown as millisecond
aws.lambda.errors
(count)
Measures the number of invocations that failed due to errors in the function (response code 4XX).
aws.lambda.invocations
(count)
Measures the number of times a function is invoked in response to an event or invocation API call.
aws.lambda.throttles
(count)
Measures the number of Lambda function invocation attempts that were throttled because invocation rates exceeded the customer’s concurrency limits (error code 429). Failed invocations may trigger a retry attempt that succeeds.

The metrics above get tagged in Datadog with any tags from AWS, including (but not limited to) function name, security-groups, and more.

Custom metrics only get tagged with function name.