Amazon Step Functions (States) enables you to coordinate the components of distributed applications and microservices using visual workflows.
Enable this integration to see all your Step Functions metrics in Datadog.
If you haven’t already, set up the Amazon Web Services integration first. Then, add the following permissions to the policy document for your AWS/Datadog Role:
states:ListStateMachines,
states:DescribeStateMachine
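As a rough illustration of the permissions above, the following sketch builds an IAM policy document containing the two actions and prints it as JSON. The helper name `build_policy_document` and the wildcard `Resource` are assumptions made for this example; in practice you would merge the statement into the existing policy for your Datadog role rather than create a new one.

```python
import json

# The two permissions named in the text above.
STEP_FUNCTIONS_ACTIONS = [
    "states:ListStateMachines",
    "states:DescribeStateMachine",
]


def build_policy_document(actions, resource="*"):
    """Return an IAM policy document that allows the given actions.

    The wildcard resource is an assumption for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": resource,
            }
        ],
    }


policy = build_policy_document(STEP_FUNCTIONS_ACTIONS)
print(json.dumps(policy, indent=2))
```

The printed JSON can be compared against the role's current policy to check that both actions are present.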
Ensure that Step Functions (States) is checked under metric collection. If your state machines use AWS Lambda, also ensure that Lambda is checked.
If your Step Functions states are Lambda functions, installing this integration adds additional tags to your Lambda metrics. These tags show which state machines your Lambda functions belong to, and you can visualize this on the Serverless page.
Configure Amazon Step Functions to send logs either to an S3 bucket or to CloudWatch.
Note: If you log to an S3 bucket, make sure that amazon_step_functions is set as Target prefix.
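For the CloudWatch option, a state machine's logging settings are expressed as a `loggingConfiguration` structure. The sketch below builds that structure as a plain dictionary; the helper name, the account ID, and the log group ARN are placeholders chosen for illustration, not values from this integration.

```python
import json

# Log levels accepted by the Step Functions logging configuration.
ALLOWED_LEVELS = {"ALL", "ERROR", "FATAL", "OFF"}


def build_logging_configuration(log_group_arn, level="ALL", include_execution_data=False):
    """Build a Step Functions loggingConfiguration dict for one CloudWatch log group."""
    if level not in ALLOWED_LEVELS:
        raise ValueError(f"level must be one of {sorted(ALLOWED_LEVELS)}")
    return {
        "level": level,
        "includeExecutionData": include_execution_data,
        "destinations": [
            {"cloudWatchLogsLogGroup": {"logGroupArn": log_group_arn}}
        ],
    }


# Placeholder ARN: replace the region, account ID, and log group name with your own.
example_arn = "arn:aws:logs:us-east-1:123456789012:log-group:/aws/vendedlogs/states/example:*"
logging_configuration = build_logging_configuration(example_arn)
print(json.dumps(logging_configuration, indent=2))
```

A dictionary of this shape could then be passed as the `loggingConfiguration` argument when creating or updating a state machine, for example through the AWS SDK.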
If you haven’t already, set up the Datadog log collection AWS Lambda function.
Once the Lambda function is installed, manually add a trigger in the AWS console on the S3 bucket or CloudWatch log group that contains your Amazon Step Functions logs.
To enable distributed tracing for your AWS Step Functions:
aws.states.execution_time (gauge) | The average time interval, in milliseconds, between the time the execution started and the time it closed. Shown as millisecond |
aws.states.execution_time.maximum (gauge) | The maximum time interval, in milliseconds, between the time the execution started and the time it closed. Shown as millisecond |
aws.states.execution_time.minimum (gauge) | The minimum time interval, in milliseconds, between the time the execution started and the time it closed. Shown as millisecond |
aws.states.execution_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, between the time the execution started and the time it closed. Shown as millisecond |
aws.states.execution_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, between the time the execution started and the time it closed. Shown as millisecond |
aws.states.executions_aborted (count) | The number of executions that were aborted/terminated. |
aws.states.execution_throttled (count) | The number of StateEntered events and retries that were throttled. |
aws.states.executions_failed (count) | The number of executions that failed. |
aws.states.executions_started (count) | The number of executions started. |
aws.states.executions_succeeded (count) | The number of executions that completed successfully. |
aws.states.executions_timed_out (count) | The number of executions that timed out for any reason. |
aws.states.lambda_function_run_time (gauge) | The average time interval, in milliseconds, between the time the lambda function was started and when it was closed. Shown as millisecond |
aws.states.lambda_function_run_time.maximum (gauge) | The maximum time interval, in milliseconds, between the time the lambda function was started and when it was closed. Shown as millisecond |
aws.states.lambda_function_run_time.minimum (gauge) | The minimum time interval, in milliseconds, between the time the lambda function was started and when it was closed. Shown as millisecond |
aws.states.lambda_function_run_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, between the time the lambda function was started and when it was closed. Shown as millisecond |
aws.states.lambda_function_run_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, between the time the lambda function was started and when it was closed. Shown as millisecond |
aws.states.lambda_function_schedule_time (gauge) | The average time interval, in milliseconds, that the lambda function stayed in the schedule state. Shown as millisecond |
aws.states.lambda_function_schedule_time.maximum (gauge) | The maximum time interval, in milliseconds, that the lambda function stayed in the schedule state. Shown as millisecond |
aws.states.lambda_function_schedule_time.minimum (gauge) | The minimum time interval, in milliseconds, that the lambda function stayed in the schedule state. Shown as millisecond |
aws.states.lambda_function_schedule_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, that the lambda function stayed in the schedule state. Shown as millisecond |
aws.states.lambda_function_schedule_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, that the lambda function stayed in the schedule state. Shown as millisecond |
aws.states.lambda_function_time (gauge) | The average time interval, in milliseconds, between the time the lambda function was scheduled and when it was closed. Shown as millisecond |
aws.states.lambda_function_time.maximum (gauge) | The maximum time interval, in milliseconds, between the time the lambda function was scheduled and when it was closed. Shown as millisecond |
aws.states.lambda_function_time.minimum (gauge) | The minimum time interval, in milliseconds, between the time the lambda function was scheduled and when it was closed. Shown as millisecond |
aws.states.lambda_function_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, between the time the lambda function was scheduled and when it was closed. Shown as millisecond |
aws.states.lambda_function_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, between the time the lambda function was scheduled and when it was closed. Shown as millisecond |
aws.states.lambda_functions_failed (count) | The number of lambda functions that failed. |
aws.states.lambda_functions_heartbeat_timed_out (count) | The number of lambda functions that were timed out due to a heartbeat timeout. |
aws.states.lambda_functions_scheduled (count) | The number of lambda functions that were scheduled. |
aws.states.lambda_functions_started (count) | The number of lambda functions that were started. |
aws.states.lambda_functions_succeeded (count) | The number of lambda functions that completed successfully. |
aws.states.lambda_functions_timed_out (count) | The number of lambda functions that were timed out on close. |
aws.states.activity_run_time (gauge) | The average time interval, in milliseconds, between the time the activity was started and when it was closed. Shown as millisecond |
aws.states.activity_run_time.maximum (gauge) | The maximum time interval, in milliseconds, between the time the activity was started and when it was closed. Shown as millisecond |
aws.states.activity_run_time.minimum (gauge) | The minimum time interval, in milliseconds, between the time the activity was started and when it was closed. Shown as millisecond |
aws.states.activity_run_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, between the time the activity was started and when it was closed. Shown as millisecond |
aws.states.activity_run_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, between the time the activity was started and when it was closed. Shown as millisecond |
aws.states.activity_schedule_time (gauge) | The average time interval, in milliseconds, that the activity stayed in the schedule state. Shown as millisecond |
aws.states.activity_schedule_time.maximum (gauge) | The maximum time interval, in milliseconds, that the activity stayed in the schedule state. Shown as millisecond |
aws.states.activity_schedule_time.minimum (gauge) | The minimum time interval, in milliseconds, that the activity stayed in the schedule state. Shown as millisecond |
aws.states.activity_schedule_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, that the activity stayed in the schedule state. Shown as millisecond |
aws.states.activity_schedule_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, that the activity stayed in the schedule state. Shown as millisecond |
aws.states.activity_time (gauge) | The average time interval, in milliseconds, between the time the activity was scheduled and when it was closed. Shown as millisecond |
aws.states.activity_time.maximum (gauge) | The maximum time interval, in milliseconds, between the time the activity was scheduled and when it was closed. Shown as millisecond |
aws.states.activity_time.minimum (gauge) | The minimum time interval, in milliseconds, between the time the activity was scheduled and when it was closed. Shown as millisecond |
aws.states.activity_time.p95 (gauge) | The 95th percentile time interval, in milliseconds, between the time the activity was scheduled and when it was closed. Shown as millisecond |
aws.states.activity_time.p99 (gauge) | The 99th percentile time interval, in milliseconds, between the time the activity was scheduled and when it was closed. Shown as millisecond |
aws.states.activities_failed (count) | The number of activities that failed. |
aws.states.activities_heartbeat_timed_out (count) | The number of activities that were timed out due to a heartbeat timeout. |
aws.states.activities_scheduled (count) | The number of activities that were scheduled. |
aws.states.activities_started (count) | The number of activities that were started. |
aws.states.activities_succeeded (count) | The number of activities that completed successfully. |
aws.states.activities_timed_out (count) | The number of activities that were timed out on close. |
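The execution count metrics above can be combined into outcome rates. The following is a minimal sketch, assuming you have already summed each count over the same time window; the function name and the sample numbers are hypothetical and are not produced by the integration.

```python
def execution_outcome_rates(started, succeeded, failed, timed_out, aborted):
    """Express each execution outcome as a fraction of executions started.

    Arguments correspond to sums of aws.states.executions_started,
    executions_succeeded, executions_failed, executions_timed_out, and
    executions_aborted over one window. Returns None when nothing started.
    """
    if started <= 0:
        return None
    return {
        "succeeded": succeeded / started,
        "failed": failed / started,
        "timed_out": timed_out / started,
        "aborted": aborted / started,
    }


# Hypothetical sample counts for one window.
rates = execution_outcome_rates(started=200, succeeded=180, failed=10, timed_out=6, aborted=4)
print(rates)
```

Because an execution can start in one window and finish in a later one, rates computed over short windows are approximate.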
The Amazon Step Functions integration does not include any events.
The Amazon Step Functions integration does not include any service checks.
Need help? Contact Datadog support.