Amazon Simple Workflow Service

Overview

Amazon SWF helps developers build, run, and scale background jobs that have parallel or sequential steps.

Enable this integration to see all your SWF metrics in Datadog.

Setup

Installation

If you haven’t already, set up the Amazon Web Services integration first.

Metric collection

  1. On the AWS integration page, ensure that SWF is enabled under the Metric Collection tab.
  2. Install the Datadog - Amazon SWF integration.

Log collection

Enable logging

Configure Amazon SWF to send logs either to an S3 bucket or to CloudWatch.

Note: If you log to an S3 bucket, make sure that amazon_swf is set as the target prefix.

Send logs to Datadog

  1. If you haven’t already, set up the Datadog Forwarder Lambda function.

  2. Once the Lambda function is installed, manually add a trigger in the AWS console on the S3 bucket or CloudWatch log group that contains your Amazon SWF logs.
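
For the CloudWatch case, the same trigger can in principle be created programmatically as a subscription filter on the log group. The following is a minimal sketch, not the documented procedure: the function name `subscribe_forwarder`, the filter name, and the log group and ARN in the usage comment are hypothetical, and it assumes the Forwarder Lambda already grants CloudWatch Logs permission to invoke it.

```python
def subscribe_forwarder(logs_client, log_group_name, forwarder_arn,
                        filter_name="datadog-forwarder"):
    """Subscribe a Lambda function to a CloudWatch log group.

    logs_client is expected to behave like boto3.client("logs").
    Returns the parameters that were sent, for inspection.
    """
    params = {
        "logGroupName": log_group_name,
        "filterName": filter_name,   # hypothetical name, choose your own
        "filterPattern": "",         # an empty pattern matches every log event
        "destinationArn": forwarder_arn,
    }
    logs_client.put_subscription_filter(**params)
    return params


# Usage (requires boto3 and AWS credentials; values are placeholders):
#   import boto3
#   subscribe_forwarder(
#       boto3.client("logs"),
#       "/hypothetical/swf-logs",
#       "arn:aws:lambda:us-east-1:123456789012:function:datadog-forwarder",
#   )
```

Passing the client in as an argument keeps the helper easy to try against a stub before pointing it at a real account.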

Data Collected

Metrics

aws.swf.activity_task_schedule_to_close_time
(gauge)
The time interval, in milliseconds, between the time when the activity was scheduled and when it closed.
Shown as millisecond
aws.swf.activity_task_schedule_to_close_time.maximum
(gauge)
Maximum time interval, in milliseconds, between the time when the activity was scheduled and when it closed.
Shown as millisecond
aws.swf.activity_task_schedule_to_close_time.minimum
(gauge)
Minimum time interval, in milliseconds, between the time when the activity was scheduled and when it closed.
Shown as millisecond
aws.swf.activity_task_schedule_to_start_time
(gauge)
The time interval, in milliseconds, between the time when the activity task was scheduled and when it started.
Shown as millisecond
aws.swf.activity_task_schedule_to_start_time.maximum
(gauge)
Maximum time interval, in milliseconds, between the time when the activity task was scheduled and when it started.
Shown as millisecond
aws.swf.activity_task_schedule_to_start_time.minimum
(gauge)
Minimum time interval, in milliseconds, between the time when the activity task was scheduled and when it started.
Shown as millisecond
aws.swf.activity_task_start_to_close_time
(gauge)
The time interval, in milliseconds, between the time that the activity task was started and the time it was closed.
Shown as millisecond
aws.swf.activity_task_start_to_close_time.maximum
(gauge)
Maximum time interval, in milliseconds, between the time that the activity task was started and the time it was closed.
Shown as millisecond
aws.swf.activity_task_start_to_close_time.minimum
(gauge)
Minimum time interval, in milliseconds, between the time that the activity task was started and the time it was closed.
Shown as millisecond
aws.swf.activity_tasks_canceled
(count)
The count of activity tasks that were canceled.
aws.swf.activity_tasks_completed
(count)
The count of activity tasks that completed.
aws.swf.activity_tasks_failed
(count)
The count of activity tasks that failed.
aws.swf.consumed_capacity
(count)
The count of requests per second.
Shown as request
aws.swf.decision_task_schedule_to_start_time
(gauge)
The time interval, in milliseconds, between the time that the decision task was scheduled and the time it was picked up by a worker and started.
Shown as millisecond
aws.swf.decision_task_schedule_to_start_time.maximum
(gauge)
Maximum time interval, in milliseconds, between the time that the decision task was scheduled and the time it was picked up by a worker and started.
Shown as millisecond
aws.swf.decision_task_schedule_to_start_time.minimum
(gauge)
Minimum time interval, in milliseconds, between the time that the decision task was scheduled and the time it was picked up by a worker and started.
Shown as millisecond
aws.swf.decision_task_start_to_close_time
(gauge)
The time interval, in milliseconds, between the time that the decision task was started and the time it was closed.
aws.swf.decision_task_start_to_close_time.maximum
(gauge)
Maximum time interval, in milliseconds, between the time that the decision task was started and the time it was closed.
aws.swf.decision_task_start_to_close_time.minimum
(gauge)
Minimum time interval, in milliseconds, between the time that the decision task was started and the time it was closed.
aws.swf.decision_tasks_completed
(count)
The count of decision tasks that have been completed.
aws.swf.pending_tasks
(count)
The count of pending tasks in a 1-minute interval for a specific task list.
Shown as task
aws.swf.provisioned_bucket_size
(count)
The count of available requests per second.
Shown as request
aws.swf.provisioned_refill_rate
(count)
The count of requests per second that are allowed into the bucket.
Shown as request
aws.swf.scheduled_activity_tasks_timed_out_on_close
(count)
The count of activity tasks that were scheduled but timed out on close.
aws.swf.scheduled_activity_tasks_timed_out_on_start
(count)
The count of activity tasks that were scheduled but timed out on start.
aws.swf.started_activity_tasks_timed_out_on_close
(count)
The count of activity tasks that were started but timed out on close.
aws.swf.started_activity_tasks_timed_out_on_heartbeat
(count)
The count of activity tasks that were started but timed out due to a heartbeat timeout.
aws.swf.started_decision_tasks_timed_out_on_close
(count)
The count of decision tasks that were started but timed out on close.
aws.swf.throttled_events
(count)
The count of requests that have been throttled.
Shown as request
aws.swf.workflow_start_to_close_time
(gauge)
The time, in milliseconds, between the time the workflow started and the time it closed.
Shown as millisecond
aws.swf.workflow_start_to_close_time.maximum
(gauge)
Maximum time, in milliseconds, between the time the workflow started and the time it closed.
Shown as millisecond
aws.swf.workflow_start_to_close_time.minimum
(gauge)
Minimum time, in milliseconds, between the time the workflow started and the time it closed.
Shown as millisecond
aws.swf.workflows_canceled
(count)
The count of workflows that were canceled.
aws.swf.workflows_completed
(count)
The count of workflows that were completed.
aws.swf.workflows_continued_as_new
(count)
The count of workflows that continued as new.
aws.swf.workflows_failed
(count)
The count of workflows that failed.
aws.swf.workflows_terminated
(count)
The count of workflows that were terminated.
aws.swf.workflows_timed_out
(count)
The count of workflows that timed out, for any reason.
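
The throttling metrics above (aws.swf.provisioned_bucket_size, aws.swf.provisioned_refill_rate, aws.swf.consumed_capacity, and aws.swf.throttled_events) are easier to read with a token bucket in mind. The sketch below is a generic token bucket for illustration only, under the assumption that one request costs one token; it is not Amazon SWF's actual implementation, and the class and field names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    bucket_size: float   # maximum burst; compare provisioned_bucket_size
    refill_rate: float   # tokens added per second; compare provisioned_refill_rate
    tokens: float = field(init=False, default=0.0)
    last_refill: float = field(init=False, default=0.0)
    consumed: int = field(init=False, default=0)    # compare consumed_capacity
    throttled: int = field(init=False, default=0)   # compare throttled_events

    def __post_init__(self):
        # The bucket starts full.
        self.tokens = self.bucket_size

    def try_request(self, now: float) -> bool:
        """Spend one token at time `now` (seconds); return False if throttled."""
        elapsed = max(0.0, now - self.last_refill)
        # Refill for the elapsed time, never beyond the bucket size.
        self.tokens = min(self.bucket_size, self.tokens + elapsed * self.refill_rate)
        self.last_refill = now
        if self.tokens >= 1:
            self.tokens -= 1
            self.consumed += 1
            return True
        self.throttled += 1
        return False


bucket = TokenBucket(bucket_size=2, refill_rate=1)
burst = [bucket.try_request(0.0) for _ in range(3)]  # third request exceeds the burst
later = bucket.try_request(1.0)                      # one token refilled after 1 second
```

In this model, a rising throttled count alongside consumed requests near the refill rate suggests that callers are exceeding the sustained rate rather than only the burst size.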

Each of the metrics retrieved from AWS is assigned the same tags that appear in the AWS console, including but not limited to host name and security groups.
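
These tags can be used to scope a metric query. As a small sketch, the helper below assembles a Datadog query string of the form `aggregator:metric{tag:value}`. The function name `build_metric_query` is hypothetical, and the `domain` tag key and its value are assumptions for illustration; check which tags your SWF metrics actually carry.

```python
def build_metric_query(metric, tags=None, aggregator="avg"):
    """Build a Datadog metric query string scoped by optional tags."""
    if tags:
        # Sort the tags so the resulting string is deterministic.
        scope = ",".join(f"{key}:{value}" for key, value in sorted(tags.items()))
    else:
        scope = "*"  # no tag filter: query across all sources
    return f"{aggregator}:{metric}{{{scope}}}"


# Average workflow duration for one (hypothetical) SWF domain.
duration_query = build_metric_query(
    "aws.swf.workflow_start_to_close_time", {"domain": "my-domain"}
)

# Total failed workflows across everything.
failed_query = build_metric_query("aws.swf.workflows_failed", aggregator="sum")
```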

Events

The Amazon SWF integration does not include any events.

Service Checks

The Amazon SWF integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.