Dead Letter Queues

Data Streams Monitoring (DSM) provides visibility into your non-empty dead letter queues (DLQs), enabling you to monitor and inspect message processing failures. DSM also enables you to remediate these message processing failures directly within Datadog.

Monitoring dead letter queues is available for Amazon SQS queues.

Monitor DLQs

Setup

Usage

Create a monitor for a dead letter queue

To track whether your queue is rerouting messages to its DLQ, create a metric monitor that alerts on the data_streams.sqs.dead_letter_queue.messages metric.

To create a monitor for a queue’s DLQ:

  1. In Datadog, navigate to Data Streams Monitoring.
  2. Select the Explore tab (default).
  3. Click on a supported queue to open its side panel.
  4. Select the Dead Letter Queue tab.
  5. Click Create Monitor to open a monitor setup page. The default inputs are sufficient to create a monitor that alerts when your DLQ is non-empty. You can adjust the other settings on this page as needed.
  6. Click Create at the bottom of the page.
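For teams that keep monitors as code, the monitor produced by the steps above can be approximated with a definition like the following. This is a minimal sketch, not the exact monitor that the Create Monitor button generates: only the metric name comes from this page, while the `max` aggregation, the `last_5m` window, the monitor name, and the message text are assumptions chosen for illustration. The helper `build_dlq_monitor` is hypothetical.

```python
import json

# The metric name comes from this page; every other value is an assumption.
DLQ_METRIC = "data_streams.sqs.dead_letter_queue.messages"


def build_dlq_monitor(scope: str = "*", window: str = "last_5m") -> dict:
    """Build a metric-monitor definition that alerts when a DLQ is non-empty.

    `scope` is a tag filter (for example a queue tag); the tag names that
    are available on this metric are not documented here, so "*" is used
    as the default.
    """
    # Alert as soon as the highest value seen in the window is above zero.
    query = f"max({window}):max:{DLQ_METRIC}{{{scope}}} > 0"
    return {
        "name": f"Non-empty dead letter queue ({scope})",  # hypothetical name
        "type": "metric alert",
        "query": query,
        "message": "Messages are being rerouted to the dead letter queue.",
        "options": {"thresholds": {"critical": 0}},
    }


if __name__ == "__main__":
    print(json.dumps(build_dlq_monitor(), indent=2))
```

The resulting dictionary has the shape of a metric monitor definition and can be adapted to whichever tool you use to manage monitors.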

Detect message processing issues

Data Streams Monitoring helps you detect where messages couldn’t be processed and what downstream services could be affected:

  • The DSM Service Map highlights queues with messages in their DLQs, helping you to visually identify where failures occur

  • The DSM Issues page lists all queues that are experiencing message processing issues

Remediate DLQ issues

You can inspect and resolve non-empty DLQs directly in Datadog by using Datadog Actions.

Setup

In Datadog, create a Connection. The Connection needs an IAM entity to perform the actions. This entity can be an IAM user (authenticated with a secret access key) or an IAM role (assumed through sts:AssumeRole), and it must have the following permissions:

  • sqs:ReceiveMessage (for peek)
  • sqs:StartMessageMoveTask (for redrive)
  • sqs:PurgeQueue (for purge)

These permissions can be applied globally to all SQS queues, or restricted to specific queues.
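As an illustration, the permissions listed above could be expressed as an IAM policy document like the one generated below. This is a sketch under stated assumptions: the statement ID, the helper name `build_dlq_actions_policy`, and the example queue ARN are hypothetical, and your account may need additional statements depending on how the queues are configured.

```python
import json

# The three actions named in the setup instructions above.
DLQ_ACTIONS = [
    "sqs:ReceiveMessage",        # peek
    "sqs:StartMessageMoveTask",  # redrive
    "sqs:PurgeQueue",            # purge
]


def build_dlq_actions_policy(queue_arns=None) -> dict:
    """Return an IAM policy document granting the DLQ actions.

    With no ARNs, the policy applies to all SQS queues ("*");
    otherwise it is restricted to the given queue ARNs.
    """
    resources = list(queue_arns) if queue_arns else ["*"]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DatadogDlqActions",  # hypothetical statement ID
                "Effect": "Allow",
                "Action": DLQ_ACTIONS,
                "Resource": resources,
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical queue ARN, used only for illustration.
    example_arn = "arn:aws:sqs:us-east-1:123456789012:orders-dlq"
    print(json.dumps(build_dlq_actions_policy([example_arn]), indent=2))
```

Calling `build_dlq_actions_policy()` with no arguments produces the global variant; passing a list of ARNs produces the restricted one.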

Usage

After you set up the connection, you can click on a supported queue to open its side panel, where you can use the following actions:

  • Peek to inspect failed message content and identify the root cause
  • Redrive to requeue messages for another processing attempt
  • Purge to clear messages that no longer need processing
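To make the three actions concrete, the sketch below shows the SQS API calls that the required permissions correspond to, written against a boto3-style SQS client. It is an assumption that Datadog Actions behaves this way internally; this page does not describe the implementation. The function names and the default of 10 messages are hypothetical, and `sqs` is any object that exposes the boto3 SQS client methods.

```python
def peek(sqs, queue_url: str, max_messages: int = 10) -> list:
    """Read message bodies from a DLQ without hiding them from other readers."""
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS returns at most 10 per call
        VisibilityTimeout=0,  # messages become visible again immediately
    )
    return [m["Body"] for m in resp.get("Messages", [])]


def redrive(sqs, dlq_arn: str) -> str:
    """Start moving messages from the DLQ back to their source queue.

    With no destination given, SQS redrives messages to the queue
    they originally came from. Returns the handle of the move task.
    """
    resp = sqs.start_message_move_task(SourceArn=dlq_arn)
    return resp["TaskHandle"]


def purge(sqs, queue_url: str) -> None:
    """Delete every message in the queue. This cannot be undone."""
    sqs.purge_queue(QueueUrl=queue_url)
```

With boto3 installed, a client created by `boto3.client("sqs")` can be passed as `sqs`. Note that peek uses sqs:ReceiveMessage, redrive uses sqs:StartMessageMoveTask, and purge uses sqs:PurgeQueue, which matches the permission list in the Setup section.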

Troubleshooting

If you are unable to see dead letter queue information:

  • Confirm that you have installed the Datadog-AWS integration
  • Confirm that your AWS role uses the AWS-managed AmazonSQSReadOnlyAccess policy
  • Confirm that your role has sqs:ListQueues and sqs:GetQueueAttributes permissions
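The permission check in the last item can be scripted against a policy document that you have exported from IAM. The following is a simplified sketch: the helper `missing_actions` is hypothetical, and it only looks at Allow statements and action wildcards. It does not evaluate Deny statements, NotAction, conditions, or resource restrictions, so treat its result as a first hint rather than a full policy evaluation.

```python
from fnmatch import fnmatchcase

# The two permissions named in the troubleshooting list above.
REQUIRED_ACTIONS = ("sqs:ListQueues", "sqs:GetQueueAttributes")


def missing_actions(policy: dict, required=REQUIRED_ACTIONS) -> list:
    """Return the required actions that no Allow statement in `policy` grants."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]

    allowed_patterns = []
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # "Action" may be a string or a list
            actions = [actions]
        # IAM action names are case-insensitive, so compare in lowercase.
        allowed_patterns.extend(a.lower() for a in actions)

    return [
        action
        for action in required
        if not any(fnmatchcase(action.lower(), p) for p in allowed_patterns)
    ]


if __name__ == "__main__":
    example_policy = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": "sqs:List*", "Resource": "*"}
        ],
    }
    print(missing_actions(example_policy))
```

An empty list means that both actions are matched by at least one Allow statement in the document that was checked.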