Overview

If you experience unexpected behavior with Datadog Observability Pipelines (OP), this guide covers a few common issues you can investigate to resolve the problem quickly. If you continue to have trouble, reach out to Datadog support for further assistance.

View Observability Pipelines Worker stats and logs

To view information about the Observability Pipelines Workers running for an active pipeline:

  1. Navigate to Observability Pipelines.
  2. Select your pipeline.
  3. Click the Workers tab to see the Workers’ memory and CPU utilization, traffic stats, and any errors.
  4. To view the Workers’ statuses and versions, click the Latest Deployment & Setup tab.
  5. To see the Workers’ logs, click the cog at the top right side of the page, then select View OPW Logs. See Logs Search Syntax for details on how to filter your logs. To see logs for a specific Worker, add @op_work.id:<worker_id> to the search query.

Inspect events sent through your pipeline to identify setup issues

If you can access your Observability Pipelines Workers locally, use the tap command to see the raw data sent through your pipeline’s source and processors.

Enable the Observability Pipelines Worker API

The Observability Pipelines Worker API lets you interact with the Worker’s processes using the tap command. If you are using the Helm charts provided when you set up a pipeline, the API is already enabled. Otherwise, make sure the environment variable DD_OP_API_ENABLED is set to true in /etc/observability-pipelines-worker/bootstrap.yaml. See Bootstrap options for more information. When enabled, the API listens on localhost port 8686, which is where the tap CLI expects it.
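One way to confirm that something is listening where the tap CLI expects the API is to attempt a TCP connection to localhost port 8686. The sketch below assumes bash, because it relies on bash's /dev/tcp redirection. The function name op_api_reachable is hypothetical and is not part of the Worker CLI. It only checks that the port accepts connections, not that the API itself is healthy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if something accepts TCP connections on host:port.
# Defaults match the address this guide says the tap CLI expects.
op_api_reachable() {
  local host="${1:-localhost}" port="${2:-8686}"
  # Redirecting to /dev/tcp/<host>/<port> makes bash attempt a TCP connection.
  # The subshell closes the descriptor again as soon as it exits.
  (exec 3<>"/dev/tcp/${host}/${port}") 2>/dev/null
}

if op_api_reachable localhost 8686; then
  echo "Port 8686 is accepting connections"
else
  echo "Nothing is listening on localhost:8686; check DD_OP_API_ENABLED"
fi
```

If the check fails on a host where the Worker is running, the API is most likely not enabled or is bound to a different address.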

Use top to find the component ID

You need the source’s or processor’s component ID to tap into it. Use the top command to find the ID of the component you want to tap into:

observability-pipelines-worker top

Use tap to see your data

If you are on the same host as the Worker, run the following command to tap the output of the component:

observability-pipelines-worker tap <component_ID>

If you are using a containerized environment, use the docker exec or kubectl exec command to open a shell in the container, then run the tap command above.
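As an alternative to opening an interactive shell first, docker exec and kubectl exec can also run a command directly in the container. The wrappers below are a minimal sketch of that approach. The function names are hypothetical, the container and pod names are whatever your deployment uses, and the sketch assumes the observability-pipelines-worker binary is on the PATH inside the container.

```shell
# Hypothetical wrappers around docker exec and kubectl exec.

tap_in_docker() {
  # $1 = container name or ID, $2 = component ID reported by the top command
  docker exec -it "$1" observability-pipelines-worker tap "$2"
}

tap_in_kubernetes() {
  # $1 = pod name, $2 = component ID reported by the top command
  # Add "-n <namespace>" after "exec" if the pod is not in the current namespace.
  kubectl exec -it "$1" -- observability-pipelines-worker tap "$2"
}
```

For example, `tap_in_docker opw my_source` would tap a component named my_source in a container named opw. Both names are placeholders.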

Seeing delayed logs at the destination

Observability Pipelines destinations batch events before sending them to the downstream integration. For example, the Amazon S3, Google Cloud Storage, and Azure Storage destinations have a batch timeout of 900 seconds. If the other batch parameters (maximum events and maximum bytes) have not been met within the 900-second timeout, the batch is flushed at 900 seconds. This means the destination component can take up to 15 minutes to send out a batch of events to the downstream integration.
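The flush rule described above can be modeled as "send the batch as soon as any one limit is reached". The function below is a simplified sketch of that rule for reasoning about delays. It is not the Worker's implementation: the name should_flush is hypothetical, and treating a limit as reached at "greater than or equal" is an assumption.

```shell
# Simplified model of the batch flush rule; not the Worker's actual code.
# Usage: should_flush <events> <bytes> <elapsed_s> <max_events|None> <max_bytes> <timeout_s>
# Exit status 0 means "flush now", 1 means "keep buffering".
should_flush() {
  events="$1"; bytes="$2"; elapsed="$3"
  max_events="$4"; max_bytes="$5"; timeout="$6"
  # A limit of "None" means the event count never triggers a flush.
  if [ "$max_events" != "None" ] && [ "$events" -ge "$max_events" ]; then return 0; fi
  # Flush once the batch has grown to the byte limit.
  if [ "$bytes" -ge "$max_bytes" ]; then return 0; fi
  # Otherwise flush only when the batch timeout has elapsed.
  if [ "$elapsed" -ge "$timeout" ]; then return 0; fi
  return 1
}

# Amazon S3 limits from this guide: no event limit, 100,000,000 bytes, 900 seconds.
# A 2 MB batch that is 2 minutes old has reached neither limit.
if should_flush 5000 2000000 120 None 100000000 900; then
  echo "flush"
else
  echo "keep buffering"
fi
```

Under this model, a low-volume pipeline writing to an archive destination waits out the full timeout before each flush, which is where the delay of up to 15 minutes comes from.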

These are the batch parameters for each destination:

Destination                                 | Maximum Events | Maximum Bytes | Timeout (seconds)
Amazon OpenSearch                           | None           | 10,000,000    | 1
Amazon S3 (Datadog Log Archives)            | None           | 100,000,000   | 900
Azure Storage (Datadog Log Archives)        | None           | 100,000,000   | 900
Datadog Logs                                | 1,000          | 4,250,000     | 5
Elasticsearch                               | None           | 10,000,000    | 1
Google Chronicle                            | None           | 1,000,000     | 15
Google Cloud Storage (Datadog Log Archives) | None           | 100,000,000   | 900
OpenSearch                                  | None           | 10,000,000    | 1
Splunk HTTP Event Collector (HEC)           | None           | 1,000,000     | 1
Sumo Logic Hosted Collector                 | None           | 10,000,000    | 1

Note: The rsyslog and syslog-ng destinations do not batch events.

See event batching for more information.