Pipelines Usage Metrics

Overview

This document lists some of the metrics available from Observability Pipelines. You can:

Create your own dashboards, notebooks, and monitors with these metrics.
Use Metrics Summary to see metadata and tags available for the metrics. You can also see which dashboards, notebooks, monitors, and SLOs are using those metrics.

See Getting Started with Tags for more information on how to use tags to group metrics by specific pipelines, Workers, and components.

Estimated usage metric

Observability Pipelines ingested bytes: Metric: datadog.estimated_usage.observability_pipelines.ingested_bytes; Description: The volume of data ingested by Observability Pipelines. See Estimated Usage Metrics for more information.

Host metrics

Uptime: Metric: pipelines.host.uptime; Description: The amount of time since the host was started, in seconds.
Bytes in: Metric: pipelines.host.network_receive_bytes_total; Description: The number of bytes received by the host on all interfaces. Use the device tag to filter per interface, for example device:eth0.
Bytes out: Metric: pipelines.host.network_transmit_bytes_total; Description: The number of bytes sent by the host on all interfaces. Use the device tag to filter per interface.

Process metrics

Uptime: Metric: pipelines.uptime_seconds; Description: The amount of time since the Worker process was started, in seconds.
CPU usage: Metric: pipelines.cpu_usage_seconds_total; Description: The CPU time consumed by the Worker process, in seconds, across user and system space. The per-second rate of this metric shows the proportion of CPU used by the Worker.
Memory usage: Metric: pipelines.resident_memory_used_bytes; Description: The amount of RSS memory used by the Worker process in bytes.
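
As a rough illustration of how the per-second rate of pipelines.cpu_usage_seconds_total translates into CPU usage, the sketch below computes the rate from two samples of the cumulative counter. The function name, the sample values, and the counter-reset handling are assumptions made for this example and are not part of Observability Pipelines.

```python
def cpu_fraction(prev_cpu_seconds, curr_cpu_seconds, interval_seconds):
    """Average CPU time consumed per wall-clock second between two samples.

    Both samples are readings of a cumulative CPU-seconds counter, such as
    pipelines.cpu_usage_seconds_total.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_cpu_seconds - prev_cpu_seconds
    if delta < 0:
        # Assumption: a decreasing counter means the Worker restarted, so the
        # current reading is all the CPU time consumed since the restart.
        delta = curr_cpu_seconds
    return delta / interval_seconds


# Hypothetical samples taken 10 seconds apart.
print(cpu_fraction(120.0, 125.0, 10.0))  # 0.5
```

In this example, the Worker consumed 5 seconds of CPU time over a 10-second window, which works out to 0.5 CPU-seconds per second.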

Component metrics

These metrics are available for sources, processors, and destinations.

Use the component_id tag to filter or group by individual components.
Use the component_type tag to filter or group by the type of source, processor, or destination, such as quota for the Quota processor.
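
To illustrate filtering and grouping with these tags, the following sketch builds metric query strings in the general form aggregator:metric{tag:value} by {tag}. The build_query helper is hypothetical and written only for this example; verify the resulting queries in your own dashboards or monitors before relying on them.

```python
def build_query(metric, aggregator="sum", filters=None, group_by=None):
    """Build a metric query string with optional tag filters and grouping."""
    # With no filters, scope the query to everything using "*".
    scope = ",".join(f"{key}:{value}" for key, value in (filters or {}).items()) or "*"
    query = f"{aggregator}:{metric}{{{scope}}}"
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    return query


# Events sent by every Quota processor, split by individual component.
print(build_query(
    "pipelines.component_sent_events_total",
    filters={"component_type": "quota"},
    group_by=["component_id"],
))
# sum:pipelines.component_sent_events_total{component_type:quota} by {component_id}
```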

Events in: Metric: pipelines.component_received_events_total; Description: The number of events received by the component.; Available for: Sources, processors, and destinations.
Events out: Metric: pipelines.component_sent_events_total; Description: The number of events the component sends downstream.; Available for: Sources, processors, and destinations.
Event bytes in: Metric: pipelines.component_received_event_bytes_total; Description: The byte size of events received by the component.; Available for: Sources, processors, and destinations.
Event bytes out: Metric: pipelines.component_sent_event_bytes_total; Description: The byte size of events the component sends downstream.; Available for: Sources, processors, and destinations.
Errors: Metric: pipelines.component_errors_total; Description: The number of errors encountered by the component.; Available for: Sources, processors, and destinations.
Data dropped intentionally or unintentionally: Metric: pipelines.component_discarded_events_total; Description: The number of events dropped. Note: To break down this metric, use the intentional:true tag to filter for events that are intentionally dropped or the intentional:false tag for events that are not intentionally dropped.; Available for: Sources, processors, and destinations.
Timed out events: Metric: pipelines.component_timed_out_events_total; Description: The number of events that waited more than 5 seconds to be sent to the first processor and resulted in an HTTP 503 error. This can happen when delivery of events is blocked.; Available for: HTTP-based sources that have a configured timeout, such as the Datadog Agent.
Timed out requests: Metric: pipelines.component_timed_out_requests_total; Description: The number of requests that timed out for sources that send events to the Worker in batches using HTTP requests.; Available for: HTTP-based sources that have a configured timeout, such as the Datadog Agent.
Utilization: Metric: pipelines.utilization; Description: The component’s activity. A value of 0 indicates an idle component that is waiting for input. A value of 1 indicates a component that is never idle. Such a component is likely a bottleneck in the processing topology and creates backpressure, which might cause events to be dropped.; Available for: Processors and destinations.
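
The utilization values described above can be used to flag likely bottlenecks. The sketch below is a minimal example that assumes you have already collected the latest pipelines.utilization value for each component_id. The 0.9 threshold and the component IDs are hypothetical and should be tuned for your pipeline.

```python
def find_bottlenecks(utilization_by_component, threshold=0.9):
    """Return the IDs of components whose utilization is at or above the threshold.

    Values close to 1 mean the component is almost never idle, so it is a
    candidate bottleneck. The default threshold is an assumption.
    """
    return sorted(
        component_id
        for component_id, utilization in utilization_by_component.items()
        if utilization >= threshold
    )


# Hypothetical utilization readings keyed by component_id.
readings = {
    "quota-processor-1": 0.35,
    "datadog-logs-destination-1": 0.97,
    "filter-processor-1": 0.02,
}
print(find_bottlenecks(readings))  # ['datadog-logs-destination-1']
```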

Buffer metrics (when enabled)

Use these metrics to analyze buffer performance. All metrics are emitted on a one-second interval, unless otherwise stated.

Source buffer metrics

These metrics are specific to source buffers, located downstream of a source. Each source emits its own buffer metrics. Note: Source buffers are not configurable, but these metrics can help you monitor backpressure as it propagates to your pipeline’s source.

Use the component_id tag to filter or group by individual components.
Use the component_type tag to filter or group by the source type, such as splunk_hec for the Splunk HEC source.

pipelines.source_buffer_utilization: Description: Event count in a source’s buffer.; Metric type: histogram
pipelines.source_buffer_utilization_level: Description: Number of events in a source’s buffer.; Metric type: gauge
pipelines.source_buffer_utilization_mean: Description: The exponentially weighted moving average (EWMA) of the number of events in the source’s buffer.; Metric type: gauge
pipelines.source_buffer_max_size_events: Description: A source buffer’s maximum event capacity.; Metric type: gauge
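
One way to read these metrics together is to compare the current event count with the buffer's maximum capacity. The sketch below assumes you have the latest values of pipelines.source_buffer_utilization_level and pipelines.source_buffer_max_size_events for one source. The helper name and sample values are hypothetical.

```python
def buffer_fill_ratio(level_events, max_size_events):
    """Fraction of a buffer's event capacity that is currently in use."""
    if max_size_events <= 0:
        raise ValueError("max_size_events must be positive")
    return level_events / max_size_events


# Hypothetical readings: 250 events buffered out of a capacity of 1000.
print(buffer_fill_ratio(250, 1000))  # 0.25
```

A ratio that stays close to 1 suggests that backpressure has reached the source.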

Processor buffer metrics

These metrics are specific to processor buffers, located upstream of a processor. Each processor emits its own buffer metrics. Note: Processor buffers are not configurable, but these metrics can help you monitor backpressure as it propagates through your pipeline’s processors.

Use the component_id tag to filter or group by individual components.
Use the component_type tag to filter or group by the processor type, such as quota for the Quota processor.

pipelines.transform_buffer_utilization: Description: Event count in a processor’s buffer.; Metric type: histogram
pipelines.transform_buffer_utilization_level: Description: Event count in a processor’s buffer.; Metric type: gauge
pipelines.transform_buffer_utilization_mean: Description: The exponentially weighted moving average (EWMA) of the number of events in a processor’s buffer.; Metric type: gauge
pipelines.transform_buffer_max_size_events: Description: A processor buffer’s maximum event capacity.; Metric type: gauge

Destination buffer metrics

These metrics are specific to destination buffers, located upstream of a destination. Each destination emits its own buffer metrics.

Use the component_id tag to filter or group by individual components.
Use the component_type tag to filter or group by the destination type, such as datadog_logs for the Datadog Logs destination.

pipelines.buffer_size_events: Description: Number of events in a destination’s buffer.; Metric type: gauge
pipelines.buffer_size_bytes: Description: Number of bytes in a destination’s buffer.; Metric type: gauge
pipelines.buffer_received_events_total: Description: Events received by a destination’s buffer. Note: This metric represents the count per second and not the cumulative total, even though total is in the metric name.; Metric type: counter
pipelines.buffer_received_bytes_total: Description: Bytes received by a destination’s buffer. Note: This metric represents the count per second and not the cumulative total, even though total is in the metric name.; Metric type: counter
pipelines.buffer_sent_events_total: Description: Events sent downstream by a destination’s buffer. Note: This metric represents the count per second and not the cumulative total, even though total is in the metric name.; Metric type: counter
pipelines.buffer_sent_bytes_total: Description: Bytes sent downstream by a destination’s buffer. Note: This metric represents the count per second and not the cumulative total, even though total is in the metric name.; Metric type: counter
pipelines.buffer_discarded_events_total: Description: Events discarded by the buffer. Note: This metric represents the count per second and not the cumulative total, even though total is in the metric name.; Metric type: counter; Additional tags: intentional:true means an incoming event was dropped because the buffer was configured to drop the newest logs when it’s full. intentional:false means the event was dropped due to an error.
pipelines.buffer_discarded_bytes_total: Description: Bytes discarded by the buffer. Note: This metric represents the count per second and not the cumulative total, even though total is in the metric name.; Metric type: counter; Additional tags: intentional:true means an incoming event was dropped because the buffer was configured to drop the newest logs when it’s full. intentional:false means the event was dropped due to an error.
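
Because the received and discarded metrics above report per-second counts, they can be compared directly to estimate how much of a destination buffer's incoming traffic is being dropped. The sketch below assumes you have per-second values of pipelines.buffer_received_events_total and of pipelines.buffer_discarded_events_total split by the intentional tag. The function name and sample values are hypothetical.

```python
def discard_ratios(received_per_sec, intentional_per_sec, unintentional_per_sec):
    """Return (intentional, unintentional) discard ratios for a destination buffer.

    Each ratio is the share of events received by the buffer that were discarded.
    """
    if received_per_sec <= 0:
        # Nothing was received, so there is nothing to compare against.
        return 0.0, 0.0
    return (
        intentional_per_sec / received_per_sec,
        unintentional_per_sec / received_per_sec,
    )


# Hypothetical per-second values for one destination buffer.
intentional, unintentional = discard_ratios(2000, 500, 250)
print(intentional, unintentional)  # 0.25 0.125
```

A non-zero unintentional ratio corresponds to intentional:false, meaning events were dropped due to an error rather than by the buffer's configured drop behavior.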

Deprecated buffer metrics

These metrics are still emitted by the Observability Pipelines Worker for backwards compatibility. Datadog recommends using the replacements when possible.

pipelines.buffer_events: Description: Number of events in a destination’s buffer. Use pipelines.buffer_size_events instead.; Metric type: gauge
pipelines.buffer_byte_size: Description: Number of bytes in a destination’s buffer. Use pipelines.buffer_size_bytes instead.; Metric type: gauge