Pipelines

Pipelines Goal

A processing Pipeline takes a filtered subset of incoming logs and applies a list of sequential Processors to them.

Datadog automatically parses JSON-formatted logs. When your logs are not JSON-formatted, Datadog enables you to add value to your raw logs by sending them through a processing pipeline.
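
For instance, a JSON-formatted log such as the one below (the values are illustrative) requires no parsing rules; each field automatically becomes an attribute:

{
    "host": "host123",
    "service": "test-web-2",
    "message": "cannot establish connection with /api/v1/test"
}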

With pipelines, you parse and enrich your logs by chaining them sequentially through Processors. This lets you extract meaningful information or attributes from semi-structured text and reuse them as facets.

Each log that comes through the Pipelines is tested against every Pipeline filter. If it matches a Pipeline's filter, all of that Pipeline's Processors are applied sequentially before the log moves on to the next Pipeline.
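
To make this matching behavior concrete, here is a minimal Python sketch of the logic described above. All names and structures are illustrative, not Datadog internals:

from dataclasses import dataclass, field
from typing import Callable, Dict, List

Log = Dict[str, str]

@dataclass
class Pipeline:
    # A filter predicate plus an ordered list of Processors.
    matches: Callable[[Log], bool]
    processors: List[Callable[[Log], Log]] = field(default_factory=list)

def apply_pipelines(log: Log, pipelines: List[Pipeline]) -> Log:
    # Every log is tested against every Pipeline's filter; on a match,
    # that Pipeline's Processors run in order before the next Pipeline.
    for pipeline in pipelines:
        if pipeline.matches(log):
            for processor in pipeline.processors:
                log = processor(log)
    return log

# Example: a Pipeline that only handles logs from a hypothetical "web"
# service and uppercases their status attribute.
web_pipeline = Pipeline(
    matches=lambda log: log.get("service") == "web",
    processors=[lambda log: {**log, "status": log.get("status", "").upper()}],
)
print(apply_pipelines({"service": "web", "status": "error"}, [web_pipeline]))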

For instance, a processing Pipeline can transform this log:

original log

into this log:

Log post severity

With a single Pipeline:

Pipelines example

Pipelines take logs from a wide variety of formats and translate them into a common format in Datadog.

For instance, a first Pipeline can be defined to extract the application log prefix, and then each team is free to define its own Pipeline to process the rest of the log message.
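
As a hypothetical illustration, such a first Pipeline could use a Grok Processor with a parsing rule along these lines (the rule name, matchers, and attribute names are assumptions for this example):

extract_prefix %{date("yyyy-MM-dd HH:mm:ss"):timestamp} %{word:severity} \[%{word:team}\] %{data:msg}

Applied to a log line such as 2019-06-25 08:50:12 INFO [checkout] Order 42 confirmed, this would extract timestamp, severity, and team as attributes and leave the remainder in msg for a team-specific Pipeline to process.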

Pipeline filters

Filters let you limit what kinds of logs a Pipeline applies to.

The filter syntax is the same as in the search bar.
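
For example, with hypothetical service and source names:

service:coffee-house
source:nginx status:error

The first query restricts a Pipeline to logs from the coffee-house service; the second restricts it to nginx logs with error status.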

Be aware that Pipeline filtering is applied before any of the Pipeline's Processors run, so you cannot filter on an attribute that is extracted by the Pipeline itself.

The logstream shows which logs your Pipeline applies to:

Pipelines filters

Special Pipelines

Reserved attribute Pipeline

Datadog has a list of reserved attributes, such as timestamp, status, host, service, and the log message itself; these attributes have specific behavior within Datadog. If your JSON logs use different names for these attributes, use the reserved attribute Pipeline to remap them to the reserved ones.

For example, take a service that generates the following log:

{
    "myhost": "host123",
    "myapp": "test-web-2",
    "logger_severity": "Error",
    "log": "cannot establish connection with /api/v1/test",
    "status_code": 500
}

Go into the reserved attribute Pipeline and change the default mapping to this one:

Reserved attribute remapper

This then produces the following log:

Log post remapping

If you want to remap an attribute to one of the reserved attributes in a custom Pipeline, use the Log Status Remapper or the Log Date Remapper.
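
As a sketch of how such a custom Pipeline can be created programmatically, the following Python snippet assumes the v1 Logs Pipelines API endpoint and a status-remapper Processor reading from the logger_severity attribute; the Pipeline name, filter query, and environment variable names are illustrative:

import os
import requests

# Sketch only: assumes the v1 Logs Pipelines endpoint and the
# "status-remapper" processor type; names and queries are illustrative.
response = requests.post(
    "https://api.datadoghq.com/api/v1/logs/config/pipelines",
    headers={
        "DD-API-KEY": os.environ["DD_API_KEY"],
        "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
    },
    json={
        "name": "test-web severity remapping",
        "is_enabled": True,
        "filter": {"query": "service:test-web-2"},
        "processors": [
            {
                "type": "status-remapper",
                "name": "Remap logger_severity to status",
                "is_enabled": True,
                "sources": ["logger_severity"],
            }
        ],
    },
)
response.raise_for_status()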

Integration Pipelines

Datadog’s integration processing Pipelines are automatically enabled and parse your logs in appropriate and useful ways, so you get maximum value from many log sources without any manual setup. Note that these Pipelines are read-only, but you can clone them and then edit the clone:

Cloning pipeline

Pipelines limitations

To keep the Log Management solution functioning optimally, Datadog applies the following technical limits and rules to your log events, as well as to some product features. These limits are designed to be high enough that you should never reach them.

Limits applied to ingested log events

  • The size of a log event should not exceed 25K bytes.
  • Log events can be submitted up to 6h in the past and 2h in the future.
  • Once converted to JSON format, a log event should contain fewer than 256 attributes. Each attribute key should be less than 50 characters long and nested in fewer than 10 successive levels, and its value should be less than 1024 characters if promoted as a facet.
  • A log event should not have more than 100 tags, and each tag should not exceed 256 characters, for a maximum of 10 million unique tags per day.

Log events that do not comply with these limits might be transformed or truncated by the system, or simply not indexed if they fall outside the provided time range. However, Datadog always does its best to preserve as much of the provided user data as possible.
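
If you want to guard against these limits before shipping logs, a client-side check along the following lines can help. This is a minimal sketch: the thresholds mirror the list above, but it only counts top-level attributes, which is a simplification of the real rules:

import json

# Thresholds taken from the limits listed above.
MAX_EVENT_BYTES = 25_000
MAX_ATTRIBUTES = 256
MAX_TAGS = 100
MAX_TAG_LENGTH = 256

def check_log_event(event: dict, tags: list) -> list:
    # Return a list of limit violations for a log event (sketch only).
    problems = []
    if len(json.dumps(event).encode("utf-8")) > MAX_EVENT_BYTES:
        problems.append("event larger than 25K bytes")
    if len(event) > MAX_ATTRIBUTES:
        problems.append("more than 256 attributes")
    if len(tags) > MAX_TAGS:
        problems.append("more than 100 tags")
    problems += [f"tag longer than 256 characters: {tag[:20]}..."
                 for tag in tags if len(tag) > MAX_TAG_LENGTH]
    return problems

print(check_log_event({"message": "hello"}, ["env:prod"]))  # prints []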

Limits applied to provided features

  • The maximum number of facets is 100.
  • The maximum number of processing Pipelines per platform is 100.
  • The maximum number of Processors per Pipeline is 20.
  • The maximum number of parsing rules within a Grok Processor is 10. Datadog reserves the right to disable underperforming parsing rules that might impact service performance.

Contact support if you reach one of these limits, as Datadog might be able to raise them for you.

Further Reading