Datadog automatically parses JSON-formatted logs. When logs are not JSON-formatted, you can add values to your raw logs by sending them through a processing pipeline.
With pipelines, logs are parsed and enriched by chaining them sequentially through processors. This extracts meaningful information or attributes from semi-structured text to reuse as facets.
Each log that comes through the pipelines is tested against every pipeline filter. If it matches a pipeline's filter, all of that pipeline's processors are applied sequentially before the log moves on to the next pipeline.
For instance, a single processing pipeline can transform a raw, unstructured log into a structured log with extracted attributes.
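As an illustrative sketch (the log content and attribute names below are hypothetical, not a specific built-in example), a pipeline could take a raw line such as:

```
2023-10-26 09:02:29 INFO Sent request to /api/v1/checkout for user 123456 in 145 ms
```

and, with a grok parser plus remappers, turn it into a structured log along these lines:

```json
{
  "timestamp": "2023-10-26T09:02:29Z",
  "status": "info",
  "http": { "url": "/api/v1/checkout" },
  "user": { "id": "123456" },
  "duration": 145,
  "message": "Sent request to /api/v1/checkout for user 123456 in 145 ms"
}
```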
Pipelines take logs from a wide variety of formats and translate them into a common format in Datadog.
For instance, a first pipeline can be defined to extract the application log prefix, and then each team is free to define its own pipeline to process the rest of the log message.
Filters let you limit what kinds of logs a pipeline applies to.
The filter syntax is the same as in the search bar.
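For example (the service and source names here are placeholders), a pipeline filter could be:

```
source:python
service:payments-api status:error
```

Any log matching the filter query enters the pipeline; all other logs skip it.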
Note: Pipeline filtering is applied before any of the pipeline’s processors, so you cannot filter on an attribute that is extracted within the pipeline itself.
The logstream shows which logs your pipeline applies to.
Nested pipelines are pipelines within a pipeline. Use nested pipelines to split the processing into two steps. For example, first apply high-level filtering, such as by team, and then apply a second level of filtering based on the integration, service, or any other tag or attribute.
A pipeline can contain nested pipelines and processors, whereas a nested pipeline can only contain processors.
It is possible to drag and drop a pipeline into another pipeline to transform it into a nested pipeline.
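If you manage pipelines through the Logs Pipelines API rather than the UI, a nested pipeline appears as a processor of type pipeline inside its parent pipeline. The payload below is a rough sketch under that assumption; the names, filter queries, and grok rule are made up:

```json
{
  "name": "Team A application logs",
  "is_enabled": true,
  "filter": { "query": "team:team-a" },
  "processors": [
    {
      "type": "pipeline",
      "name": "Checkout service logs",
      "is_enabled": true,
      "filter": { "query": "service:checkout" },
      "processors": [
        {
          "type": "grok-parser",
          "name": "Parse checkout messages",
          "is_enabled": true,
          "source": "message",
          "grok": {
            "support_rules": "",
            "match_rules": "checkout_rule %{word:level} %{data:msg}"
          }
        }
      ]
    }
  ]
}
```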
Preprocessing of JSON logs occurs before logs enter pipeline processing. Preprocessing runs a series of operations based on reserved attributes, such as timestamp, status, host, service, and message. If you have different attribute names in your JSON logs, use preprocessing to map your log attribute names to those in the reserved attribute list.
For example, consider a service that generates this log:
```json
{
  "myhost": "host123",
  "myapp": "test-web-2",
  "logger_severity": "Error",
  "log": "cannot establish connection with /api/v1/test",
  "status_code": 500
}
```
JSON log preprocessing comes with a default configuration that works for standard log forwarders. Edit this configuration to adapt it to custom or specific log forwarding approaches.
Open Preprocessing for JSON logs and change the default mapping so that the reserved attributes are read from the custom attribute names above. This produces a log whose reserved attributes are populated from those fields, as sketched below.
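For instance, if the mapping is changed so that host is read from myhost, service from myapp, status from logger_severity, and message from log, the example log above would come out roughly as follows (a sketch of the remapped reserved attributes, not the exact output):

```json
{
  "host": "host123",
  "service": "test-web-2",
  "status": "error",
  "message": "cannot establish connection with /api/v1/test",
  "status_code": 500
}
```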
Note: Preprocessing JSON logs is the only way to define one of your log attributes as host for your logs.
If a JSON formatted log file includes the ddsource attribute, Datadog interprets its value as the log’s source. To use the same source names Datadog uses, see the Integration Pipeline Library.
Note: Logs coming from a containerized environment require the use of an environment variable to override the default source and service values.
Using the Datadog Agent or the RFC5424 format automatically sets the host value on your logs. However, if a JSON formatted log file includes the following attribute, Datadog interprets its value as the log’s host:
host
hostname
syslog.hostname
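For example, a log shipped as the following JSON (a made-up payload) would have its host set to i-0123456789abcdef:

```json
{
  "hostname": "i-0123456789abcdef",
  "message": "Starting worker process"
}
```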
By default, Datadog generates a timestamp and appends it in a date attribute when logs are received. However, if a JSON formatted log file includes one of the following attributes, Datadog interprets its value as the log’s official date:
@timestamp
timestamp
_timestamp
Timestamp
eventTime
date
published_date
syslog.timestamp
Specify alternate attributes to use as the source of a log’s date by setting a log date remapper processor.
Note: Datadog rejects a log entry if its official date is older than 18 hours in the past.
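As a sketch of what such a remapper could look like when pipelines are managed through the Logs Pipelines API (the published_at attribute name is hypothetical):

```json
{
  "type": "date-remapper",
  "name": "Define published_at as the official date of the log",
  "is_enabled": true,
  "sources": ["published_at"]
}
```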
By default, Datadog ingests the message value as the body of the log entry. That value is then highlighted and displayed in the logstream, where it is indexed for full text search.
Specify alternate attributes to use as the source of a log’s message by setting a log message remapper processor.
Each log entry may specify a status level which is made available for faceted search within Datadog. However, if a JSON formatted log file includes one of the following attributes, Datadog interprets its value as the log’s official status:
status
severity
level
syslog.severity
To remap a status existing in the status attribute, use the log status remapper.
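Continuing the earlier preprocessing example, a status remapper that reads the level from logger_severity follows the same shape as the date remapper sketched above (again assuming the API representation; the attribute name is hypothetical):

```json
{
  "type": "status-remapper",
  "name": "Define logger_severity as the official status of the log",
  "is_enabled": true,
  "sources": ["logger_severity"]
}
```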
Using the Datadog Agent or the RFC5424 format automatically sets the service value on your logs. However, if a JSON formatted log file includes the following attribute, Datadog interprets its value as the log’s service:
service
syslog.appname
Specify alternate attributes to use as the source of a log’s service by setting a log service remapper processor.
By default, Datadog tracers can automatically inject trace and span IDs into your logs. However, if a JSON formatted log includes one of the following attributes, Datadog interprets its value as the log’s trace_id:
dd.trace_id
contextMap.dd.trace_id
Specify alternate attributes to use as the source of a log’s trace ID by setting a trace ID remapper processor.
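For example, a log emitted with tracer log injection enabled might carry the IDs like this (the values are made up):

```json
{
  "message": "GET /api/v1/test 500",
  "dd.trace_id": "7234567890123456789",
  "dd.span_id": "1234567890987654321"
}
```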
Integration processing pipelines are available for certain sources when they are set up to collect logs. For integration logs, an integration pipeline is automatically installed that takes care of parsing your logs and adds the corresponding facets to your Logs Explorer.
These pipelines are read-only and parse out your logs in ways appropriate for the particular source. To edit an integration pipeline (for example, the ELB logs pipeline), clone it and then edit the clone.
To see the full list of integration pipelines that Datadog offers, browse the integration pipeline library. The pipeline library shows how Datadog processes different log formats by default.
To use an integration pipeline, Datadog recommends installing the integration by configuring the corresponding log source. Once Datadog receives the first log with this source, the installation is automatically triggered and the integration pipeline is added to the processing pipelines list. To configure the log source, refer to the corresponding integration documentation.
It’s also possible to copy an integration pipeline using the copy button.