This feature is available for Agent versions 7.65.0 and later. For older Agent versions, or to explicitly enable the legacy implementation, see Auto Multi-line Detection and Aggregation (Legacy).

Overview

Automatic multi-line detection allows the Agent to detect and aggregate common multi-line logs automatically.

Getting started

To enable the Auto multi-line feature in your Agent configuration, set auto_multi_line_detection to true in your configuration file, or set the DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true environment variable:

logs_config:
  auto_multi_line_detection: true
DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION=true

Default settings

By default, the following features are enabled:

  • enable_datetime_detection: Enables automatic datetime detection. Lines that begin with a recognized datetime format are treated as the start of a new log and used to aggregate the lines that follow.
  • enable_json_detection: Enables JSON detection and rejection. JSON-structured logs are never aggregated.

You can disable these features by setting the following to false in your configuration file or with environment variables:

logs_config:
  auto_multi_line:
    enable_datetime_detection: false
    enable_json_detection: false
DD_LOGS_CONFIG_AUTO_MULTI_LINE_ENABLE_DATETIME_DETECTION=false
DD_LOGS_CONFIG_AUTO_MULTI_LINE_ENABLE_JSON_DETECTION=false

Enable multi-line aggregation per integration

You can enable or disable multi-line aggregation for a specific integration’s log collection:

logs:
  - type: file
    path: /my/test/file.log
    service: testApp
    source: java
    auto_multi_line_detection: false

Supported datetime formats

Auto multi-line detection uses an algorithm to detect any datetime format that occurs in the first 60 bytes of a log line. To prevent false positives, the algorithm requires enough context to consider a datetime format a match.

Your datetime format must include both a date and time component to be detected.

Examples of valid formats that include enough context to be detected:

  • 2021-03-28 13:45:30
  • 2023-03-28T14:33:53.743350Z
  • Jun 14 15:16:01
  • 2024/05/16 19:46:15

Examples of formats that do not have enough context to be detected:

  • 12:30:2017
  • 12:30:20
  • 2024/05/16
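The date-plus-time requirement can be illustrated with the sketch below. This is not the Agent's actual detector, which is tokenizer-based and recognizes many more formats; the single regex here only shows why the valid examples above match while the short ones do not.

```python
import re

# Illustrative sketch only: a line counts as a log start only when BOTH a
# date and a time component appear within the first 60 bytes.
DATETIME = re.compile(
    r"(\d{4}[-/]\d{2}[-/]\d{2}"        # 2021-03-28 or 2024/05/16
    r"|[A-Z][a-z]{2} +\d{1,2})"        # Jun 14
    r"[ T]"                            # date/time separator
    r"\d{2}:\d{2}:\d{2}"               # 13:45:30
)

def looks_like_log_start(line: str) -> bool:
    """Return True when a date-plus-time stamp occurs in the first 60 bytes."""
    return DATETIME.search(line[:60]) is not None

print(looks_like_log_start("2021-03-28 13:45:30 INFO starting"))  # True
print(looks_like_log_start("12:30:20 time only, no date"))        # False
print(looks_like_log_start("2024/05/16 date only"))               # False
```

A time-only or date-only stamp does not provide enough context, so those lines fall through to the default aggregation behavior.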

Custom pattern configuration

If datetime aggregation is insufficient or your format is too short to be detected automatically, you can customize the feature in two ways:

Custom samples

A custom sample is a sample log line that you want to trigger aggregation on. For example, if you want to aggregate a stack trace, the first line of the stack trace is a good sample to provide. Custom samples are an easier way to aggregate logs than regex patterns.

To configure custom samples, you can use the logs_config in your datadog.yaml file or set an environment variable. In the following example, multi-line detection looks for the sample "SEVERE Main main Exception occurred":

logs_config:
  auto_multi_line_detection_custom_samples:
    - sample: "SEVERE Main main Exception occurred"
DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION_CUSTOM_SAMPLES='[{"sample": "SEVERE Main main Exception occurred"}]'

This aggregates logs where "SEVERE Main main Exception occurred" matches the first line. For example:

SEVERE Main main Exception occurred
java.lang.Exception: Something bad happened!
    at Main.funcd(Main.java:50)
    at Main.funcc(Main.java:49)
    at Main.funcb(Main.java:48)
    at Main.funca(Main.java:47)
    at Main.main(Main.java:29)

How custom samples work

Custom sample matching tokenizes both the first 60 bytes of each log line and the provided sample. Tokens include:

  • words and their length
  • whitespace
  • numbers and their length
  • special characters
  • datetime components

Each log token is compared to each token in the sample. If 75% of the log’s tokens match the sample’s, the log is marked for aggregation. Datadog recommends using sample-based matching if your logs have a stable format. If you need more flexible matching, you can use regex.
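The matching described above can be sketched as follows. This is a hypothetical simplification, not the Agent's real tokenizer: tokens here are words, numbers, whitespace runs, and single special characters, and two tokens "match" when they are the same kind and length.

```python
import re

# Hypothetical sketch of sample-based matching; the Agent's real tokenizer
# differs in detail.
def tokenize(text: str) -> list:
    # Words, numbers, whitespace runs, and single special characters,
    # limited to the first 60 bytes of input.
    return re.findall(r"[A-Za-z]+|\d+|\s+|.", text[:60])

def kind(tok: str):
    # Two tokens "match" when they agree on kind (word/number/other) and length.
    return (tok.isalpha(), tok.isdigit(), len(tok))

def matches_sample(log_line: str, sample: str, threshold: float = 0.75) -> bool:
    log_toks, sample_toks = tokenize(log_line), tokenize(sample)
    matched = sum(1 for a, b in zip(log_toks, sample_toks) if kind(a) == kind(b))
    return bool(log_toks) and matched / len(log_toks) >= threshold
```

Because tokens are compared by kind and length rather than literal text, a line such as "SEVERE Core main Exception occurred" still matches the sample "SEVERE Main main Exception occurred" in this sketch.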

Regex patterns

Regex patterns work similarly to a multi_line processing rule: if the regex pattern matches a log line, that line is treated as the start of a new multi-line log.

To configure custom regex patterns, you can use the logs_config in your datadog.yaml file or set an environment variable.

logs_config:
  auto_multi_line_detection_custom_samples:
    - regex: "\\[\\w+\\] Main main Exception occurred"
DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION_CUSTOM_SAMPLES='[{"regex": "\\[\\w+\\] Main main Exception occurred"}]'

You can mix samples and regex patterns to support multiple log formats:

logs_config:
  auto_multi_line_detection_custom_samples:
    - sample: "CORE | INFO | (pkg/logs/"
    - regex: "\\d{4}dog.\\s\\w+"
    - sample: "[ERR] Exception"
      label: no_aggregate
DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION_CUSTOM_SAMPLES='[
  {"sample": "CORE | INFO | (pkg/logs/"},
  {"regex": "\\d{4}dog.\\s\\w+"},
  {"sample": "[ERR] Exception", "label": "no_aggregate"}
]'

Note: Existing auto_multi_line_extra_patterns configurations are automatically supported when migrating from V1.

Advanced customization

Auto multi-line detection uses a labeled aggregation system to aggregate logs. The detection step assigns a label to each log, and the aggregation step aggregates logs based on those labels.

Labels

  • start_group: Defines the beginning of a multi-line log.
      - Flushes any buffered multi-line log, if present.
      - Starts a new multi-line log.
      - Only one multi-line log can be buffered at a time.
  • aggregate: Added to the existing multi-line log.
      - If no multi-line log exists, the line is flushed immediately.
      - Default label when nothing else matches.
  • no_aggregate: Declares logs that are never part of an aggregation.
      - Flushes the buffered multi-line log, if present.
      - Flushes the line immediately.
      - Used for JSON logs.
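The label rules above amount to a small state machine with at most one buffered multi-line log. The following is a minimal sketch of that aggregation step, not the Agent's actual implementation: each line arrives already labeled, and the aggregator buffers, appends, or flushes accordingly.

```python
# Minimal sketch of labeled aggregation (illustrative, not the Agent's code).
def aggregate(labeled_lines):
    output, buffer = [], None
    for label, line in labeled_lines:
        if label == "start_group":
            if buffer is not None:
                output.append(buffer)      # flush any buffered multi-line log
            buffer = line                  # start a new multi-line log
        elif label == "aggregate":
            if buffer is None:
                output.append(line)        # nothing buffered: flush immediately
            else:
                buffer += "\n" + line      # append to the buffered log
        else:                              # no_aggregate
            if buffer is not None:
                output.append(buffer)      # flush the buffered log first
                buffer = None
            output.append(line)            # never aggregate this line
    if buffer is not None:
        output.append(buffer)              # flush whatever remains at the end
    return output
```

For example, a start_group line followed by aggregate lines produces one combined log, while a no_aggregate line (such as a JSON log) is always emitted on its own.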

Label configuration

You can provide custom labels to each regex or sample to change the aggregation behavior based on the label rules. This is useful if you want to explicitly include or exclude certain log formats in multi-line aggregation.

logs_config:
  auto_multi_line_detection_custom_samples:
    # Never aggregate these formats
    - sample: "some service we should not aggregate"
      label: no_aggregate
    - regex: \w*\s(data|dog)
      label: no_aggregate
DD_LOGS_CONFIG_AUTO_MULTI_LINE_DETECTION_CUSTOM_SAMPLES='[
  {"sample": "some service we should not aggregate", "label": "no_aggregate"},
  {"regex": "\\w*\\s(data|dog)", "label": "no_aggregate"}
]'

Monitoring and debugging

You can search for multi-line or truncated logs by enabling the following settings:

logs_config:
  tag_multi_line_logs: true
  tag_truncated_logs: true

These settings add the following tags to your logs, allowing you to search for them in the logs explorer:

  • multiline: Shows the aggregation source (for example, auto_multiline, multiline_regex)
  • truncated: Shows truncation source (for example, single_line, multi_line)

Note: The Agent truncates logs that are too long to process. If a line is too long before multiline aggregation, the Agent assigns it the single_line tag. If an incorrect pattern causes a log to overflow the aggregation buffer, the Agent applies the multi_line tag.

Configuration reference

| Setting | Type | Default | Description |
|---------|------|---------|-------------|
| logs_config.auto_multi_line_detection_custom_samples | Object | Empty | Custom samples and regex patterns |
| logs_config.auto_multi_line.enable_json_detection | Bool | True | Enable JSON detection and rejection |
| logs_config.auto_multi_line.enable_datetime_detection | Bool | True | Enable datetime detection |
| logs_config.auto_multi_line.timestamp_detector_match_threshold | Float | 0.5 | Timestamp matching threshold |
| logs_config.auto_multi_line.tokenizer_max_input_bytes | Int | 60 | Number of bytes to tokenize |

Further reading