Datadog automatically parses JSON-formatted logs. For other formats, Datadog allows you to enrich your logs with the help of Grok Parser. The Grok syntax provides an easier way to parse logs than pure regular expressions. The Grok Parser enables you to extract attributes from semi-structured text messages.
Grok comes with reusable patterns to parse integers, IP addresses, hostnames, etc. These values must be sent into the grok parser as strings.
You can write parsing rules with the %{MATCHER:EXTRACT:FILTER} syntax:
- Matcher: A rule (possibly a reference to another token rule) that describes what to expect (number, word, notSpace, and so on).
- Extract (optional): An identifier representing the capture destination for the piece of text matched by the Matcher.
- Filter (optional): A post-processor of the match to transform it.
Example for a classic unstructured log:
john connected on 11/08/2017
With the following parsing rule:
MyParsingRule %{word:user} connected on %{date("MM/dd/yyyy"):date}
After processing, the following structured log is generated:
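As a sketch, the extracted attributes would look similar to the following (the date value is an epoch timestamp in milliseconds, assuming UTC):
{
  "user": "john",
  "date": 1510099200000
}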
Note:
- Rule names must contain only alphanumeric characters, _ (underscore), and . (period). They must start with an alphanumeric character.
- Use ^ to match the start of a string, and $ to match the end of a string.
- Use \n and \s+ to account for newlines and whitespace.
Here is a list of all the matchers and filters natively implemented by Datadog:
date("pattern"[, "timezoneId"[, "localeId"]])
regex("pattern")
notSpace
boolean("truePattern", "falsePattern")
true
and false
, ignoring case).numberStr
number
numberExtStr
numberExt
integerStr
integer
integerExtStr
integerExt
word
_
(underscore) character; and ends with a word boundary. Equivalent to \b\w+\b
in regex.doubleQuotedString
singleQuotedString
quotedString
uuid
mac
ipv4
ipv6
ip
hostname
ipOrHost
port
data
.*
in regex. Use when none of above patterns is appropriate.number
integer
boolean
nullIf("value")
json
rubyhash
{name => "John", "job" => {"company" => "Big Company", "title" => "CTO"}}
useragent([decodeuricomponent:true/false])
querystring
?productId=superproduct&promotionCode=superpromo
).decodeuricomponent
%2Fservice%2Ftest
into /service/test
.lowercase
uppercase
keyvalue([separatorStr[, characterAllowList[, quotingStr[, delimiter]]]])
xml
csv(headers[, separator[, quotingcharacter]])
scale(factor)
array([[openCloseStr, ] separator][, subRuleOrFilter)
url
At the bottom of your Grok processor tiles, there is an Advanced Settings section:
Use the Extract from field to apply your Grok processor on a given text attribute instead of the default message attribute.
For example, consider a log containing a command.line attribute that should be parsed as a key-value. You could parse this log as follows:
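A minimal sketch of such a rule (the rule name is illustrative), with Extract from set to command.line:
command_line %{data::keyvalue}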
Use the Helper Rules field to define tokens for your parsing rules. Helper rules help you to factorize Grok patterns across your parsing rules. This is useful when you have several rules in the same Grok parser that use the same tokens.
Example for a classic unstructured log:
john id:12345 connected on 11/08/2017 on server XYZ in production
Use the following parsing rule:
MyParsingRule %{user} %{connection} %{server}
With the following helpers:
user %{word:user.name} id:%{integer:user.id}
connection connected on %{date("MM/dd/yyyy"):connect_date}
server on server %{notSpace:server.name} in %{notSpace:server.env}
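For reference, this would produce a structured log along these lines (connect_date shown as an epoch timestamp in milliseconds, assuming UTC):
{
  "user": { "name": "john", "id": 12345 },
  "connect_date": 1510099200000,
  "server": { "name": "XYZ", "env": "production" }
}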
Some examples demonstrating how to use parsers:
This is the key-value core filter: keyvalue([separatorStr[, characterAllowList[, quotingStr[, delimiter]]]]), where:
- separatorStr: defines the separator between keys and values. Defaults to =.
- characterAllowList: defines extra non-escaped value characters in addition to the default \\w.\\-_@. Used only for non-quoted values (for example, key=@valueStr).
- quotingStr: defines quotes, replacing the default quotes detection: <>, "", ''.
- delimiter: defines the separator between the different key-value pairs (for example, | is the delimiter in key1=value1|key2=value2). Defaults to the normal space character, , and ;.
Use filters such as keyvalue to more easily map strings to attributes for keyvalue or logfmt formats:
Log:
user=john connect_date=11/08/2017 id=123 action=click
Rule:
rule %{data::keyvalue}
You don’t need to specify the name of your parameters as they are already contained in the log.
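The resulting attributes would be along these lines (values shown as strings; the exact typing may differ):
{
  "user": "john",
  "connect_date": "11/08/2017",
  "id": "123",
  "action": "click"
}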
If you add an extract attribute my_attribute in your rule pattern, you will see:
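For example, with a rule such as rule %{data:my_attribute:keyvalue}, the pairs would be nested under that attribute, along these lines:
{
  "my_attribute": {
    "user": "john",
    "connect_date": "11/08/2017",
    "id": "123",
    "action": "click"
  }
}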
If = is not the separator between your keys and values, add a separator parameter in your parsing rule:
Log:
user: john connect_date: 11/08/2017 id: 123 action: click
Rule:
rule %{data::keyvalue(": ")}
If logs contain special characters in an attribute value, such as / in a URL, add them to the allowlist in the parsing rule:
Log:
url=https://app.datadoghq.com/event/stream user=john
Rule:
rule %{data::keyvalue("=","/:")}
Other examples:
Raw string | Parsing rule | Result |
---|---|---|
key=valueStr | %{data::keyvalue} | {"key": "valueStr"} |
key=<valueStr> | %{data::keyvalue} | {"key": "valueStr"} |
"key"="valueStr" | %{data::keyvalue} | {"key": "valueStr"} |
key:valueStr | %{data::keyvalue(":")} | {"key": "valueStr"} |
key:"/valueStr" | %{data::keyvalue(":", "/")} | {"key": "/valueStr"} |
/key:/valueStr | %{data::keyvalue(":", "/")} | {"/key": "/valueStr"} |
key:={valueStr} | %{data::keyvalue(":=", "", "{}")} | {"key": "valueStr"} |
key1=value1|key2=value2 | %{data::keyvalue("=", "", "", "|")} | {"key1": "value1", "key2": "value2"} |
key1="value1"|key2="value2" | %{data::keyvalue("=", "", "", "|")} | {"key1": "value1", "key2": "value2"} |
Multiple quotingStr example: when multiple quoting strings are defined, the default behavior is replaced with the defined quoting characters.
The keyvalue filter always matches inputs without any quoting characters, regardless of what is specified in quotingStr. When quoting characters are used, the characterAllowList is ignored because everything between the quoting characters is extracted.
Log:
key1:=valueStr key2:=</valueStr2> key3:="valueStr3"
Rule:
rule %{data::keyvalue(":=","","<>")}
Result:
{"key1": "valueStr", "key2": "/valueStr2"}
Note:
- Empty values (key=) or null values (key=null) are not displayed in the output JSON.
- If you define a keyvalue filter on a data object and the filter is not matched, an empty JSON {} is returned (for example, input: key:=valueStr, parsing rule: rule_test %{data::keyvalue("=")}, output: {}).
- Defining "" as quotingStr keeps the default configuration for quoting.
The date matcher transforms your timestamp into the EPOCH format (in milliseconds).
Raw string | Parsing rule | Result |
---|---|---|
14:20:15 | %{date("HH:mm:ss"):date} | {"date": 51615000} |
02:20:15 PM | %{date("hh:mm:ss a"):date} | {"date": 51615000} |
11/10/2014 | %{date("dd/MM/yyyy"):date} | {"date": 1412978400000} |
Thu Jun 16 08:29:03 2016 | %{date("EEE MMM dd HH:mm:ss yyyy"):date} | {"date": 1466065743000} |
Tue Nov 1 08:29:03 2016 | %{date("EEE MMM d HH:mm:ss yyyy"):date} | {"date": 1477988943000} |
06/Mar/2013:01:36:30 +0900 | %{date("dd/MMM/yyyy:HH:mm:ss Z"):date} | {"date": 1362501390000} |
2016-11-29T16:21:36.431+0000 | %{date("yyyy-MM-dd'T'HH:mm:ss.SSSZ"):date} | {"date": 1480436496431} |
2016-11-29T16:21:36.431+00:00 | %{date("yyyy-MM-dd'T'HH:mm:ss.SSSZZ"):date} | {"date": 1480436496431} |
06/Feb/2009:12:14:14.655 | %{date("dd/MMM/yyyy:HH:mm:ss.SSS"):date} | {"date": 1233922454655} |
2007-08-31 19:22:22.427 ADT | %{date("yyyy-MM-dd HH:mm:ss.SSS z"):date} | {"date": 1188598942427} |
Thu Jun 16 08:29:03 2016 ¹ | %{date("EEE MMM dd HH:mm:ss yyyy","Europe/Paris"):date} | {"date": 1466058543000} |
Thu Jun 16 08:29:03 2016 ¹ | %{date("EEE MMM dd HH:mm:ss yyyy","UTC+5"):date} | {"date": 1466047743000} |
Thu Jun 16 08:29:03 2016 ¹ | %{date("EEE MMM dd HH:mm:ss yyyy","+3"):date} | {"date": 1466054943000} |
¹ Use the timezone parameter if you perform your own localizations and your timestamps are not in UTC.
The supported formats for timezones are:
- GMT, UTC, UT, or Z
- +h, +hh, +hh:mm, -hh:mm, +hhmm, -hhmm, +hh:mm:ss, -hh:mm:ss, +hhmmss, or -hhmmss. The maximum supported range is from +18:00 to -18:00 inclusive.
- Timezones starting with UTC+, UTC-, GMT+, GMT-, UT+, or UT-. The maximum supported range is from +18:00 to -18:00 inclusive.
Note: Parsing a date doesn't set its value as the log official date. For this, use the Log Date Remapper in a subsequent processor.
If you have logs with two possible formats that differ in only one attribute, set a single rule using alternation: (<REGEX_1>|<REGEX_2>). This rule is equivalent to a Boolean OR.
Log:
john connected on 11/08/2017
12345 connected on 11/08/2017
Rule (note that user.id is extracted as an integer, not a string):
MyParsingRule (%{integer:user.id}|%{word:user.firstname}) connected on %{date("MM/dd/yyyy"):connect_date}
Results:
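The two logs would produce results along these lines (connect_date shown as an epoch timestamp in milliseconds, assuming UTC):
{ "user": { "firstname": "john" }, "connect_date": 1510099200000 }
{ "user": { "id": 12345 }, "connect_date": 1510099200000 }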
Some logs contain values that only appear part of the time. In this case, make attribute extraction optional with ()?.
Log:
john 1234 connected on 11/08/2017
Rule:
MyParsingRule %{word:user.firstname} (%{integer:user.id} )?connected on %{date("MM/dd/yyyy"):connect_date}
Note: A rule will not match if you include a space after the first word in the optional section.
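As a sketch, the example log would be parsed along these lines (assuming UTC); a log without the id, such as john connected on 11/08/2017, would still match and simply omit user.id:
{ "user": { "firstname": "john", "id": 1234 }, "connect_date": 1510099200000 }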
Use the json filter to parse a JSON object nested after a raw text prefix:
Log:
Sep 06 09:13:38 vagrant program[123]: server.1 {"method":"GET", "status_code":200, "url":"https://app.datadoghq.com/logs/pipelines", "duration":123456}
Rule:
parsing_rule %{date("MMM dd HH:mm:ss"):timestamp} %{word:vm} %{word:app}\[%{number:logger.thread_id}\]: %{notSpace:server} %{data::json}
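Because the json filter has no extract name, the parsed JSON attributes are merged at the top level. The result would be along these lines (the timestamp epoch value assumes the year 2017 and UTC, since the log line carries no year):
{
  "timestamp": 1504689218000,
  "vm": "vagrant",
  "app": "program",
  "logger": { "thread_id": 123 },
  "server": "server.1",
  "method": "GET",
  "status_code": 200,
  "url": "https://app.datadoghq.com/logs/pipelines",
  "duration": 123456
}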
Use the regex matcher to match any substring of your log messages based on literal regex rules.
Log:
john_1a2b3c4 connected on 11/08/2017
Rule:
MyParsingRule %{regex("[a-z]*"):user.firstname}_%{regex("[a-zA-Z0-9]*"):user.id} .*
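This would extract attributes along these lines:
{
  "user": {
    "firstname": "john",
    "id": "1a2b3c4"
  }
}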
Use the array([[openCloseStr, ] separator][, subRuleOrFilter]) filter to extract a list into an array in a single attribute. The subRuleOrFilter parameter is optional and accepts these filters.
Log:
Users [John, Oliver, Marc, Tom] have been added to the database
Rule:
myParsingRule Users %{data:users:array("[]",",")} have been added to the database
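The result would be along these lines:
{ "users": ["John", "Oliver", "Marc", "Tom"] }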
Log:
Users {John-Oliver-Marc-Tom} have been added to the database
Rule:
myParsingRule Users %{data:users:array("{}","-")} have been added to the database
Rule using subRuleOrFilter:
myParsingRule Users %{data:users:array("{}","-", uppercase)} have been added to the database
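With the uppercase filter applied to each element, the result would be along these lines:
{ "users": ["JOHN", "OLIVER", "MARC", "TOM"] }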
Kubernetes components sometimes log in the glog format; this example is from the Kube Scheduler item in the Pipeline Library.
Example log line:
W0424 11:47:41.605188 1 authorization.go:47] Authorization is disabled
Parsing rule:
kube_scheduler %{regex("\\w"):level}%{date("MMdd HH:mm:ss.SSSSSS"):timestamp}\s+%{number:logger.thread_id} %{notSpace:logger.name}:%{number:logger.lineno}\] %{data:msg}
And the extracted JSON:
{
  "level": "W",
  "timestamp": 1587728861605,
  "logger": {
    "thread_id": 1,
    "name": "authorization.go",
    "lineno": 47
  },
  "msg": "Authorization is disabled"
}
The XML parser transforms XML formatted messages into JSON.
Log:
<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
</book>
Rule:
rule %{data::xml}
Result:
{
"book": {
"year": "2005",
"author": "J K. Rowling",
"category": "CHILDREN",
"title": {
"lang": "en",
"value": "Harry Potter"
}
}
}
Notes:
- If the XML contains tags that have both an attribute and a string value between the two tags, a value attribute is generated. For example, <title lang="en">Harry Potter</title> is converted to {"title": {"lang": "en", "value": "Harry Potter" } }
- Repeated tags are automatically converted to arrays. For example, <bookstore><book>Harry Potter</book><book>Everyday Italian</book></bookstore> is converted to { "bookstore": { "book": [ "Harry Potter", "Everyday Italian" ] } }
Use the CSV filter to more easily map strings to attributes when values are separated by a given character (, by default).
The CSV filter is defined as csv(headers[, separator[, quotingcharacter]]) where:
- headers: Defines the key names, separated by ,. Key names must start with an alphabetical character and can contain any alphanumerical character in addition to _.
- separator: Defines the separator used to separate the different values. Only one character is accepted. Default: ,. Note: Use tab as the separator to represent the tabulation character for TSVs.
- quotingcharacter: Defines the quoting character. Only one character is accepted. Default: "
Note: "" within a quoted value represents ".
Log:
John,Doe,120,Jefferson St.,Riverside
Rule:
myParsingRule %{data:user:csv("first_name,name,st_nb,st_name,city")}
Result:
{
"user": {
"first_name": "John",
"name": "Doe",
"st_nb": 120,
"st_name": "Jefferson St.",
"city": "Riverside"
}
}
Other examples:
Raw string | Parsing rule | Result |
---|---|---|
John,Doe | %{data::csv("firstname,name")} | {"firstname": "John", "name": "Doe"} |
"John ""Da Man""",Doe | %{data::csv("firstname,name")} | {"firstname": "John \"Da Man\"", "name": "Doe"} |
'John ''Da Man''',Doe | %{data::csv("firstname,name",",","'")} | {"firstname": "John 'Da Man'", "name": "Doe"} |
John|Doe | %{data::csv("firstname,name","|")} | {"firstname": "John", "name": "Doe"} |
value1,value2,value3 | %{data::csv("key1,key2")} | {"key1": "value1", "key2": "value2"} |
value1,value2 | %{data::csv("key1,key2,key3")} | {"key1": "value1", "key2": "value2"} |
value1,,value3 | %{data::csv("key1,key2,key3")} | {"key1": "value1", "key3": "value3"} |
value1 value2 value3 (tab-separated) | %{data::csv("key1,key2,key3","tab")} | {"key1": "value1", "key2": "value2", "key3": "value3"} |
If you have parsed what you need from a log and the remaining text is safe to discard, you can use the data matcher to do so. For the following log example, use the data matcher to discard the % at the end.
Log:
Usage: 24.3%
Rule:
MyParsingRule Usage\:\s+%{number:usage}%{data:ignore}
Result:
{
"usage": 24.3,
"ignore": "%"
}
If your logs contain ASCII control characters, they are serialized upon ingestion. These can be handled by explicitly escaping the serialized value within your grok parser.