---
title: Parsing
description: Parse your logs using the Grok Processor
breadcrumbs: Docs > Log Management > Log Configuration > Parsing
---

# Parsing

{% callout %}
##### Try Grok parsing in the Learning Center

Learn to build and modify log pipelines, manage them with the Pipeline Scanner, and standardize attribute names across processed logs for consistency.

[ENROLL NOW](https://learn.datadoghq.com/courses/log-pipelines)
{% /callout %}

## Overview{% #overview %}

Datadog automatically parses JSON-formatted logs. For other formats, you can enrich your logs with the Grok Parser. Grok syntax is easier to read and write than pure regular expressions, and the Grok Parser lets you extract attributes from semi-structured text messages.

Grok comes with reusable patterns to parse integers, IP addresses, hostnames, etc. These values must be sent into the grok parser as strings.

You can write parsing rules with the `%{MATCHER:EXTRACT:FILTER}` syntax:

- **Matcher**: A rule (possibly a reference to another token rule) that describes what to expect (number, word, notSpace, etc.).

- **Extract** (optional): An identifier representing the capture destination for the piece of text matched by the *Matcher*.

- **Filter** (optional): A post-processor of the match to transform it.

Example for a classic unstructured log:

```text
john connected on 11/08/2017
```

With the following parsing rule:

```text
MyParsingRule %{word:user} connected on %{date("MM/dd/yyyy"):date}
```

After processing, the following structured log is generated:

```json
{
  "user": "john",
  "date": 1510099200000
}
```

**Note**:

- If you have multiple parsing rules in a single Grok parser:
  - Only one can match any given log. The first one that matches, from top to bottom, is the one that does the parsing.
  - Each rule can reference parsing rules defined above itself in the list.
- You must have unique rule names within the same Grok parser.
- The rule name must contain only: alphanumeric characters, `_`, and `.`. It must start with an alphanumeric character.
- Properties with null or empty values are not displayed.
- You must define your parsing rule to match the entire log entry, as each rule applies from the beginning to the end of the log.
- Certain logs can produce large gaps of whitespace. Use `\n` and `\s+` to account for newlines and whitespace.
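
The first-match and full-match behavior described in the notes above can be illustrated with a minimal Python sketch. It is an approximation and not Datadog's implementation: the three rules are hypothetical and are written directly as regular expressions rather than Grok rules.

```python
import re

# Hypothetical rules, already written as regular expressions for this sketch.
# Order matters: the first rule that matches the whole log wins.
RULES = [
    ("login_with_id", re.compile(r"(?P<user>\w+) id:(?P<id>\d+) connected")),
    ("login", re.compile(r"(?P<user>\w+) connected")),
    ("catch_all", re.compile(r"(?P<message>.*)", re.DOTALL)),
]


def apply_rules(log):
    """Return (rule_name, attributes) for the first rule matching the entire log."""
    for name, pattern in RULES:
        # fullmatch mirrors the requirement that a rule covers the log
        # from its first character to its last.
        match = pattern.fullmatch(log)
        if match:
            return name, match.groupdict()
    return None, {}
```

In this sketch, `john connected on 11/08/2017` falls through to `catch_all`: the `login` pattern matches only the beginning of the log, so it is skipped.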

### Matcher and filter{% #matcher-and-filter %}

{% alert level="danger" %}
Grok parsing features available at *query-time* (in the [Log Explorer](https://docs.datadoghq.com/logs/explorer/calculated_fields.md)) support a limited subset of matchers (**data**, **integer**, **notSpace**, **number**, and **word**) and filters (**number** and **integer**). The following full set of matchers and filters is specific to *ingest-time* [Grok Parser](https://docs.datadoghq.com/logs/log_configuration/processors/grok_parser.md) functionality.
{% /alert %}

Here is a list of all the matchers and filters natively implemented by Datadog:

{% tab title="Matchers" %}
**Query-time and ingest-time matchers:**

The following matchers are available for both query-time parsing (Log Explorer) and ingest-time parsing (Grok Parser):

{% dl %}

{% dt %}
`word`
{% /dt %}

{% dd %}
Matches a *word*, which starts with a word boundary; contains characters from a-z, A-Z, 0-9, including the `_` (underscore) character; and ends with a word boundary. Equivalent to `\b\w+\b` in regex.
{% /dd %}

{% dt %}
`notSpace`
{% /dt %}

{% dd %}
Matches any string until the next space.
{% /dd %}

{% dt %}
`number`
{% /dt %}

{% dd %}
Matches a decimal floating point number and parses it as a double precision number.
{% /dd %}

{% dt %}
`integer`
{% /dt %}

{% dd %}
Matches an integer number and parses it as an integer number.
{% /dd %}

{% dt %}
`data`
{% /dt %}

{% dd %}
Matches any string including spaces and newlines. Equivalent to `.*` in regex. Use when none of the above patterns is appropriate.
{% /dd %}

{% /dl %}

**Ingest-time only matchers:**

The following matchers are only available for ingest-time parsing with the Grok Parser processor and cannot be used in the Log Explorer:

{% dl %}

{% dt %}
`date("pattern"[, "timezoneId"[, "localeId"]])`
{% /dt %}

{% dd %}
Matches a date with the specified pattern and parses to produce a Unix timestamp. See the date Matcher examples.
{% /dd %}

{% dt %}
`regex("pattern")`
{% /dt %}

{% dd %}
Matches a regex. Check the regex Matcher examples.
{% /dd %}

{% dt %}
`boolean("truePattern", "falsePattern")`
{% /dt %}

{% dd %}
Matches and parses a Boolean, optionally defining the true and false patterns (defaults to `true` and `false`, ignoring case).
{% /dd %}

{% dt %}
`numberStr`
{% /dt %}

{% dd %}
Matches a decimal floating point number and parses it as a string.
{% /dd %}

{% dt %}
`numberExtStr`
{% /dt %}

{% dd %}
Matches a floating point number (with scientific notation support) and parses it as a string.
{% /dd %}

{% dt %}
`numberExt`
{% /dt %}

{% dd %}
Matches a floating point number (with scientific notation support) and parses it as a double precision number.
{% /dd %}

{% dt %}
`integerStr`
{% /dt %}

{% dd %}
Matches an integer number and parses it as a string.
{% /dd %}

{% dt %}
`integerExtStr`
{% /dt %}

{% dd %}
Matches an integer number (with scientific notation support) and parses it as a string.
{% /dd %}

{% dt %}
`integerExt`
{% /dt %}

{% dd %}
Matches an integer number (with scientific notation support) and parses it as an integer number.
{% /dd %}

{% dt %}
`doubleQuotedString`
{% /dt %}

{% dd %}
Matches a double-quoted string.
{% /dd %}

{% dt %}
`singleQuotedString`
{% /dt %}

{% dd %}
Matches a single-quoted string.
{% /dd %}

{% dt %}
`quotedString`
{% /dt %}

{% dd %}
Matches a double-quoted or single-quoted string.
{% /dd %}

{% dt %}
`uuid`
{% /dt %}

{% dd %}
Matches a UUID.
{% /dd %}

{% dt %}
`mac`
{% /dt %}

{% dd %}
Matches a MAC address.
{% /dd %}

{% dt %}
`ipv4`
{% /dt %}

{% dd %}
Matches an IPv4 address.
{% /dd %}

{% dt %}
`ipv6`
{% /dt %}

{% dd %}
Matches an IPv6 address.
{% /dd %}

{% dt %}
`ip`
{% /dt %}

{% dd %}
Matches an IP address (v4 or v6).
{% /dd %}

{% dt %}
`hostname`
{% /dt %}

{% dd %}
Matches a hostname.
{% /dd %}

{% dt %}
`ipOrHost`
{% /dt %}

{% dd %}
Matches a hostname or IP.
{% /dd %}

{% dt %}
`port`
{% /dt %}

{% dd %}
Matches a port number.
{% /dd %}

{% /dl %}

{% /tab %}

{% tab title="Filters" %}
**Query-time and ingest-time filters:**

The following filters are available for both query-time parsing (Log Explorer) and ingest-time parsing (Grok Parser):

{% dl %}

{% dt %}
`number`
{% /dt %}

{% dd %}
Parses a match as a double precision number.
{% /dd %}

{% dt %}
`integer`
{% /dt %}

{% dd %}
Parses a match as an integer number.
{% /dd %}

{% /dl %}

**Ingest-time only filters:**

The following filters are only available for ingest-time parsing with the Grok Parser processor and cannot be used in the Log Explorer:

{% dl %}

{% dt %}
`boolean`
{% /dt %}

{% dd %}
Parses 'true' and 'false' strings as booleans ignoring case.
{% /dd %}

{% dt %}
`nullIf("value")`
{% /dt %}

{% dd %}
Returns null if the match is equal to the provided value.
{% /dd %}

{% dt %}
`json`
{% /dt %}

{% dd %}
Parses properly formatted JSON.
{% /dd %}

{% dt %}
`rubyhash`
{% /dt %}

{% dd %}
Parses a properly formatted Ruby hash such as `{name => "John", "job" => {"company" => "Big Company", "title" => "CTO"}}`
{% /dd %}

{% dt %}
`useragent([decodeuricomponent:true/false])`
{% /dt %}

{% dd %}
Parses a user-agent and returns a JSON object that contains the device, OS, and the browser represented by the Agent. [Check the User Agent processor](https://docs.datadoghq.com/logs/log_configuration/processors/user_agent_parser.md).
{% /dd %}

{% dt %}
`querystring`
{% /dt %}

{% dd %}
Extracts all the key-value pairs in a matching URL query string (for example, `?productId=superproduct&promotionCode=superpromo`).
{% /dd %}

{% dt %}
`decodeuricomponent`
{% /dt %}

{% dd %}
Decodes URI components. For instance, it transforms `%2Fservice%2Ftest` into `/service/test`.
{% /dd %}

{% dt %}
`lowercase`
{% /dt %}

{% dd %}
Returns the lower-cased string.
{% /dd %}

{% dt %}
`uppercase`
{% /dt %}

{% dd %}
Returns the upper-cased string.
{% /dd %}

{% dt %}
`keyvalue([separatorStr[, characterAllowList[, quotingStr[, delimiter]]]])`
{% /dt %}

{% dd %}
Extracts the key value pattern and returns a JSON object. See the key-value filter examples.
{% /dd %}

{% dt %}
`xml`
{% /dt %}

{% dd %}
Parses properly formatted XML. See the XML filter examples.
{% /dd %}

{% dt %}
`csv(headers[, separator[, quotingcharacter]])`
{% /dt %}

{% dd %}
Parses properly formatted CSV or TSV lines. See the CSV filter examples.
{% /dd %}

{% dt %}
`scale(factor)`
{% /dt %}

{% dd %}
Multiplies the expected numerical value by the provided factor.
{% /dd %}

{% dt %}
`array([[openCloseStr, ] separator][, subRuleOrFilter])`
{% /dt %}

{% dd %}
Parses a string sequence of tokens and returns it as an array. See the list to array example.
{% /dd %}

{% dt %}
`url`
{% /dt %}

{% dd %}
Parses a URL and returns all the tokenized members (domain, query params, port, etc.) in a JSON object. [More info on how to parse URLs](https://docs.datadoghq.com/logs/log_configuration/processors/url_parser.md).
{% /dd %}

{% /dl %}

{% /tab %}
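
To make the `%{MATCHER:EXTRACT:FILTER}` syntax more concrete, the following Python sketch translates a rule that uses only the query-time matcher subset into a regular expression. It is an illustration, not how Datadog implements Grok: only the `word` and `data` equivalents come from the reference above, the other regexes are assumptions, and the sample logs are made up.

```python
import re

# Approximate regex equivalents. Only `word` and `data` come from the reference
# above; the other three are assumptions made for this sketch.
MATCHERS = {
    "word": r"\b\w+\b",
    "data": r"[\s\S]*",  # like `.*`, but also spanning newlines
    "notSpace": r"\S+",
    "integer": r"[+-]?\d+",
    "number": r"[+-]?\d+(?:\.\d+)?",
}
CASTS = {"integer": int, "number": float}
TOKEN = re.compile(r"%\{(\w+)(?::([\w.]*))?(?::(\w+))?\}")


def parse(rule, log):
    """Apply one rule body (without its rule name) to a log line."""
    names, casts = [], []

    def to_regex(token):
        matcher, extract, filter_name = token.groups()
        if not extract:
            # No extract name: match the text but do not capture it.
            return f"(?:{MATCHERS[matcher]})"
        names.append(extract)
        # A filter takes precedence; otherwise the matcher decides the type.
        casts.append(CASTS.get(filter_name) or CASTS.get(matcher) or str)
        return f"({MATCHERS[matcher]})"

    pattern = TOKEN.sub(to_regex, rule)
    match = re.fullmatch(pattern, log)
    if match is None:
        return None
    return {name: cast(value) for name, cast, value in zip(names, casts, match.groups())}


print(parse("%{word:user} connected from %{notSpace:host} after %{integer:retries} retries",
            "john connected from web-01 after 3 retries"))
print(parse("latency=%{notSpace:latency:number}ms", "latency=12.5ms"))
```

The sketch keeps dotted attribute names such as `user.id` as flat keys and assumes the rule contains no capturing groups of its own.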

## Advanced settings{% #advanced-settings %}

Use the Advanced Settings section at the bottom of your Grok processor to parse a specific attribute instead of the default `message` attribute, or to define helper rules that reuse common patterns across multiple parsing rules.

### Parsing a specific text attribute{% #parsing-a-specific-text-attribute %}

Use the **Extract from** field to apply your Grok processor to a given text attribute instead of the default `message` attribute.

For example, consider a log containing a `command.line` attribute that should be parsed as key-value pairs. Extract from `command.line` to parse its contents and create structured attributes from the command data.

{% image
   source="https://docs.dd-static.net/images/logs/processing/parsing/grok_advanced_settings_extract.9914bce2a60d4cb8e07ddfa498f189f9.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/logs/processing/parsing/grok_advanced_settings_extract.9914bce2a60d4cb8e07ddfa498f189f9.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="Advanced Settings with Extract from command.line attribute example" /%}

### Using helper rules to reuse common patterns{% #using-helper-rules-to-reuse-common-patterns %}

Use the Helper Rules field to define tokens for your parsing rules. Helper rules let you reuse common Grok patterns across your parsing rules. This is useful when you have several rules in the same Grok parser that use the same tokens.

Example for a classic unstructured log:

```text
john id:12345 connected on 11/08/2017 on server XYZ in production
```

Use the following parsing rule:

```text
MyParsingRule %{user} %{connection} %{server}
```

With the following helpers:

```text
user %{word:user.name} id:%{integer:user.id}
connection connected on %{date("MM/dd/yyyy"):connect_date}
server on server %{notSpace:server.name} in %{notSpace:server.env}
```

## Examples{% #examples %}

Some examples demonstrating how to use parsers:

- Key value or logfmt
- Parsing dates
- Alternating patterns
- Optional attribute
- Nested JSON
- Regex
- List and Arrays
- Glog format
- XML
- CSV

### Key value or logfmt{% #key-value-or-logfmt %}

This is the key-value core filter: `keyvalue([separatorStr[, characterAllowList[, quotingStr[, delimiter]]]])` where:

- `separatorStr`: defines the separator between key and values. Defaults to `=`.
- `characterAllowList`: defines extra non-escaped value chars in addition to the default `\\w.\\-_@`. Used only for non-quoted values (for example, `key=@valueStr`).
- `quotingStr`: defines quotes, replacing the default quotes detection: `<>`, `""`, `''`.
- `delimiter`: defines the separator between the different key-value pairs (for example, `|` is the delimiter in `key1=value1|key2=value2`). Defaults to ` ` (normal space), `,` and `;`.

Use filters such as `keyvalue` to more easily map strings to attributes for keyvalue or logfmt formats:

**Log:**

```text
user=john connect_date=11/08/2017 id=123 action=click
```

**Rule:**

```text
rule %{data::keyvalue}
```

You don't need to specify the name of your parameters as they are already contained in the log. If you add an **extract** attribute `my_attribute` in your rule pattern you will see:

```json
{
  "my_attribute": {
    "user": "john",
    "id": 123,
    "action": "click"
  }
}
```

If `=` is not the separator between your keys and values, add the separator as a parameter in your parsing rule.

**Log:**

```text
user: john connect_date: 11/08/2017 id: 123 action: click
```

**Rule:**

```text
rule %{data::keyvalue(": ")}
```

If logs contain special characters in an attribute value, such as `/` in a url for instance, add it to the allowlist in the parsing rule:

**Log:**

```text
url=https://app.datadoghq.com/event/stream user=john
```

**Rule:**

```text
rule %{data::keyvalue("=","/:")}
```

Other examples:

| **Raw string**              | **Parsing rule**                      | **Result**                           |
| --------------------------- | ------------------------------------- | ------------------------------------ |
| key=valueStr                | `%{data::keyvalue}`                   | {"key": "valueStr"}                  |
| key=<valueStr>              | `%{data::keyvalue}`                   | {"key": "valueStr"}                  |
| "key"="valueStr"            | `%{data::keyvalue}`                   | {"key": "valueStr"}                  |
| key:valueStr                | `%{data::keyvalue(":")}`              | {"key": "valueStr"}                  |
| key:"/valueStr"             | `%{data::keyvalue(":", "/")}`         | {"key": "/valueStr"}                 |
| /key:/valueStr              | `%{data::keyvalue(":", "/")}`         | {"/key": "/valueStr"}                |
| key:={valueStr}             | `%{data::keyvalue(":=", "", "{}")}`   | {"key": "valueStr"}                  |
| key1=value1\|key2=value2     | `%{data::keyvalue("=", "", "", "\|")}` | {"key1": "value1", "key2": "value2"} |
| key1="value1"\|key2="value2" | `%{data::keyvalue("=", "", "", "\|")}` | {"key1": "value1", "key2": "value2"} |

**Multiple QuotingString example**: When `quotingStr` is defined, the default quote detection is replaced with the quoting characters you specify. The key-value filter always matches values without any quoting characters, regardless of what is specified in `quotingStr`. When quoting characters are used, the `characterAllowList` is ignored because everything between the quoting characters is extracted.

**Log:**

```text
key1:=valueStr key2:=</valueStr2> key3:="valueStr3"
```

**Rule:**

```text
rule %{data::keyvalue(":=","","<>")}
```

**Result:**

```json
{"key1": "valueStr", "key2": "/valueStr2"}
```

**Note**:

- Empty values (`key=`) or `null` values (`key=null`) are not displayed in the output JSON.
- If you define a *keyvalue* filter on a `data` object, and this filter is not matched, then an empty JSON `{}` is returned (for example, input: `key:=valueStr`, parsing rule: `rule_test %{data::keyvalue("=")}`, output: `{}`).
- Defining `""` as `quotingStr` keeps the default configuration for quoting.
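
The separator and allowlist parameters can be approximated with a short Python sketch. This is a simplified model under stated assumptions, not the actual filter: it ignores quoting and custom delimiters, and the `keyvalue` function and its parameter names are hypothetical.

```python
import re

DEFAULT_VALUE_CHARS = r"\w.\-@"  # default allowlist; \w already covers "_"


def keyvalue(text, separator="=", extra_chars=""):
    """Extract unquoted key-value pairs delimited by space, ',' or ';'."""
    chars = DEFAULT_VALUE_CHARS + re.escape(extra_chars)
    pattern = re.compile(
        rf"(?<![^\s,;])([\w.\-@]+){re.escape(separator)}([{chars}]+)(?=[\s,;]|$)"
    )
    result = {}
    for key, value in pattern.findall(text):
        # Cast plain integers, as in the `id` attribute of the first example.
        result[key] = int(value) if re.fullmatch(r"\d+", value) else value
    return result


print(keyvalue("user=john connect_date=11/08/2017 id=123 action=click"))
print(keyvalue("url=https://app.datadoghq.com/event/stream user=john", "=", "/:"))
```

In this sketch, `connect_date` is skipped in the first call because `/` is not in the default character list, while the second call accepts the URL once `/` and `:` are added.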

### Parsing dates{% #parsing-dates %}

The date matcher transforms your timestamp into the epoch format (in **milliseconds**).

| **Raw string**                | **Parsing rule**                                          | **Result**              |
| ----------------------------- | --------------------------------------------------------- | ----------------------- |
| 14:20:15                      | `%{date("HH:mm:ss"):date}`                                | {"date": 51615000}      |
| 02:20:15 PM                   | `%{date("hh:mm:ss a"):date}`                              | {"date": 51615000}      |
| 11/10/2014                    | `%{date("dd/MM/yyyy"):date}`                              | {"date": 1412978400000} |
| Thu Jun 16 08:29:03 2016      | `%{date("EEE MMM dd HH:mm:ss yyyy"):date}`                | {"date": 1466065743000} |
| Tue Nov 1 08:29:03 2016       | `%{date("EEE MMM d HH:mm:ss yyyy"):date}`                 | {"date": 1477988943000} |
| 06/Mar/2013:01:36:30 +0900    | `%{date("dd/MMM/yyyy:HH:mm:ss Z"):date}`                  | {"date": 1362501390000} |
| 2016-11-29T16:21:36.431+0000  | `%{date("yyyy-MM-dd'T'HH:mm:ss.SSSZ"):date}`              | {"date": 1480436496431} |
| 2016-11-29T16:21:36.431+00:00 | `%{date("yyyy-MM-dd'T'HH:mm:ss.SSSZZ"):date}`             | {"date": 1480436496431} |
| 06/Feb/2009:12:14:14.655      | `%{date("dd/MMM/yyyy:HH:mm:ss.SSS"):date}`                | {"date": 1233922454655} |
| 2007-08-31 19:22:22.427 ADT   | `%{date("yyyy-MM-dd HH:mm:ss.SSS z"):date}`               | {"date": 1188598942427} |
| Thu Jun 16 08:29:03 2016¹     | `%{date("EEE MMM dd HH:mm:ss yyyy","Europe/Paris"):date}` | {"date": 1466058543000} |
| Thu Jun 16 08:29:03 2016¹     | `%{date("EEE MMM dd HH:mm:ss yyyy","UTC+5"):date}`        | {"date": 1466047743000} |
| Thu Jun 16 08:29:03 2016¹     | `%{date("EEE MMM dd HH:mm:ss yyyy","+3"):date}`           | {"date": 1466054943000} |

¹ Use the `timezone` parameter if you perform your own localizations and your timestamps are *not* in UTC. The supported timezone formats are:

- `GMT`, `UTC`, `UT` or `Z`
- `+hh:mm`, `-hh:mm`, `+hhmm`, `-hhmm`. The maximum supported range is from +18:00 to -18:00 inclusive.
- Timezones starting with `UTC+`, `UTC-`, `GMT+`, `GMT-`, `UT+` or `UT-`. The maximum supported range is from +18:00 to -18:00 inclusive.
- Timezone IDs pulled from the TZ database. For more information, see [TZ database names](https://en.wikipedia.org/wiki/List_of_tz_database_time_zones).

**Note**: Parsing a date **doesn't** set its value as the official date of the log. To do that, use the [Log Date Remapper](https://docs.datadoghq.com/logs/log_configuration/processors/log_date_remapper.md) in a subsequent processor.
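
The effect of the timezone parameter can be checked with a small Python sketch that converts a timestamp to epoch milliseconds for a fixed UTC offset. The `to_epoch_ms` helper is hypothetical, uses Python `strptime` directives instead of the date patterns shown above, and supports only whole-hour offsets.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(raw, fmt, utc_offset_hours=0):
    """Interpret `raw` as local time at the given UTC offset; return epoch ms."""
    naive = datetime.strptime(raw, fmt)
    aware = naive.replace(tzinfo=timezone(timedelta(hours=utc_offset_hours)))
    # Integer division of timedeltas avoids floating point rounding.
    return (aware - EPOCH) // timedelta(milliseconds=1)


FMT = "%a %b %d %H:%M:%S %Y"  # counterpart of "EEE MMM dd HH:mm:ss yyyy"
print(to_epoch_ms("Thu Jun 16 08:29:03 2016", FMT))     # 1466065743000
print(to_epoch_ms("Thu Jun 16 08:29:03 2016", FMT, 3))  # 1466054943000
print(to_epoch_ms("Thu Jun 16 08:29:03 2016", FMT, 5))  # 1466047743000
```

A positive offset means the local clock is ahead of UTC, so the resulting epoch value is smaller than the one obtained when the same text is read as UTC.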

### Alternating pattern{% #alternating-pattern %}

If you have logs with two possible formats that differ in only one attribute, write a single rule that uses alternation with `(<REGEX_1>|<REGEX_2>)`. This is equivalent to a Boolean OR.

**Log**:

```text
john connected on 11/08/2017
12345 connected on 11/08/2017
```

**Rule**: Note that "id" is an integer and not a string.

```text
MyParsingRule (%{integer:user.id}|%{word:user.firstname}) connected on %{date("MM/dd/yyyy"):connect_date}
```

**Results**: `%{integer:user.id}`

```json
{
  "user": {
    "id": 12345
  },
  "connect_date": 1510099200000
}
```

`%{word:user.firstname}`

```json
{
  "user": {
    "firstname": "john"
  },
  "connect_date": 1510099200000
}
```
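
The same alternation can be reproduced with a plain regular expression, which shows why the order of the branches matters. The Python sketch below is an approximation: it keeps the date as a string instead of converting it, and the `parse_login` function is hypothetical.

```python
import re

# The integer branch comes first so that purely numeric values become an id.
PATTERN = re.compile(
    r"(?:(?P<id>\d+)|(?P<firstname>\w+)) connected on (?P<date>\S+)"
)


def parse_login(log):
    match = PATTERN.fullmatch(log)
    if match is None:
        return None
    user = {}
    if match.group("id") is not None:
        user["id"] = int(match.group("id"))
    if match.group("firstname") is not None:
        user["firstname"] = match.group("firstname")
    return {"user": user, "connect_date": match.group("date")}


print(parse_login("12345 connected on 11/08/2017"))
print(parse_login("john connected on 11/08/2017"))
```

Only one branch of the alternation captures a value for a given log, so each result contains either `id` or `firstname`, never both.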

### Optional attribute{% #optional-attribute %}

Some logs contain values that only appear part of the time. In this case, make attribute extraction optional with `()?`.

**Log**:

```text
john 1234 connected on 11/08/2017
john connected on 11/08/2017
```

**Rule**:

```text
MyParsingRule %{word:user.firstname} (%{integer:user.id} )?connected on %{date("MM/dd/yyyy"):connect_date}
```

**Note**: A rule will not match if you include a space after the first word in the optional section.

**Result**: `(%{integer:user.id} )?`

```json
{
  "user": {
    "firstname": "john",
    "id": 1234
  },
  "connect_date": 1510099200000
}
```

`%{word:user.firstname} (%{integer:user.id} )?`

```json
{
  "user": {
    "firstname": "john"
  },
  "connect_date": 1510099200000
}
```
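
The placement of the space inside the optional group is easy to get wrong. The following Python sketch compares a regex equivalent of the rule above with a hypothetical variant that leaves the space outside the group; both patterns are illustrations and keep the date as a string.

```python
import re

# Space inside the optional group, as in the rule above.
GOOD = re.compile(r"(?P<firstname>\w+) (?:(?P<id>\d+) )?connected on (?P<date>\S+)")
# Space left outside the group: two spaces are then required when the id is absent.
BROKEN = re.compile(r"(?P<firstname>\w+) (?:(?P<id>\d+))? connected on (?P<date>\S+)")


def extract(pattern, log):
    match = pattern.fullmatch(log)
    if match is None:
        return None
    # Drop groups that did not take part in the match.
    return {key: value for key, value in match.groupdict().items() if value is not None}


print(extract(GOOD, "john 1234 connected on 11/08/2017"))
print(extract(GOOD, "john connected on 11/08/2017"))
print(extract(BROKEN, "john connected on 11/08/2017"))  # None
```

Both patterns accept the log that contains an id. Only the first one also accepts the log without it.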

### Nested JSON{% #nested-json %}

Use the `json` filter to parse a JSON object nested after a raw text prefix:

**Log**:

```text
Sep 06 09:13:38 vagrant program[123]: server.1 {"method":"GET", "status_code":200, "url":"https://app.datadoghq.com/logs/pipelines", "duration":123456}
```

**Rule**:

```text
parsing_rule %{date("MMM dd HH:mm:ss"):timestamp} %{word:vm} %{word:app}\[%{number:logger.thread_id}\]: %{notSpace:server} %{data::json}
```

**Result**:

```json
{
  "timestamp": 1567761218000,
  "vm": "vagrant",
  "app": "program",
  "logger": {
    "thread_id": 123
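  },
  "server": "server.1",
  "method": "GET",
  "status_code": 200,
  "url": "https://app.datadoghq.com/logs/pipelines",
  "duration": 123456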
}
```
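
Conceptually, the rule splits the log into a raw text prefix and a JSON payload, then parses the payload. The Python sketch below shows that split using the standard `json` module; it is a simplified illustration that does not parse the prefix into attributes, and `split_and_parse` is a hypothetical helper.

```python
import json
import re

# Everything before the first "{" is the prefix; the rest is the JSON payload.
SPLIT = re.compile(r"(?P<prefix>[^{]*?)\s*(?P<payload>\{.*\})", re.DOTALL)


def split_and_parse(log):
    match = SPLIT.fullmatch(log)
    if match is None:
        return None
    return match.group("prefix"), json.loads(match.group("payload"))


log = (
    'Sep 06 09:13:38 vagrant program[123]: server.1 '
    '{"method":"GET", "status_code":200, '
    '"url":"https://app.datadoghq.com/logs/pipelines", "duration":123456}'
)
prefix, payload = split_and_parse(log)
print(prefix)
print(payload["status_code"], payload["duration"])
```

Numeric JSON values such as `status_code` keep their type after parsing, so no extra filter is needed to use them as numbers.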

### Regex{% #regex %}

**Log**:

```text
john_1a2b3c4 connected on 11/08/2017
```

**Rule**:

```text
MyParsingRule %{regex("[a-z]*"):user.firstname}_%{regex("[a-zA-Z0-9]*"):user.id} .*
```

**Result**:

```json
{
  "user": {
    "firstname": "john",
    "id": "1a2b3c4"
  }
}
```

### List to array{% #list-to-array %}

Use the `array([[openCloseStr, ] separator][, subRuleOrFilter])` filter to extract a list into an array in a single attribute. The `subRuleOrFilter` is optional and accepts these [filters](https://docs.datadoghq.com/logs/log_configuration/parsing.md?tab=filters&tabs=filters#matcher-and-filter).

**Log**:

```text
Users [John, Oliver, Marc, Tom] have been added to the database
```

**Rule**:

```text
myParsingRule Users %{data:users:array("[]",",")} have been added to the database
```

**Result**:

```json
{
  "users": [
    "John",
    " Oliver",
    " Marc",
    " Tom"
  ]
}
```

**Log**:

```text
Users {John-Oliver-Marc-Tom} have been added to the database
```

**Rule**:

```text
myParsingRule Users %{data:users:array("{}","-")} have been added to the database
```

**Rule using `subRuleOrFilter`**:

```text
myParsingRule Users %{data:users:array("{}","-", uppercase)} have been added to the database
```
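
The three parameters of the filter can be pictured with a short Python sketch: strip the enclosing characters, split on the separator, and optionally apply a function to each item. The `to_array` helper is hypothetical and only approximates the filter.

```python
def to_array(text, open_close="[]", separator=",", sub_filter=None):
    """Split a delimited list into items, optionally transforming each one."""
    opener, closer = open_close[0], open_close[1]
    if not (text.startswith(opener) and text.endswith(closer)):
        return None
    items = text[1:-1].split(separator)
    return [sub_filter(item) for item in items] if sub_filter else items


print(to_array("[John, Oliver, Marc, Tom]"))
print(to_array("{John-Oliver-Marc-Tom}", "{}", "-", str.upper))
```

As in the first result above, splitting on `,` alone keeps the space that follows each comma. In this sketch, splitting on `", "` instead would remove it.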

### Glog format{% #glog-format %}

Kubernetes components sometimes log in the `glog` format; this example is from the Kube Scheduler item in the Pipeline Library.

Example log line:

```text
W0424 11:47:41.605188       1 authorization.go:47] Authorization is disabled
```

Parsing rule:

```text
kube_scheduler %{regex("\\w"):level}%{date("MMdd HH:mm:ss.SSSSSS"):timestamp}\s+%{number:logger.thread_id} %{notSpace:logger.name}:%{number:logger.lineno}\] %{data:msg}
```

And extracted JSON:

```json
{
  "level": "W",
  "timestamp": 1587728861605,
  "logger": {
    "thread_id": 1,
    "name": "authorization.go",
    "lineno": 47
  },
  "msg": "Authorization is disabled"
}
```

### Parsing XML{% #parsing-xml %}

The XML parser transforms XML formatted messages into JSON.

**Log:**

```text
<book category="CHILDREN">
  <title lang="en">Harry Potter</title>
  <author>J K. Rowling</author>
  <year>2005</year>
</book>
```

**Rule:**

```text
rule %{data::xml}
```

**Result:**

```json
{
  "book": {
    "year": "2005",
    "author": "J K. Rowling",
    "category": "CHILDREN",
    "title": {
      "lang": "en",
      "value": "Harry Potter"
    }
  }
}
```

**Notes**:

- If the XML contains tags that have both an attribute and a string value between the two tags, a `value` attribute is generated. For example: `<title lang="en">Harry Potter</title>` is converted to `{"title": {"lang": "en", "value": "Harry Potter" } }`
- Repeated tags are automatically converted to arrays. For example: `<bookstore><book>Harry Potter</book><book>Everyday Italian</book></bookstore>` is converted to `{ "bookstore": { "book": [ "Harry Potter", "Everyday Italian" ] } }`
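
The two conversion rules in the notes above can be reproduced with Python's standard `xml.etree.ElementTree` module. This is a minimal sketch of the mapping, assuming all values stay strings; it is not the filter's actual implementation, and `xml_to_dict` is a hypothetical name.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem):
    """Convert one XML element to a string or a dict, following the notes above."""
    children = list(elem)
    text = (elem.text or "").strip()
    if not children and not elem.attrib:
        return text  # plain tag: just its text
    node = dict(elem.attrib)
    for child in children:
        value = element_to_value(child)
        if child.tag in node:
            # Repeated tag: collect the values in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    if text:
        node["value"] = text  # text next to attributes goes under "value"
    return node


def xml_to_dict(xml_text):
    root = ET.fromstring(xml_text)
    return {root.tag: element_to_value(root)}


print(xml_to_dict('<title lang="en">Harry Potter</title>'))
print(xml_to_dict("<bookstore><book>Harry Potter</book><book>Everyday Italian</book></bookstore>"))
```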

### Parsing CSV{% #parsing-csv %}

Use the `csv` filter to more easily map strings to attributes when separated by a given character (`,` by default).

The CSV filter is defined as `csv(headers[, separator[, quotingcharacter]])` where:

- `headers`: Defines the key names, separated by `,`. Key names must start with an alphabetical character and can contain any alphanumeric character in addition to `_`.
- `separator`: Defines the separator between the different values. Only one character is accepted. Default: `,`. **Note**: Use `tab` for the `separator` to represent the tabulation character for TSVs.
- `quotingcharacter`: Defines the quoting character. Only one character is accepted. Default: `"`

**Note**:

- Values containing a separator character must be quoted.
- Quoted values containing a quoting character must escape it with another quoting character. For example, `""` within a quoted value represents `"`.
- If the log doesn't contain the same number of values as the number of keys in the header, the CSV parser matches the first ones.
- Integers and doubles are automatically cast if possible.

**Log**:

```text
John,Doe,120,Jefferson St.,Riverside
```

**Rule**:

```text
myParsingRule %{data:user:csv("first_name,name,st_nb,st_name,city")}
```

**Result:**

```json
{
  "user": {
    "first_name": "John",
    "name": "Doe",
    "st_nb": 120,
    "st_name": "Jefferson St.",
    "city": "Riverside"
  }
}
```

Other examples:

| **Raw string**                     | **Parsing rule**                         | **Result**                                            |
| ---------------------------------- | ---------------------------------------- | ----------------------------------------------------- |
| `John,Doe`                         | `%{data::csv("firstname,name")}`         | {"firstname": "John", "name":"Doe"}                   |
| `"John ""Da Man""",Doe`            | `%{data::csv("firstname,name")}`         | {"firstname": "John \"Da Man\"", "name":"Doe"}        |
| `'John ''Da Man''',Doe`            | `%{data::csv("firstname,name",",","'")}` | {"firstname": "John 'Da Man'", "name":"Doe"}          |
| `John\|Doe`                        | `%{data::csv("firstname,name","\|")}`    | {"firstname": "John", "name":"Doe"}                   |
| `value1,value2,value3`             | `%{data::csv("key1,key2")}`              | {"key1": "value1", "key2":"value2"}                   |
| `value1,value2`                    | `%{data::csv("key1,key2,key3")}`         | {"key1": "value1", "key2":"value2"}                   |
| `value1,,value3`                   | `%{data::csv("key1,key2,key3")}`         | {"key1": "value1", "key3":"value3"}                   |
| `value1    value2    value3` (TSV) | `%{data::csv("key1,key2,key3","tab")}`   | {"key1": "value1", "key2": "value2", "key3":"value3"} |
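
The header mapping, quoting, and casting rules described in this section can be approximated with Python's standard `csv` module. The sketch below is an illustration under stated assumptions, not the filter itself: `parse_csv` and `cast` are hypothetical helpers, and Python's own number parsing decides what counts as an integer or a double.

```python
import csv
import io


def cast(value):
    """Return an int or float when the text parses as one, else the text."""
    for convert in (int, float):
        try:
            return convert(value)
        except ValueError:
            pass
    return value


def parse_csv(line, headers, separator=",", quote='"'):
    if separator == "tab":
        separator = "\t"
    row = next(csv.reader(io.StringIO(line), delimiter=separator, quotechar=quote))
    keys = headers.split(",")
    # zip stops at the shorter sequence, so extra values or keys are ignored;
    # empty values are left out of the result.
    return {key: cast(value) for key, value in zip(keys, row) if value != ""}


print(parse_csv("John,Doe,120,Jefferson St.,Riverside", "first_name,name,st_nb,st_name,city"))
print(parse_csv('"John ""Da Man""",Doe', "firstname,name"))
print(parse_csv("value1,,value3", "key1,key2,key3"))
```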

### Use data matcher to discard unneeded text{% #use-data-matcher-to-discard-unneeded-text %}

If you have parsed everything you need from a log and know that the remaining text is safe to discard, use the `data` matcher to match the remainder so that the rule still covers the entire log. In the following example, the `data` matcher absorbs the trailing `%`.

**Log**:

```text
Usage: 24.3%
```

**Rule**:

```text
MyParsingRule Usage\:\s+%{number:usage}%{data:ignore}
```

**Result**:

```json
{
  "usage": 24.3,
  "ignore": "%"
}
```

### ASCII control characters{% #ascii-control-characters %}

If your logs contain ASCII control characters, they are serialized upon ingestion. These can be handled by explicitly escaping the serialized value within your grok parser.

## Further Reading{% #further-reading %}

- [Learn how to build and modify log pipelines](https://learn.datadoghq.com/courses/log-pipelines)
- [Debugging Log Pipelines](https://learn.datadoghq.com/courses/debugging-log-pipelines)
- [Learn how to process your logs](https://docs.datadoghq.com/logs/log_configuration/processors.md)
- [Datadog Tips & Tricks: Use Grok parsing to extract fields from logs](https://www.youtube.com/watch?v=AwW70AUmaaQ&list=PLdh-RwQzDsaM9Sq_fi-yXuzhmE7nOlqLE&index=3)
- [How to investigate a log parsing issue?](https://docs.datadoghq.com/logs/faq/how-to-investigate-a-log-parsing-issue.md)
- [Log Parsing - Best Practices](https://docs.datadoghq.com/logs/guide/log-parsing-best-practice.md)
- [Control the volume of logs indexed by Datadog](https://docs.datadoghq.com/logs/logging_without_limits.md)