This processor parses Extensible Markup Language (XML) so the data can be processed and sent to different destinations. XML is a markup format used to store and transport structured data. It is organized in a tree-like structure to represent nested information and uses tags and attributes to define the data. For example, this is XML data using only tags (<recipe>, <type>, and <name>) and no attributes:
<recipe>
<type>pasta</type>
<name>Carbonara</name>
</recipe>
This is an XML example where the tag recipe has the attribute type:
<recipe type="pasta">
<name>Carbonara</name>
</recipe>
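To make the difference between tags and attributes concrete, here is a minimal sketch that reads both forms with Python's standard xml.etree.ElementTree module. It only illustrates the XML structure and is not how the processor itself is implemented. The variable names are chosen for this example.

```python
import xml.etree.ElementTree as ET

# Tags only: the type is a child element of <recipe>.
tags_only = ET.fromstring(
    "<recipe><type>pasta</type><name>Carbonara</name></recipe>"
)

# Attribute form: the type is stored on the <recipe> tag itself.
with_attribute = ET.fromstring(
    '<recipe type="pasta"><name>Carbonara</name></recipe>'
)

# Child elements are reached with find(); attributes live in .attrib.
print(tags_only.find("type").text)
print(with_attribute.attrib)
print(with_attribute.find("name").text)
```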
To set up this processor:
- Define a filter query. Only logs that match the specified filter query are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
- Enter the path to the log field on which you want to parse XML. Use the path notation <OUTER_FIELD>.<INNER_FIELD> to match subfields. See the Path notation example below.
- Optionally, in the Enter text key field, input the key name to use for the text node when XML attributes are appended. See the Text key example. If the field is left empty, value is used as the key name.
- Optionally, select Always use text key if you want to store text inside an object using the text key even when no attributes exist.
- Optionally, toggle Include XML attributes on if you want to include XML attributes. You can then choose to add the attribute prefix you want to use. See the Attribute prefix example. If the field is left empty, the original attribute key is used.
- Optionally, select if you want to convert data types into numbers, Booleans, or nulls.
  - If Numbers is selected, numbers are parsed as integers and floats.
  - If Booleans is selected, true and false are parsed as Booleans.
  - If Nulls is selected, the string null is parsed as null.
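The data type conversion options can be pictured with a small sketch. The helper convert_value and its parameters are hypothetical names for illustration. The exact strings the processor accepts as numbers are not specified here, so the two regular expressions are an assumption.

```python
import re

# Assumption: plain decimal integers and simple decimal floats only.
_INT_RE = re.compile(r"[+-]?\d+")
_FLOAT_RE = re.compile(r"[+-]?\d+\.\d+")


def convert_value(text, numbers=False, booleans=False, nulls=False):
    """Convert a text node according to the enabled options (hypothetical helper)."""
    if nulls and text == "null":
        return None
    if booleans and text in ("true", "false"):
        return text == "true"
    if numbers:
        if _INT_RE.fullmatch(text):
            return int(text)
        if _FLOAT_RE.fullmatch(text):
            return float(text)
    # With no matching option enabled, the value stays a string.
    return text


print(convert_value("42", numbers=True))
print(convert_value("true", booleans=True))
print(convert_value("null", nulls=True))
print(convert_value("42"))
```

With every option left off, each value is kept as the original string.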
Path notation example
For the following message structure, use outer_key.a.double_inner_key to refer to the key with the value double_inner_value.
{
"outer_key": {
"inner_key": "inner_value",
"a": {
"double_inner_key": "double_inner_value",
"b": "b value"
},
"c": "c value"
},
"d": "d value"
}
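As a rough illustration of how a dot-separated path walks this structure, the sketch below resolves paths against the same message. The function get_by_path is a hypothetical helper, not part of the product. Note that in the structure shown, double_inner_key is nested under the key a.

```python
def get_by_path(event, path):
    """Follow a dot-separated path through nested dicts.

    Returns None if any segment of the path is missing (hypothetical helper).
    """
    current = event
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


message = {
    "outer_key": {
        "inner_key": "inner_value",
        "a": {
            "double_inner_key": "double_inner_value",
            "b": "b value",
        },
        "c": "c value",
    },
    "d": "d value",
}

print(get_by_path(message, "outer_key.inner_key"))
print(get_by_path(message, "outer_key.a.double_inner_key"))
print(get_by_path(message, "d"))
```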
Always use text key example
If Always use text key is selected, the text key is the default (value), and you have the following XML:
<recipe type="pasta">
<name>Carbonara</name>
</recipe>
The XML is converted to:
{
"recipe": {
"type": "pasta",
"value": "Carbonara"
}
}
Text key example
If the text key is set to text and you have the following XML:
<recipe type="pasta">
<name>Carbonara</name>
</recipe>
The XML is converted to:
{
"recipe": {
"type": "pasta",
"text": "Carbonara"
}
}
Attribute prefix example
If you enable Include XML attributes, the attribute prefix you specify is prepended to each XML attribute key. For example, if the attribute prefix is @ and you have the following XML:
<recipe type="pasta">Carbonara</recipe>
It is converted to the following JSON:
{
"recipe": {
"@type": "pasta",
"<text key>": "Carbonara"
}
}
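The interaction between the text key, the Always use text key option, and the attribute prefix can be approximated in a few lines. This is a minimal sketch built on Python's xml.etree.ElementTree, not the processor's actual implementation. The function names, parameter names, and the handling of nested or repeated child elements are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def element_to_value(elem, *, text_key="value", attr_prefix="",
                     include_attrs=True, always_text_key=False):
    """Convert one XML element to a JSON-like value (illustrative sketch)."""
    opts = dict(text_key=text_key, attr_prefix=attr_prefix,
                include_attrs=include_attrs, always_text_key=always_text_key)
    result = {}
    if include_attrs:
        # The prefix is prepended to every attribute key.
        for name, val in elem.attrib.items():
            result[attr_prefix + name] = val
    # Assumption: a repeated child tag overwrites the earlier one.
    for child in elem:
        result[child.tag] = element_to_value(child, **opts)
    text = (elem.text or "").strip()
    # Without attributes or children, plain text stays a bare string
    # unless the text key is always used.
    if not result and not always_text_key:
        return text
    if text:
        result[text_key] = text
    return result


def xml_to_dict(xml_string, **options):
    """Parse an XML string and wrap the result under the root tag name."""
    root = ET.fromstring(xml_string)
    return {root.tag: element_to_value(root, **options)}


xml = '<recipe type="pasta">Carbonara</recipe>'
print(xml_to_dict(xml, attr_prefix="@", text_key="text"))
print(xml_to_dict("<name>Carbonara</name>", always_text_key=True))
```

In this sketch, an element with no attributes and no children collapses to a plain string unless always_text_key is set, which mirrors the description of the Always use text key option.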
Filter query syntax
Each processor has a filter query field. A processor only processes logs that match its filter query. For all processors except the filter processor, logs that do not match the query are sent to the next step of the pipeline without being processed. For the filter processor, logs that do not match the query are dropped.
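The difference between the two behaviors can be sketched as follows. The functions run_processor and run_filter_processor, the matches predicate, and the sample logs are hypothetical and exist only to illustrate the pass-through versus drop semantics described above.

```python
def run_processor(logs, matches, transform):
    """Transform matching logs; pass non-matching logs through untouched."""
    return [transform(log) if matches(log) else log for log in logs]


def run_filter_processor(logs, matches):
    """Keep matching logs; drop everything else."""
    return [log for log in logs if matches(log)]


logs = [
    {"status": "info", "message": "<a>1</a>"},
    {"status": "error", "message": "<a>2</a>"},
]


def is_error(log):
    return log["status"] == "error"


def mark_processed(log):
    # Stand-in for real processing such as XML parsing.
    return {**log, "processed": True}


# Both logs continue down the pipeline; only the error log is modified.
print(run_processor(logs, is_error, mark_processed))
# Only the error log continues; the info log is dropped.
print(run_filter_processor(logs, is_error))
```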
For any attribute, tag, or key:value pair that is not a reserved attribute, your query must start with @. Conversely, to filter reserved attributes, you do not need to prepend @ to your filter query.
For example, to filter out and drop status:info logs, your filter can be set as NOT (status:info). To filter out and drop system-status:info logs, your filter must be set as NOT (@system-status:info).
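The @ rule can be expressed as a small helper that builds query terms. This is only a sketch: the functions query_term and negate are hypothetical, and the RESERVED set is an assumption that lists just the attributes used without @ in this section (status, service, and host), not the full list of reserved attributes.

```python
# Assumption: an illustrative subset of reserved attributes.
RESERVED = {"status", "service", "host"}


def query_term(attribute, value, reserved=RESERVED):
    """Build a key:value term, adding @ for non-reserved attributes."""
    prefix = "" if attribute in reserved else "@"
    return f"{prefix}{attribute}:{value}"


def negate(term):
    """Wrap a term in NOT (...) to exclude matching logs."""
    return f"NOT ({term})"


print(negate(query_term("status", "info")))
print(negate(query_term("system-status", "info")))
```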
Filter query examples:
- NOT (status:debug): This filters for only logs that do not have the status DEBUG.
- status:ok service:flask-web-app: This filters for all logs with the status OK from your flask-web-app service.
  - This query can also be written as: status:ok AND service:flask-web-app.
- host:COMP-A9JNGYK OR host:COMP-J58KAS: This filter query only matches logs from the labeled hosts.
- @user.status:inactive: This filters for logs with the status inactive nested under the user attribute.
Queries run in the Observability Pipelines Worker are case sensitive. Learn more about writing filter queries in Datadog’s Log Search Syntax.