This processor splits nested arrays into distinct events so that you can query, filter, alert, and visualize the data within an array. The arrays must already be parsed: for example, the processor can process [item_1, item_2], but cannot process the string "[item_1, item_2]". The items in the array can be JSON objects, strings, integers, floats, or Booleans. All unmodified fields are added to each child event. For example, if you send the following event to the Observability Pipelines Worker:

{
    "host": "my-host",
    "env": "prod",
    "batched_items": [item_1, item_2]
}

Use the Split Array processor to send each item in batched_items as a separate event:

{
    "host": "my-host",
    "env": "prod",
    "batched_items": item_1
}
{
    "host": "my-host",
    "env": "prod",
    "batched_items": item_2
}

See the split array example for a more detailed example.

To set up this processor:

Click Manage arrays to split to add an array to split or to edit an existing one. This opens a side panel.

  • If you have not created any arrays yet, enter the array parameters as described in the Add a new array section below.
  • If you have already created arrays, click an array’s row in the table to edit or delete it. Use the search bar to find a specific array. Click Add Array to Split to add a new array.
Add a new array
  1. Define a filter query. Only logs that match the specified filter query are processed. All logs, regardless of whether they match the filter query, are sent to the next step in the pipeline.
  2. Enter the path to the array field. Use the path notation <OUTER_FIELD>.<INNER_FIELD> to match subfields. See the Path notation example below.
  3. Click Save.
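
For example, to split the batched_items array from the introduction only for production logs, you could enter env:prod as the filter query and batched_items as the path to the array field. These values are illustrative; use the filter query and path that match your own events.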
Split array example

This is an example event:

{
    "ddtags": ["tag1", "tag2"],
    "host": "my-host",
    "env": "prod",
    "message": {
        "isMessage": true,
        "myfield": {
            "timestamp": 14500000,
            "firstarray": ["one", 2]
        }
    },
    "secondarray": [
        {
            "some": "json",
            "Object": "works"
        },
        44
    ]
}

If the processor splits the arrays "message.myfield.firstarray" and "secondarray", it outputs child events that are identical to the parent event, except that "message.myfield.firstarray" and "secondarray" are each replaced with a single item from their respective original arrays. Each child event is a unique combination of items from the two arrays, so four child events (2 items * 2 items = 4 combinations) are created in this example.

{
    "ddtags": ["tag1", "tag2"],
    "host": "my-host",
    "env": "prod",
    "message": {
        "isMessage": true,
        "myfield": {"timestamp": 14500000, "firstarray": "one"}
    },
    "secondarray": {
        "some": "json",
        "Object": "works"
    }
}
{
    "ddtags": ["tag1", "tag2"],
    "host": "my-host",
    "env": "prod",
    "message": {
        "isMessage": true,
        "myfield": {"timestamp": 14500000, "firstarray": "one"}
    },
    "secondarray": 44
}
{
    "ddtags": ["tag1", "tag2"],
    "host": "my-host",
    "env": "prod",
    "message": {
        "isMessage": true,
        "myfield": {"timestamp": 14500000, "firstarray": 2}
    },
    "secondarray": {
        "some": "json",
        "Object": "works"
    }
}
{
    "ddtags": ["tag1", "tag2"],
    "host": "my-host",
    "env": "prod",
    "message": {
        "isMessage": true,
        "myfield": {"timestamp": 14500000, "firstarray": 2}
    },
    "secondarray": 44
}
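
To make the combination logic concrete, the following is a minimal Python sketch of how the child events could be derived, assuming the parent event is a plain dictionary and the array paths have already been validated. The function names are illustrative and do not reflect the Observability Pipelines Worker's actual implementation.

import copy
import itertools

def split_event(event, array_paths):
    # Resolve a dotted path such as "message.myfield.firstarray" to its value.
    def get_value(node, path):
        for key in path.split("."):
            node = node[key]
        return node

    # Overwrite the value at a dotted path inside a child event.
    def set_value(node, path, value):
        keys = path.split(".")
        for key in keys[:-1]:
            node = node[key]
        node[keys[-1]] = value

    arrays = [get_value(event, path) for path in array_paths]
    # One child event per combination of items: 2 items x 2 items = 4 children here.
    for combination in itertools.product(*arrays):
        child = copy.deepcopy(event)
        for path, item in zip(array_paths, combination):
            set_value(child, path, item)
        yield child

parent = {
    "host": "my-host",
    "message": {"myfield": {"firstarray": ["one", 2]}},
    "secondarray": [{"some": "json", "Object": "works"}, 44],
}

children = list(split_event(parent, ["message.myfield.firstarray", "secondarray"]))
print(len(children))  # 4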
Path notation example

For the following message structure, use outer_key.a.double_inner_key to refer to the key with the value double_inner_value.

{
    "outer_key": {
        "inner_key": "inner_value",
        "a": {
            "double_inner_key": "double_inner_value",
            "b": "b value"
        },
        "c": "c value"
    },
    "d": "d value"
}
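
As a quick illustration of how a dotted path walks the nested structure, here is a minimal Python sketch, assuming the event has already been parsed into a dictionary. The helper name is hypothetical.

def resolve_path(event, path):
    # Follow each segment of the dotted path into the nested dictionaries.
    node = event
    for key in path.split("."):
        node = node[key]
    return node

event = {
    "outer_key": {
        "inner_key": "inner_value",
        "a": {"double_inner_key": "double_inner_value", "b": "b value"},
        "c": "c value",
    },
    "d": "d value",
}

print(resolve_path(event, "outer_key.a.double_inner_key"))  # double_inner_value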

Filter query syntax

Each processor has a corresponding filter query in its fields. Processors only process logs that match their filter query. For all processors except the filter processor, logs that do not match the query are sent to the next step of the pipeline. For the filter processor, logs that do not match the query are dropped.

For any attribute, tag, or key:value pair that is not a reserved attribute, your query must start with @. Conversely, to filter on a reserved attribute, you do not need to prepend @ to your filter query.

For example, to filter out and drop status:info logs, your filter can be set as NOT (status:info). To filter out and drop system-status:info, your filter must be set as NOT (@system-status:info).

Filter query examples:

  • NOT (status:debug): This filters for only logs that do not have the status DEBUG.
  • status:ok service:flask-web-app: This filters for all logs with the status OK from your flask-web-app service.
    • This query can also be written as: status:ok AND service:flask-web-app.
  • host:COMP-A9JNGYK OR host:COMP-J58KAS: This filter query only matches logs from the labeled hosts.
  • @user.status:inactive: This filters for logs with the status inactive nested under the user attribute.
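
For example, a log shaped like the following hypothetical event would match the @user.status:inactive query above:

{
    "host": "my-host",
    "service": "flask-web-app",
    "user": {
        "name": "example-user",
        "status": "inactive"
    }
}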

Queries run in the Observability Pipelines Worker are case sensitive. Learn more about writing filter queries in Datadog’s Log Search Syntax.