---
isPrivate: true
title: (LEGACY) Working with Data
description: Datadog, the leading service for cloud-scale monitoring.
---

# (LEGACY) Working with Data

{% callout %}
# Important note for users on the following Datadog sites: app.ddog-gov.com

{% alert level="danger" %}
This product is not supported for your selected [Datadog site](https://docs.datadoghq.com/getting_started/site).
{% /alert %}

{% /callout %}

## Overview{% #overview %}

Observability Pipelines enables you to shape and transform observability data. Similar to Logging without Limits™ pipelines, you can configure pipelines for Observability Pipelines that are composed of a series of transform components. These transforms allow you to parse, structure, and enrich data with built-in type safety.

## Remap data{% #remap-data %}

The [`remap` transform](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#remap) can modify events or specify conditions for routing and filtering events. Use Datadog Processing Language (DPL), or Vector Remap Language (VRL), in the `remap` transform to manipulate arrays and strings, encode and decode values, encrypt and decrypt values, and more. See [Datadog Processing Language](https://docs.datadoghq.com/observability_pipelines/legacy/reference/processing_language/) for more information and the [DPL Functions reference](https://docs.datadoghq.com/observability_pipelines/legacy/reference/processing_language/functions/) for a full list of DPL built-in functions.

### Basic `remap` configuration example{% #basic-remap-configuration-example %}

To get started, see the following YAML configuration example for a basic `remap` transform that contains a DPL/VRL program in the `source` field:

```yaml
transforms:
  modify:
    type: remap
    inputs:
      - previous_component_id
    source: |
      del(.user_info)
      .timestamp = now()
```

In this example, the `type` field sets the component to a `remap` transform. The `inputs` field specifies where the transform receives events from, in this case the previously defined `previous_component_id` component. The first line in the `source` field deletes the `.user_info` field. At scale, dropping fields is particularly useful for reducing the payload size of your events and cutting spend on your downstream services.

The second line sets the `.timestamp` field to the current time, adding the field if it does not already exist. As a result, every event that passes through this transform is modified.
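
Beyond deleting and setting fields, the same `source` program can call other built-in functions. The following is a minimal sketch, not a definitive configuration: the component ID `normalize_fields` and the `.service`, `.tags`, and `.session_token` fields are hypothetical, and it assumes those fields hold strings (the `string!` calls raise an error for the event otherwise). Check the functions reference for the exact signatures available in your Worker version.

```yaml
transforms:
  normalize_fields:            # hypothetical component ID
    type: remap
    inputs:
      - previous_component_id
    source: |
      # Normalize a string field to lowercase.
      .service = downcase(string!(.service))
      # Split a comma-separated string into an array.
      .tags = split(string!(.tags), ",")
      # Base64-encode a value before forwarding it.
      .session_token = encode_base64(string!(.session_token))
```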

## Parse data{% #parse-data %}

Parsing is a more advanced use case for DPL/VRL.

### Parsing example{% #parsing-example %}

#### Log event example{% #log-event-example %}

The following snippet is an HTTP log event in JSON format, received as a raw string:

```
"{\"status\":200,\"timestamp\":\"2021-03-01T19:19:24.646170Z\",\"message\":\"SUCCESS\",\"username\":\"ub40fan4life\"}"
```

#### Configuration example{% #configuration-example %}

The following YAML configuration example uses DPL/VRL to modify the log event by:

- Parsing the raw string into JSON.
- Converting the `timestamp` value into a UNIX timestamp.
- Removing the `username` field.
- Converting the `message` value to lowercase.

```yaml
transforms:
  parse_syslog_id:
    type: remap
    inputs:
      - previous_component_id
    source: |
      . = parse_json!(string!(.message))
      .timestamp = to_unix_timestamp(to_timestamp!(.timestamp))
      del(.username)
      .message = downcase(string!(.message))
```

#### Configuration output{% #configuration-output %}

The configuration returns the following:

```
{
  "message": "success",
  "status": 200,
  "timestamp": 1614626364
}
```
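
The `!` function variants used in the configuration above raise an error when the input cannot be parsed. As a hedged sketch of one alternative, DPL/VRL also lets you capture the error in a variable and handle it inside the program. The component ID `parse_json_safely` and the `.parse_error` field are hypothetical names chosen for illustration.

```yaml
transforms:
  parse_json_safely:           # hypothetical component ID
    type: remap
    inputs:
      - previous_component_id
    source: |
      # Capture the error instead of raising it with `!`.
      parsed, err = parse_json(.message)
      if err == null {
        # Parsing succeeded: replace the event with the parsed value.
        . = parsed
      } else {
        # Parsing failed: keep the original event and record the reason.
        .parse_error = err
      }
```

With this approach, malformed events continue through the pipeline and can be identified downstream by the presence of the `.parse_error` field.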

## Sample, reduce, filter, and aggregate data{% #sample-reduce-filter-and-aggregate-data %}

Sampling, reducing, filtering, and aggregating are common transforms to reduce the volume of observability data delivered to downstream services. Observability Pipelines offers a variety of ways to control your data volume:

- [Sample events](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#sample) based on supplied criteria and at a configurable rate.
- [Reduce and collapse](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#reduce) multiple events into a single event.
- Remove unnecessary fields.
- [Deduplicate](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#dedupe) events.
- [Filter events](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#filter) based on a set of conditions.
- [Aggregate multiple metric events](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#aggregate) into a single metric event based on a defined interval window.

See [Control Log Volume and Size](https://docs.datadoghq.com/observability_pipelines/legacy/guide/control_log_volume_and_size/) for examples of how to use these transforms.
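
As a rough sketch of how some of these transforms can be chained, the configuration below filters, deduplicates, and then samples a stream. The component IDs and the `level`, `host`, and `message` field names are hypothetical, and the option names (`condition`, `fields.match`, and `rate`) are assumed to follow the open source Vector transforms of the same names. Confirm them against the transforms reference before relying on them.

```yaml
transforms:
  drop_debug:                  # hypothetical component ID
    type: filter
    inputs:
      - my-source-or-transform-id
    # Only events for which the condition is true are forwarded.
    condition: '.level != "debug"'
  dedupe_events:               # hypothetical component ID
    type: dedupe
    inputs:
      - drop_debug
    fields:
      # Events with identical values in these fields count as duplicates.
      match:
        - host
        - message
  sample_events:               # hypothetical component ID
    type: sample
    inputs:
      - dedupe_events
    # Forward roughly 1 out of every 10 events.
    rate: 10
```

Placing the cheaper `filter` step first means the later transforms only process events that are still candidates for delivery.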

## Route data{% #route-data %}

Another commonly used transform is `route`, which allows you to split a stream of events into multiple substreams based on supplied conditions. This is useful when you need to send observability data to different destinations or operate differently on streams of data based on their use case.

### Routing to different destinations example{% #routing-to-different-destinations-example %}

#### Log example{% #log-example %}

The following snippet is an example log that you want to route to different destinations based on the value of the `level` field:

```
{
  "logs": {
    "kind": "absolute",
    "level": "info",
    "name": "memory_available_bytes",
    "namespace": "host",
    "tags": {}
  }
}
```

#### Configuration examples{% #configuration-examples %}

The following YAML configuration example routes data based on the `level` value:

```yaml
transforms:
  splitting_logs_id:
    type: route
    inputs:
      - my-source-or-transform-id
    route:
      debug: .level == "debug"
      info: .level == "info"
      warn: .level == "warn"
      error: .level == "error"
```

Each entry under the `route` field maps a route identifier to a logical condition that events must satisfy to be sent to that route. Other components can then reference each route as an input using the name `<transform_name>.<route_id>`.

For example, to route logs with `level` field values of `warn` and `error` to Datadog, use the following sink configuration:

```yaml
sinks:
  my_sink_id:
    type: datadog_logs
    inputs:
      - splitting_logs_id.warn
      - splitting_logs_id.error
    default_api_key: '${DATADOG_API_KEY_ENV_VAR}'
    compression: gzip
```

See the [`route` transform reference](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#route) for more information.
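
Events that match none of the conditions are not sent to any of the named routes. As a sketch, assuming the legacy Worker exposes the same `_unmatched` output that the open source Vector `route` transform provides, you could forward those events to a separate sink. The sink ID `unmatched_logs` and the choice of a `console` sink are illustrative assumptions.

```yaml
sinks:
  unmatched_logs:              # hypothetical sink ID
    type: console
    inputs:
      # Assumed output name for events that matched no route.
      - splitting_logs_id._unmatched
    encoding:
      codec: json
```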

## Throttle data{% #throttle-data %}

Downstream services can get overwhelmed when volume spikes, which can lead to dropped data. Use the `throttle` transform to safeguard against this scenario and to enforce usage quotas on users. The `throttle` transform rate limits logs passing through a topology.

### Throttle configuration example{% #throttle-configuration-example %}

The following YAML configuration example is for a `throttle` transform:

```yaml
transforms:
  my_transform_id:
    type: throttle
    inputs:
      - my-source-or-transform-id
    exclude: null
    threshold: 100
    window_secs: 1
```

The `threshold` field defines the number of events allowed per bucket, and the `window_secs` field defines the time window, in seconds, over which that threshold applies. In the example configuration, when the component receives more than 100 events within 1 second, the additional events are dropped.
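
To illustrate what a bucket is, the following sketch applies the limit per user rather than to the stream as a whole. It assumes the `key_field` and `exclude` options behave as they do in the open source Vector `throttle` transform, and the `user_id` and `level` fields are hypothetical.

```yaml
transforms:
  per_user_throttle:           # hypothetical component ID
    type: throttle
    inputs:
      - my-source-or-transform-id
    # Each distinct user_id value gets its own bucket.
    key_field: '{{ user_id }}'
    # Events matching this condition are never rate limited.
    exclude: '.level == "error"'
    threshold: 100
    window_secs: 60
```

Under these assumptions, each user can send up to 100 events per minute, while error-level events always pass through.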

## Further Reading{% #further-reading %}

- [Set up Observability Pipelines](https://docs.datadoghq.com/observability_pipelines/legacy/setup/)
- [Parsing metadata emitted by AWS EC2 instance](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#awsec2metadata)
- [Modifying events with Lua](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#lua)
- [Convert logs to metric events](https://docs.datadoghq.com/observability_pipelines/legacy/reference/transforms/#logtometric)
- [Learn more about Observability Pipelines configurations](https://docs.datadoghq.com/observability_pipelines/legacy/configurations/)
