Retain Data with Retention Filters

Overview

Retention filters are a set of queries, similar to those used in the RUM Session Explorer, that are executed against the RUM events (sessions, views, actions, resources, and so on) as they are ingested. These filters determine whether a session is stored for the standard 30-day RUM retention period or discarded.

The retention rate specifies the percentage of matching sessions you want to retain, which allows for greater cost control. Even though filters are matched against individual events, all the events from the underlying session are kept when the sampling decision is to retain it, ensuring end-to-end visibility into user sessions.

How it works

A session is stored as soon as one of its events matches a retention filter's query and that filter samples the session in, based on its configured retention rate.

Diagram showing the logical flow of retention filters and how they impact the number of sessions ultimately retained.

The logical flow of retention filters is the following:

  • All RUM events are evaluated against each filter in sequence, starting with the first one received.
  • When an event A matches a filter, a decision is made based on the retention rate to either sample the entire session in, or wait for future events to be evaluated. In both cases, event A is not evaluated further against subsequent retention filters. This is why the order of retention filters matters.
  • Retained sessions are saved and accessible in the Session Explorer and other RUM pages. New events coming from this session do not go through the list of retention filters, but are automatically kept to ensure complete visibility.
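
The flow above can be sketched as a small simulation. This is an illustrative model, not Datadog's implementation: the `RetentionFilter` class, the `process_event` function, the event dictionaries, and the injectable `draw` function are all hypothetical, and the source does not specify how the sampling draw is actually made.

```python
import random
from dataclasses import dataclass
from typing import Callable

Event = dict  # hypothetical event shape: {"session_id": ..., "type": ...}


@dataclass
class RetentionFilter:
    name: str
    matches: Callable[[Event], bool]  # stands in for the filter's query
    retention_rate: float             # fraction between 0 and 1


def process_event(event, filters, retained, draw=random.random):
    """Return True if the event's session is retained after this event."""
    session_id = event["session_id"]
    # Events from an already retained session bypass the filters.
    if session_id in retained:
        return True
    for f in filters:  # filters are evaluated in list order
        if f.matches(event):
            # The first matching filter makes the sampling decision. Whatever
            # the outcome, this event is not evaluated against later filters.
            if draw() < f.retention_rate:
                retained.add(session_id)
            break
    return session_id in retained


# Usage with a fixed draw of 0.5 so the outcome is reproducible.
filters = [
    RetentionFilter("views", lambda e: e["type"] == "view", 0.10),
    RetentionFilter("catch-all", lambda e: True, 1.0),
]
retained_sessions = set()
fixed_draw = lambda: 0.5

# Matches "views" and is sampled out; "catch-all" is never consulted.
process_event({"session_id": "s1", "type": "view"}, filters, retained_sessions, fixed_draw)
# Does not match "views", matches "catch-all", and the session is retained.
process_event({"session_id": "s1", "type": "error"}, filters, retained_sessions, fixed_draw)
```

Swapping the two filters in the list would retain the session on its first event, which illustrates why filter order matters.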

Notes:

  • If an event does not match any filters, or if it matches a filter but the decision is made not to retain the session based on the configured retention rate, future events from the same session will continue to be evaluated. As a result, the session may eventually be retained.
  • Be cautious when defining filters on event attributes that update over time. For example, a filter retaining sessions with fewer than two errors might mistakenly retain sessions, as error counts update in real-time, and all sessions start at zero. Either use “greater than or equal to” (≥) conditions for fields that update, such as @session.error.count >= 2, or ensure the Session and View objects that are mutable are complete before evaluating them against the retention filters, by adding @session.is_active: false or @view.is_active: false.
  • The SDKs batch and compress events before sending them to Datadog, and failed uploads go back to the end of the queue on the device. As a result, event B may be evaluated before event A, but all events are eventually evaluated against the list of retention filters to prevent gaps.
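
The caution about attributes that update over time can be illustrated with a short sketch. The snapshots below are hypothetical data for a single session that ends with three errors, and the predicates are plain Python stand-ins for the filter queries, not real query evaluation.

```python
# Hypothetical snapshots of one session as its events are ingested.
snapshots = [
    {"error_count": 0, "is_active": True},
    {"error_count": 1, "is_active": True},
    {"error_count": 3, "is_active": True},
    {"error_count": 3, "is_active": False},  # session is complete
]


def first_match(predicate):
    """Return the index of the first snapshot the predicate matches, or None."""
    for index, snapshot in enumerate(snapshots):
        if predicate(snapshot):
            return index
    return None


def fewer_than_two(s):
    # Stand-in for a "fewer than two errors" condition.
    return s["error_count"] < 2


def at_least_two(s):
    # Stand-in for @session.error.count >= 2.
    return s["error_count"] >= 2


def fewer_than_two_when_complete(s):
    # Stand-in for "fewer than two errors" combined with @session.is_active: false.
    return s["error_count"] < 2 and not s["is_active"]


early = first_match(fewer_than_two)               # 0: matches before any error occurs
late = first_match(at_least_two)                  # 2: matches only once the count is reached
never = first_match(fewer_than_two_when_complete)  # None: the finished session has 3 errors
```

The first predicate matches on the very first snapshot even though the session ends with three errors, which is the mistaken retention described in the note.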

How retention filters work with replays

You can manage session sampling with replays using retention filters. Whenever a session with replays is retained, both the session events and the video recording are kept and billed. This means that if you collect 100% of sessions and 100% of replays from SDKs, whenever a retention filter keeps a session, Datadog keeps and charges for both the session and the replay.

Replays collected through the force collection mechanism are kept by the default retention filter, positioned first in the list (see below).

When force collection is enabled, it is positioned first in the list of retention filters.

Note: Though Datadog’s mobile SDKs also provide APIs to conditionally start and stop the recording (instead of relying on a flat sample rate), only the replays that are force-recorded by the Browser SDK are retained by default.

Creating a retention filter

To create a retention filter:

  1. Navigate to Digital Experience > Manage Applications.
  2. Create a RUM application or click an existing application.
  3. Under Product Settings, go to the Retention Filters page.
  4. Click the + Add Retention Filter button.
  5. Give the retention filter a descriptive name.
  6. Select an event type from the dropdown and enter a query. Any query that can be written in the RUM Explorer works with retention filters.
  7. Optionally, set a retention rate against sessions that match the retention query. You can click Generate Estimate to help guide you in setting this rate.

The new filter is added to the bottom of the Retention Filters list. It takes a few seconds for Datadog to propagate a new filter and start making sampling decisions.

Modifying filters

Hover over a retention filter to modify it.

Edit a filter

To modify an existing filter:

  1. Hover over the filter and click the Edit icon.
  2. Make your changes, then click Save Changes.

Duplicate a filter

To duplicate a filter:

  1. Hover over the filter and click the Duplicate icon.
  2. Make any modifications you want to the filter, then click Save Changes.

Delete a filter

To delete a retention filter:

  1. Hover over the filter and click the Delete icon.
  2. Click Confirm.

Disable a filter

Disabled filters simply ignore events and do not make any sampling decisions. Events flowing in the list will skip disabled filters.

Use the toggle to the right of the filter to disable or enable it.

Reorder filters

Drag and drop a filter to move it to a new position in the list.

Excluding sessions using retention filters

RUM without Limits uses retention filters to specify which sessions to keep, rather than which to exclude. You cannot set a retention percentage to 0% (the default is 1%). Additionally, setting low retention percentages is not an effective exclusion strategy because sessions may still be retained by other filters in your configuration.

To ensure sessions from a particular environment, application version, device type, or other criteria are not retained, explicitly add exclusions to the query of every one of your filters. For example:

  • Adding -version:(1* OR 2*) to all retention filters ensures you never keep events from older versions 1 and 2 of your application.
  • Adding -@device.type:Bot to all retention filters excludes search engine crawlers and other self-declared bots.
  • Adding -@geo.country:"South Korea" to all retention filters excludes all sessions from South Korea.

For example, to exclude sessions from South Korea while retaining all other sessions, create a filter with the query -@geo.country:"South Korea" and set the retention rate to 100%.
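
Because the exclusion has to appear in every filter, it can help to derive the final queries from one shared exclusion clause. The following is a minimal sketch: `with_exclusion` is a hypothetical helper, the base queries are made-up examples, and it assumes that parentheses and space-separated terms behave as they do in RUM Explorer search (implicit AND).

```python
def with_exclusion(query: str, exclusion: str) -> str:
    """Append an exclusion clause to a filter query (hypothetical helper)."""
    query = query.strip()
    if not query:
        # An empty query matches everything, so the exclusion stands alone.
        return exclusion
    # Parentheses keep any OR in the base query from absorbing the exclusion.
    return f"({query}) {exclusion}"


# Made-up base queries, keyed by filter name.
base_queries = {
    "errors": "@type:error",
    "checkout": "@view.url_path:/checkout OR @view.url_path:/cart",
    "everything": "",
}
exclusion = "-@device.type:Bot"

final_queries = {
    name: with_exclusion(query, exclusion)
    for name, query in base_queries.items()
}
```

Each resulting query can then be pasted into the corresponding retention filter, so that no filter is left without the exclusion.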

Note: There is no way to prevent a specific event from being retained. You can use negative queries (for instance, adding -@error.message:"Script error." to a retention filter targeting RUM Errors) to minimize the volume of undesired events, but other retention filters may still make a positive retention decision about a session that contains the event you tried to filter out.

Cross-product retention filters

Cross-Product Retention Filters are in Preview. Use this form to submit your request today.

When configuring a RUM retention filter, you can enable cross-product retention filters for APM traces.

The APM traces filter indexes APM traces for the specified percentage of sessions retained by the parent RUM retention filter that have available traces.

The APM traces filter is only compatible with the following versions of the SDKs:

  • Browser 6.5.0+
  • Android 3.0.0+
  • iOS 3.3.0+
  • React Native 3.0.0+

Configuring cross-product retention filters may increase APM-indexed volumes.

Note: The availability of APM traces depends on the initialization parameter traceSampleRate of the SDK.

The cross-product retention filters allow you to optimize the correlation between different products to retain richer telemetry.

To find sessions with indexed APM traces in the RUM Explorer, query @session.has_indexed_apm_traces:true.

Example

Consider a configuration with a single RUM retention filter set up as follows:

A RUM retention filter targeting errors at 60% retention, with a cross-product filter set to 25% for APM Traces.

If you have initialized the SDK with traceSampleRate: 40, then the outcome is the following:

  • 60% of sessions with at least one error are retained.
  • 25% × 40% = 10% of these retained sessions have the APM traces retained.

Cross-product retention filters only apply to sessions retained by the corresponding RUM retention filter. This means filter order matters for both RUM retention and cross-product filters.
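
The arithmetic in the example can be checked with a few lines of code. This sketch uses a hypothetical `expected_counts` function and an assumed volume of 1,000 sessions with at least one error. It treats the three percentages as independent rates, as the example does.

```python
from fractions import Fraction


def expected_counts(sessions_with_errors, rum_rate_pct, cross_product_pct, trace_sample_rate_pct):
    """Expected retained sessions and retained sessions with indexed traces."""
    # Sessions kept by the RUM retention filter.
    retained = sessions_with_errors * Fraction(rum_rate_pct, 100)
    # Of those, the share that has traces available (traceSampleRate)
    # and is selected by the cross-product filter.
    with_traces = (
        retained
        * Fraction(cross_product_pct, 100)
        * Fraction(trace_sample_rate_pct, 100)
    )
    return retained, with_traces


retained, with_traces = expected_counts(1000, 60, 25, 40)
print(retained, with_traces)  # 600 60
```

Out of 1,000 sessions with errors, about 600 are retained and about 60 of those (10%) have their APM traces retained.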

For more information, see How it works.

1% flat sampling

For compatible SDKs (see above), Datadog provides a default RUM retention filter and cross-product retention filter on APM traces that retains 1% of the sessions with available traces and their traces, at no additional cost.

This default filter helps ensure that you always have a baseline of correlated APM data available for your RUM sessions, even before you configure custom cross-product retention filters.

To find sessions retained by this filter in the RUM Explorer, query @session.retention_reason:apm_rum_flat_sampling.

Best practices

See Retention Filter Best Practices.

API

Retention filters can be managed through APIs or Datadog’s dedicated Terraform modules.

Next steps

Analyze performance with metrics.

Further reading