With Tracing without Limits™, both the ingestion of traces into Datadog and the retention of those traces for 15 days are fully customizable.
To track or monitor your usage of Tracing without Limits™, see the Usage Metrics documentation.
After spans have been ingested by Datadog, some are kept for 15 days according to the retention filters set on your account. By default, the only retention filter enabled is the Intelligent Retention Filter, which retains error traces and traces from different latency distributions.
You can also create any number of additional tag-based retention filters for your services.
Note: Admin rights are required to create, modify, or disable retention filters.
In the Datadog app, on the Retention Filters tab, you can see the following information:
In addition to the ‘Spans Indexed’ column per retention filter, there is also the metric datadog.estimated_usage.apm.indexed_spans that you can use to track spans indexed by retention filters.
Intelligent Retention is always active for your services, and it keeps a proportion of traces to help you monitor the health of your applications. All top-level spans are indexed for the traces kept by the Intelligent Retention Filter.
Intelligent Retention retains:
If there are specific tags, facets, or groups of traces that you want to investigate in detail, meaning you want to retain more than what Intelligent Retention retains, then create your own retention filter. For example, you might want to keep more than a representative selection of errors from your production environment. To ensure all production errors are retained and available for search and analytics for 15 days, create a 100 percent retention filter scoped to status:error. As discussed below, this may have an impact on your bill.
To customize which spans are indexed and retained for 15 days, you can create, modify, and disable additional filters based on tags, and set the percentage of spans matching each filter to be retained. Any span that is retained has its corresponding trace saved as well, so the complete trace is available when the span is viewed. To be searchable by tag in Search and Analytics, however, the span that directly carries the searched tag must itself have been indexed by a retention filter.
Note: Selecting “Top-Level Spans for Services Only” means the retention filter retains and indexes only the selected proportion of the top-level spans of each service. Use this if you want to index only top-level spans with matching tags. If “All Spans” is selected, the retention filter retains and indexes the selected proportion of all spans of the distributed trace, irrespective of their hierarchy. This may have an impact on your bill, and the visual indicator shown in the app while you set a retention filter tells you how many matching spans have been detected over the time period.
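The matching rules described above can be sketched in a few lines of Python. This is only an illustrative model, not Datadog's implementation: the Span and RetentionFilter classes, the index_spans helper, and the hash-based sampling draw are all hypothetical names and choices made for this example.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Span:
    """Hypothetical, simplified span used only for this illustration."""
    trace_id: int
    span_id: int
    tags: dict
    top_level: bool = False


@dataclass
class RetentionFilter:
    """Hypothetical model of a tag-based retention filter."""
    required_tags: dict          # e.g. {"env": "prod", "status": "error"}
    rate: float                  # fraction of matching spans to retain (0.0 to 1.0)
    top_level_only: bool = True  # mirrors "Top-Level Spans for Services Only"

    def matches(self, span):
        # With top_level_only set, child spans never match, whatever their tags.
        if self.top_level_only and not span.top_level:
            return False
        return all(span.tags.get(k) == v for k, v in self.required_tags.items())

    def keeps(self, span):
        if not self.matches(span):
            return False
        # Assumption: a deterministic draw in [0, 1) derived from the span ID,
        # so the same span always gets the same decision.
        digest = hashlib.sha256(str(span.span_id).encode()).digest()
        draw = int.from_bytes(digest[:8], "big") / 2**64
        return draw < self.rate


def index_spans(spans, filters):
    """Return the spans that at least one filter would retain and index."""
    return [s for s in spans if any(f.keeps(s) for f in filters)]
```

In this model, a filter with required_tags={"status": "error"} and rate=1.0 keeps every matching top-level span. Setting top_level_only=False extends the match to child spans as well, which is why “All Spans” can index considerably more data.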
For example, you can create filters to keep all traces for:
Ingestion Controls affect what traces are sent by your applications to Datadog. Stats and metrics are always calculated based on all traces, and are not impacted by ingestion controls.
Many instrumented services send 100% of their traces to Datadog by default. The Datadog Agent does not drop or sample any spans by default at volumes of up to 50 traces per second. High-volume services, or services that experience intermittent traffic, are more likely not to send 100% of spans by default. This 50-traces-per-second default ingestion is based on Intelligent Retention and keeps diverse traces by default.
For the best experience, set services to send 100% of their traces so that all traces can be used for live search and analytics.
Note: If you are seeing numbers below 100% for Ingestion Rate, ensure you are using Agent 6.19+ or 7.19+ as these versions increased the default rate.
In the Datadog app, on the ‘Ingestion Controls’ tab, you can see the following information:
Default unless changed by using the in-app instructions to configure the tracer. See Change the Default Ingestion Rate for more information. If all hosts with this service deployed are configured to send a specific volume of traces, this indicator displays Fully Configured. If only a portion of hosts with this service deployed are configured, the label will instead show
In addition to the Data Ingestion column for each retention filter, there are also two metrics, including datadog.estimated_usage.apm.ingested_bytes. These metrics are tagged by env, and top lists are available within the Trace Analytics Dashboard to show where the highest ingestion volumes are occurring. See the Usage Metrics documentation for more information.
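As a rough sketch of how the usage metric above could be read programmatically, the following Python builds (but does not send) a request grouped by the env tag. It assumes the Datadog v1 timeseries query endpoint (/api/v1/query) with DD-API-KEY and DD-APPLICATION-KEY headers and the default api.datadoghq.com site. The helper names build_usage_query and build_request are hypothetical, and the sum aggregation is only one reasonable choice.

```python
import time
import urllib.parse
import urllib.request


def build_usage_query(metric, group_by=None):
    """Build a metric query string such as 'sum:<metric>{*} by {env}'."""
    scope = "{*}"  # all sources; narrow this to a tag such as {env:prod} if needed
    query = f"sum:{metric}{scope}"
    if group_by:
        query += " by {" + ",".join(group_by) + "}"
    return query


def build_request(query, api_key, app_key, window_seconds=3600,
                  site="https://api.datadoghq.com", now=None):
    """Prepare a GET request for the last `window_seconds` of data.

    Assumption: the v1 timeseries query endpoint and header names; adjust
    `site` if your account lives in a different Datadog region.
    """
    now = int(time.time()) if now is None else int(now)
    params = urllib.parse.urlencode(
        {"from": now - window_seconds, "to": now, "query": query}
    )
    return urllib.request.Request(
        f"{site}/api/v1/query?{params}",
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
    )


# Example: ingested bytes per environment over the last hour.
ingested_by_env = build_usage_query(
    "datadog.estimated_usage.apm.ingested_bytes", group_by=["env"]
)
```

Passing the prepared request to urllib.request.urlopen would perform the call. The same helpers work for datadog.estimated_usage.apm.indexed_spans if you want to track what the retention filters index.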
To specify that a specific percentage of a service’s traffic should be sent, add a generated code snippet to your tracer configuration for that service.
Datadog recommends configuring all services to send 100% of their traces by default. This makes every trace available for live search and analytics and gives you the most control when defining retention filters.
To configure for 100% ingestion on every service instrumented with a Datadog tracing library, set the following environment variable in the tracer configuration:
Note: This may impact your bill if your total ingestion exceeds the included GBs. For more information, see the APM Billing page.
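The variable in question is the one named later on this page, DD_TRACE_SAMPLE_RATE=1.0. As a minimal sketch, assuming your service is launched from a Python wrapper, the environment for the instrumented process could be prepared as follows. The helper name env_with_full_ingestion is hypothetical, and in most deployments you would set the variable in your shell, container, or orchestrator configuration instead.

```python
import os
import subprocess
import sys


def env_with_full_ingestion(base_env=None, sample_rate=1.0):
    """Return a copy of the environment with DD_TRACE_SAMPLE_RATE set.

    The variable name and the value 1.0 come from this page; everything
    else here is illustrative.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["DD_TRACE_SAMPLE_RATE"] = str(sample_rate)
    return env


if __name__ == "__main__":
    # Stand-in for starting the instrumented service: the child process
    # simply echoes the variable to show that it was inherited.
    child = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ['DD_TRACE_SAMPLE_RATE'])"],
        env=env_with_full_ingestion(),
        capture_output=True,
        text=True,
        check=True,
    )
    print(child.stdout.strip())
```

The variable has to be present in the environment of the process that loads the tracing library, so set it before the service starts rather than after.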
The Ingestion Breakdown column breaks down the destination of all traces originating from the service. It can help you understand lower than expected ingestion rates and missing traces.
The breakdown is composed of the following parts:
Complete traces ingested (green): The percentage of traces that have been ingested by Datadog.
Complete traces not retained (gray): The percentage of traces that have intentionally not been forwarded to Datadog by the agent or the tracer. This can happen for one of two reasons depending on your configuration:
Complete traces dropped by the tracer rate limiter (orange): When you choose to configure the ingestion rate of a service, you explicitly define the ingestion rate that your service should have. However, as a protection mechanism, a rate limiter set to 100 traces per second by default is automatically enabled. To configure this rate limiter, open a support ticket so we can guide you through the process.
Traces dropped due to the Agent CPU limit (red): The Agent has a configuration option that lets users limit its CPU usage. After this limit is reached, the Agent stops accepting traces from the tracers. Change the Agent configuration to adjust how much CPU is allocated to the Agent.
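To see how the configured ingestion rate and the tracer rate limiter interact, consider this back-of-the-envelope model. It is a simplified steady-state estimate, not the tracer's actual algorithm: it assumes uniform traffic, ignores the Agent CPU limit, and the function name expected_breakdown is hypothetical.

```python
def expected_breakdown(traces_per_second, sample_rate, rate_limit=100.0):
    """Estimate the ingestion breakdown, in percent of all traces.

    traces_per_second: traces the service produces per second
    sample_rate:       configured ingestion rate, between 0.0 and 1.0
    rate_limit:        tracer rate limiter, 100 traces per second by default
    """
    if traces_per_second <= 0:
        raise ValueError("traces_per_second must be positive")
    if not 0.0 <= sample_rate <= 1.0:
        raise ValueError("sample_rate must be between 0.0 and 1.0")

    sampled = traces_per_second * sample_rate   # selected by the configured rate
    ingested = min(sampled, rate_limit)         # what gets past the rate limiter
    dropped_by_limiter = sampled - ingested     # the "orange" portion
    not_retained = traces_per_second - sampled  # the "gray" portion

    def pct(value):
        return 100.0 * value / traces_per_second

    return {
        "ingested": pct(ingested),
        "not_retained": pct(not_retained),
        "dropped_by_rate_limiter": pct(dropped_by_limiter),
    }


# A service producing 400 traces per second with a 50% ingestion rate:
# 200 traces per second are selected, but only 100 pass the rate limiter.
breakdown = expected_breakdown(400, 0.5)
```

Under these assumptions the breakdown is 25% ingested, 50% not retained, and 25% dropped by the rate limiter, which shows why a service can report a lower ingestion rate than the one you configured.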
You won’t get 100% trace ingestion if you have not set the environment variable configuration DD_TRACE_SAMPLE_RATE=1.0 for Tracing without Limits, and:
In this case, some traces will be dropped by the Datadog Agent after stats are computed, so that metrics calculated will be based on 100% of your traces.
If you are seeing ingestion rates below 100% within Datadog and would like to send all your traces, enable Tracing without Limits by setting the environment variable as described above. If you have questions, contact our support team.
Before October 20, 2020, Datadog offered App Analytics to index spans for performing analytics. While this is no longer the recommended setup configuration and is not needed to use Trace Search and Analytics, the legacy instructions are available within the App Analytics setup page.
All existing App Analytics filters have been automatically transitioned to Retention Filters. You can continue to use the unchanged filters or modify them as needed. Transitioned filters are marked with an i representing Legacy App Analytics Filters.