Ingestion Mechanisms


Multiple mechanisms determine whether spans generated by your applications are sent to Datadog (ingested). The logic behind these mechanisms lies in the tracing libraries and in the Datadog Agent. Depending on the configuration, all or some of the traffic generated by instrumented services is ingested.

Each ingested span carries a unique ingestion reason that refers to one of the mechanisms described on this page. The usage metrics datadog.estimated_usage.apm.ingested_bytes and datadog.estimated_usage.apm.ingested_spans are tagged by ingestion_reason.

Use the Ingestion Reasons dashboard to investigate each of these ingestion reasons in context. It gives an overview of the volume attributed to each mechanism, so you can quickly see which configuration options to focus on.

Head-based sampling

The default sampling mechanism is called head-based sampling. The decision of whether to keep or drop a trace is made at the very beginning of the trace, at the start of the root span. This decision is then propagated to other services as part of their request context, for example as an HTTP request header.

Because the decision is made at the beginning of the trace and then conveyed to all parts of the trace, the trace is guaranteed to be kept or dropped as a whole.
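The following is a minimal Python sketch of the idea described above, not the tracing libraries' actual implementation. The function names are hypothetical, and the header name is an assumption chosen for illustration. It shows that the keep-or-drop decision is taken once at the root and then reused by every downstream service.

```python
import random

# Header name used for illustration; treat it as an assumption of this sketch.
SAMPLING_HEADER = "x-datadog-sampling-priority"


def start_root_span(sample_rate, rng=random.random):
    """Decide once, at the root span, whether the whole trace is kept."""
    keep = rng() < sample_rate
    # The decision travels with the outgoing request context.
    return {SAMPLING_HEADER: "1" if keep else "0"}


def continue_trace(incoming_headers):
    """Downstream services reuse the propagated decision instead of re-sampling."""
    return incoming_headers[SAMPLING_HEADER] == "1"


headers = start_root_span(sample_rate=0.2)
# Every downstream service reaches the same verdict for this trace.
decisions = [continue_trace(headers) for _ in range(3)]
```

Because no service re-samples, the trace cannot end up partially kept.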

You can set sampling rates for head-based sampling in two places:

  • At the Agent level (default)
  • At the tracing library level: any tracing library mechanism overrides the Agent setup.

In the Agent

ingestion_reason: auto

The Datadog Agent continuously sends sampling rates to tracing libraries to apply at the root of traces. The Agent adjusts the rates to reach an overall target of ten traces per second, distributed across services according to their traffic.

For instance, if service A has more traffic than service B, the Agent might vary the sampling rate for A such that A keeps no more than seven traces per second, and similarly adjust the sampling rate for B such that B keeps no more than three traces per second, for a total of ten traces per second.
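The arithmetic in this example can be sketched in Python. This is an illustrative model only: the page does not describe the Agent's actual algorithm, so the function name and the single uniform rate used here are assumptions.

```python
def per_service_keep_budget(traffic_tps, target_tps=10.0):
    """Split an overall traces-per-second target across services by traffic share.

    Returns {service: (sampling_rate, kept_traces_per_second)}.
    """
    total = sum(traffic_tps.values())
    if total <= target_tps:
        # Traffic already fits within the target, so nothing needs to be dropped.
        return {svc: (1.0, tps) for svc, tps in traffic_tps.items()}
    # One shared rate keeps each service's share proportional to its traffic.
    rate = target_tps / total
    return {svc: (rate, tps * rate) for svc, tps in traffic_tps.items()}


# Hypothetical traffic: A receives 70 traces per second, B receives 30.
budget = per_service_keep_budget({"A": 70.0, "B": 30.0})
```

With these hypothetical inputs, A is allotted roughly seven kept traces per second and B roughly three, matching the split described above.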

Set the Agent’s target traces-per-second in its main configuration file (datadog.yaml) or as an environment variable:

@param max_traces_per_second - integer - optional - default: 10
@env DD_APM_MAX_TPS - integer - optional - default: 10

Note: The traces-per-second sampling rate set in the Agent only applies to Datadog tracing libraries. It has no effect on other tracing libraries such as OpenTelemetry SDKs.

All the spans from a trace sampled using the Datadog Agent’s automatically computed sampling rates are tagged with the ingestion reason auto. The ingestion_reason tag is also set on usage metrics. Services using the Datadog Agent default mechanism are labeled as Automatic in the Ingestion Control Page Configuration column.

In tracing libraries: user-defined rules

ingestion_reason: rule

For more granular control, use tracing library sampling configuration options:

  • Set a specific sampling rate to apply to all root services for the library, overriding the Agent’s default mechanism.
  • Set a sampling rate to apply to specific root services or for specific span operation names.
  • Set a rate limit on the number of ingested traces per second. The default rate limit is 100 traces per second per service instance (when using the Agent default mechanism, the rate limiter is ignored).

Note: These rules are also head-based sampling controls. If the traffic for a service is higher than the configured maximum traces per second, then traces are dropped at the root. It does not create incomplete traces.
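To illustrate how a per-second limit can be enforced at the root of a trace, here is a minimal token-bucket sketch in Python. The class name and the refill strategy are assumptions made for illustration; the page does not describe how the tracing libraries implement their rate limiter.

```python
import time


class TraceRateLimiter:
    """Token bucket that allows at most `limit` kept traces per second."""

    def __init__(self, limit=100.0, clock=time.monotonic):
        self.limit = limit
        self.clock = clock
        self.tokens = limit
        self.last = clock()

    def allow(self):
        """Return True if a new root span may be kept, False if it is dropped."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at one second's worth.
        self.tokens = min(self.limit, self.tokens + (now - self.last) * self.limit)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Calling allow() only when a root span starts means a rejected trace is dropped as a whole, which is why the limit does not produce incomplete traces.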

The configuration can be set by environment variables or directly in the code:

For Java applications, set a global sampling rate in the library using the DD_TRACE_SAMPLE_RATE environment variable. Set by-service sampling rates with the DD_TRACE_SAMPLING_SERVICE_RULES environment variable.

For example, to send 20% of the traces for the service named my-service:

# using system property
java -Ddd.trace.sampling.service.rules=my-service:0.2 -javaagent:dd-java-agent.jar -jar my-app.jar

# using environment variables
export DD_TRACE_SAMPLING_SERVICE_RULES=my-service:0.2

Configure a rate limit by setting the environment variable DD_TRACE_RATE_LIMIT to a number of traces per second per service instance. If no DD_TRACE_RATE_LIMIT value is set, a limit of 100 traces per second is applied.

Read more about sampling controls in the Java tracing library documentation.

For Python applications, set a global sampling rate in the library using the DD_TRACE_SAMPLE_RATE environment variable. Set by-service sampling rates with the DD_TRACE_SAMPLING_RULES environment variable.

For example, to send 50% of the traces for the service named my-service and 10% for the rest of the traces:

@env DD_TRACE_SAMPLE_RATE=0.1
@env DD_TRACE_SAMPLING_RULES=[{"service": "my-service", "sample_rate": 0.5}]
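Because the rules value must be valid JSON, it can help to generate it rather than write it by hand. The sketch below builds the value with Python's json module and models how a rate might be chosen for a service. The rate_for helper is hypothetical, and exact-match lookup is a simplifying assumption, not a description of the library's matching logic.

```python
import json

global_rate = 0.1
rules = [{"service": "my-service", "sample_rate": 0.5}]

# Value to export as DD_TRACE_SAMPLING_RULES; json.dumps guarantees valid JSON quoting.
rules_env_value = json.dumps(rules)


def rate_for(service, rules, default_rate):
    """Return the rate of the first rule naming this service, else the global rate."""
    for rule in rules:
        if rule.get("service") == service:
            return rule["sample_rate"]
    return default_rate
```

In this model, my-service resolves to 0.5 and every other service falls back to the global rate of 0.1.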

Configure a rate limit by setting the environment variable DD_TRACE_RATE_LIMIT to a number of traces per second per service instance. If no DD_TRACE_RATE_LIMIT value is set, a limit of 100 traces per second is applied.

Read more about sampling controls in the Python tracing library documentation.

For Ruby applications, set a global sampling rate in the library using the DD_TRACE_SAMPLE_RATE environment variable.

You can also configure sampling rates by service. For instance, to send 20% of the traces for the service named my-service:

require 'ddtrace'

Datadog.configure do |c|
  c.tracing.sampler = Datadog::Tracing::Sampling::PrioritySampler.new(
    post_sampler: Datadog::Tracing::Sampling::RuleSampler.new(
      [
        # Sample all 'my-service' traces at 20.00%:
        Datadog::Tracing::Sampling::SimpleRule.new(service: 'my-service', sample_rate: 0.2000)
      ]
    )
  )
end

Configure a rate limit by setting the environment variable DD_TRACE_RATE_LIMIT to a number of traces per second per service instance. If no DD_TRACE_RATE_LIMIT value is set, a limit of 100 traces per second is applied.

Read more about sampling controls in the Ruby tracing library documentation.

For Go applications, set a global sampling rate for the library using the DD_TRACE_SAMPLE_RATE environment variable. Set by-service sampling rates with the DD_TRACE_SAMPLING_RULES environment variable.

For example, to send 50% of the traces for the service named my-service and 10% of the rest of the traces:

@env DD_TRACE_SAMPLE_RATE=0.1
@env DD_TRACE_SAMPLING_RULES=[{"service": "my-service", "sample_rate": 0.5}]

Configure a rate limit by setting the environment variable DD_TRACE_RATE_LIMIT to a number of traces per second per service instance. If no DD_TRACE_RATE_LIMIT value is set, a limit of 100 traces per second is applied.

Read more about sampling controls in the Go tracing library documentation.

For Node.js applications, set a global sampling rate in the library using the DD_TRACE_SAMPLE_RATE environment variable.

You can also set by-service sampling rates. For instance, to send 50% of the traces for the service named my-service and 10% for the rest of the traces:

tracer.init({
  ingestion: {
    sampler: {
      sampleRate: 0.1,
      rules: [
        { sampleRate: 0.5, service: 'my-service' }
      ]
    }
  }
})

Configure a rate limit by setting the environment variable DD_TRACE_RATE_LIMIT to a number of traces per second per service instance. If no DD_TRACE_RATE_LIMIT value is set, a limit of 100 traces per second is applied.

Read more about sampling controls in the Node.js tracing library documentation.

For PHP applications, set a global sampling rate for the library using the DD_TRACE_SAMPLE_RATE environment variable. Set by-service sampling rates with the DD_TRACE_SAMPLING_RULES environment variable.

For example, to send 50% of the traces for the service named my-service and 10% for the rest of the traces:

@env DD_TRACE_SAMPLE_RATE=0.1
@env DD_TRACE_SAMPLING_RULES=[{"service": "my-service", "sample_rate": 0.5}]

Read more about sampling controls in the PHP tracing library documentation.

Starting in version 1.3.2, the Datadog C++ library supports the following configurations:

  • Global sampling rate: DD_TRACE_SAMPLE_RATE environment variable
  • Sampling rates by service: DD_TRACE_SAMPLING_RULES environment variable
  • Rate limit setting: DD_TRACE_RATE_LIMIT environment variable

For example, to send 50% of the traces for the service named my-service and 10% for the rest of the traces:

@env DD_TRACE_SAMPLE_RATE=0.1
@env DD_TRACE_SAMPLING_RULES=[{"service": "my-service", "sample_rate": 0.5}]

C++ does not provide integrations for out-of-the-box instrumentation, but it is used for tracing proxies such as Envoy, Nginx, or Istio. Read more about how to configure sampling for proxies in Tracing proxies.

For .NET applications, set a global sampling rate for the library using the DD_TRACE_SAMPLE_RATE environment variable. Set by-service sampling rates with the DD_TRACE_SAMPLING_RULES environment variable.

For example, to send 50% of the traces for the service named my-service and 10% for the rest of the traces:

@env DD_TRACE_SAMPLE_RATE=0.1
@env DD_TRACE_SAMPLING_RULES=[{"service": "my-service", "sample_rate": 0.5}]

Configure a rate limit by setting the environment variable DD_TRACE_RATE_LIMIT to a number of traces per second per service instance. If no DD_TRACE_RATE_LIMIT value is set, a limit of 100 traces per second is applied.

Read more about sampling controls in the .NET tracing library documentation.

Note: All the spans from a trace sampled using a tracing library configuration are tagged with the ingestion reason rule. Services configured with user-defined sampling rules are marked as Configured in the Ingestion Control Page Configuration column.

Error and rare traces

For traces not caught by head-based sampling, two additional Datadog Agent sampling mechanisms make sure that critical and diverse traces are kept and ingested. These two samplers keep a diverse set of local traces (sets of spans from the same host) by catching all combinations of a predetermined set of tags:

  • Error traces: Sampling errors is important for providing visibility on potential system failures.
  • Rare traces: Sampling rare traces allows you to keep visibility on your system as a whole, by making sure that low-traffic services and resources are still monitored.

Note: Error and rare samplers are ignored for services for which you set library sampling rules.

Error traces

ingestion_reason: error

The error sampler catches pieces of traces that contain error spans that are not caught by head-based sampling. It catches error traces up to a rate of 10 traces per second (per Agent). It ensures comprehensive visibility on errors when the head-based sampling rate is low.

In Agent version 7.33 and later, you can configure the error sampler in the Agent main configuration file (datadog.yaml) or with environment variables:

@param errors_per_second - integer - optional - default: 10
@env DD_APM_ERROR_TPS - integer - optional - default: 10

Note: Set the parameter to 0 to disable the error sampler.

Note: The error sampler captures local traces with error spans at the Agent level. If the trace is distributed, there is no way to guarantee that the complete trace will be sent to Datadog.

Rare traces

ingestion_reason: rare

The rare sampler sends a set of rare spans to Datadog. It catches combinations of env, service, name, resource, error.type, and http.status, up to 5 traces per second (per Agent). It ensures visibility on low-traffic resources when the head-based sampling rate is low.
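One way to picture this is a sampler that keeps a span whenever its tag combination has not been seen recently. The Python sketch below is an illustration under stated assumptions, not the Agent's implementation: the class name and the ttl_seconds value are hypothetical, and the per-Agent cap of 5 traces per second is left out for brevity.

```python
# Tag combination named in the text above.
SIGNATURE_KEYS = ("env", "service", "name", "resource", "error.type", "http.status")


class RareSampler:
    """Keep a span when its tag combination has not been seen recently."""

    def __init__(self, ttl_seconds=300.0):  # hypothetical value, not an Agent default
        self.ttl = ttl_seconds
        self.last_seen = {}

    def sample(self, span, now):
        # Missing tags are represented as None in the signature.
        signature = tuple(span.get(key) for key in SIGNATURE_KEYS)
        previous = self.last_seen.get(signature)
        # Record every sighting so that frequent combinations stay "not rare".
        self.last_seen[signature] = now
        return previous is None or now - previous > self.ttl
```

A combination seen for the first time is kept, while a combination that keeps arriving is skipped until it has been absent for longer than the assumed time-to-live.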

In Agent version 7.33 and later, you can disable the rare sampler in the Agent main configuration file (datadog.yaml) or with an environment variable:

@param apm_config.disable_rare_sampler - boolean - optional - default: false
@env DD_APM_DISABLE_RARE_SAMPLER - boolean - optional - default: false

Note: The rare sampler captures local traces at the Agent level. If the trace is distributed, there is no way to guarantee that the complete trace will be sent to Datadog.

Force keep and drop

ingestion_reason: manual

The head-based sampling mechanism can be overridden at the tracing library level. For example, if you need to monitor a critical transaction, you can force the associated trace to be kept. On the other hand, for unnecessary or repetitive information like health checks, you can force the trace to be dropped.

  • Set ManualKeep on a span to indicate that it and all child spans should be ingested. The resulting trace might appear incomplete in the UI if the span in question is not the root span of the trace.

// in dd-trace-go
span.SetTag(ext.ManualKeep, true)

  • Set ManualDrop on a span to force the trace to be dropped.

// in dd-trace-go
span.SetTag(ext.ManualDrop, true)
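The override can be modeled as a manual decision that takes precedence over the head-based one. The Python sketch below is illustrative only. The numeric priority values and the function names are assumptions of this sketch and are not taken from this page.

```python
# Priority values assumed for this sketch: negative or zero means drop, positive means keep.
USER_REJECT, AUTO_REJECT, AUTO_KEEP, USER_KEEP = -1, 0, 1, 2


def resolve_priority(head_based_keep, manual=None):
    """A manual keep or drop takes precedence over the head-based decision."""
    if manual == "keep":
        return USER_KEEP
    if manual == "drop":
        return USER_REJECT
    return AUTO_KEEP if head_based_keep else AUTO_REJECT


def is_ingested(priority):
    """A trace is sent to Datadog only when its priority is positive."""
    return priority > 0
```

In this model, a health check that head-based sampling would have kept is dropped once it is marked for manual drop, and a critical transaction is kept even when the sampling rate would have discarded it.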

Single spans (App Analytics)

ingestion_reason: analytic

On October 20, 2020, App Analytics was replaced by Tracing without Limits. This is a deprecated mechanism with configuration information relevant to legacy App Analytics. Instead, use the head-based sampling configuration options to have full control over your data ingestion.

If you need to sample a specific span, but don’t need the full trace to be available, tracers allow a sampling rate to be configured for a single span. This span will be ingested at no less than the configured rate, even when the enclosing trace is dropped.

In the tracing libraries

To use the analytics mechanism, enable it either by an environment variable or in the code. Also, define a sampling rate to be applied to all analytics_enabled spans:

@env DD_TRACE_ANALYTICS_ENABLED - boolean - optional - default: false

// in dd-trace-go
// enable analytics by default and set the raw sampling rate
// applied to all analytics_enabled spans
tracer.Start(
    tracer.WithAnalytics(true),
    tracer.WithAnalyticsRate(0.4),
)

Tag any single span with analytics_enabled:true. In addition, specify a sampling rate to be associated with the span:

// in dd-trace-go
// make a span analytics_enabled
span.SetTag(ext.AnalyticsEvent, true)
// make a span analytics_enabled with a rate of 0.5
s := tracer.StartSpan("redis.cmd", tracer.AnalyticsRate(0.5))

In the Agent

In the Agent, an additional rate limiter is set to 200 spans per second. If the limit is reached, some spans are dropped and not forwarded to Datadog.

Set the rate in the Agent main configuration file (datadog.yaml) or as an environment variable:

@param max_events_per_second - integer - optional - default: 200
@env DD_APM_MAX_EPS - integer - optional - default: 200

Product ingested spans

RUM Traces

ingestion_reason: rum

A request from a web or mobile application generates a trace when the backend services are instrumented. The APM integration with Real User Monitoring links web and mobile application requests to their corresponding backend traces so you can see your full frontend and backend data through one lens.

Starting in version 4.10.0 of the RUM browser SDK, you can control ingested volumes and keep a sample of the backend traces by configuring the tracingSampleRate initialization parameter. Set tracingSampleRate to a number between 0 and 100. If no tracingSampleRate value is set, 100% of the traces coming from browser requests are sent to Datadog by default.

Similarly, control the trace sampling rate in other SDKs by using similar parameters:

| SDK | Parameter | Minimum version |
|---|---|---|
| Browser | tracingSampleRate | v4.10.0 |
| iOS | tracingSamplingRate | 1.11.0 |
| Android | traceSamplingRate | 1.13.0 |
| Flutter | tracingSamplingRate | 1.0.0-beta.2 |
| React Native | tracingSamplingRate | 1.0.0-rc6 |

Synthetic traces

ingestion_reason: synthetics and ingestion_reason: synthetics-browser

HTTP and browser tests generate traces when the backend services are instrumented. The APM integration with Synthetic Testing links your synthetic tests with the corresponding backend traces. Navigate from a test run that failed to the root cause of the issue by looking at the trace generated by that test run.

By default, 100% of synthetic HTTP and browser tests generate backend traces.

Other products

Some additional ingestion reasons are attributed to spans that are generated by specific Datadog products:

| Product | Ingestion Reason | Ingestion Mechanism Description |
|---|---|---|
| Serverless | lambda and xray | Traces received from serverless applications traced with Datadog tracing libraries or the AWS X-Ray integration. |
| Application Security Monitoring | appsec | Traces ingested from Datadog tracing libraries and flagged by ASM as a threat. |
