Tracing Application Metrics

Overview

Tracing application metrics are collected once you enable trace collection and instrument your application. They are available for dashboards and monitors. Trace metric names follow one of these formats:

  • trace.<SPAN_NAME>.<METRIC_SUFFIX>
  • trace.<SPAN_NAME>.<METRIC_SUFFIX>.<2ND_PRIM_TAG>_service

With the following definitions:

<SPAN_NAME>
The name of the operation or span.name (examples: redis.command, pylons.request, rails.request, mysql.query).
<METRIC_SUFFIX>
The name of the metric (examples: duration, hits, span_count). See the Metric Suffix section below.
<2ND_PRIM_TAG>
If the metric name accounts for the second primary tag, this tag is part of the metric name.
<TAGS>
The tags available on trace metrics. Possible tags are: env, service, version, resource, sublayer_type, sublayer_service, http.status_code, http.status_class, and Datadog Agent tags (including the host and second primary tag). Note: Tags set on spans are not included and are not available as tags on your trace metrics.
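
To make the naming scheme concrete, here is a minimal sketch that assembles a metric name from its parts, following the two formats listed above literally. The helper trace_metric_name and the tag name datacenter are hypothetical and used only for illustration; they are not part of any Datadog library.

```python
from typing import Optional


def trace_metric_name(span_name: str, suffix: str,
                      second_primary_tag: Optional[str] = None) -> str:
    """Build a trace metric name from its parts (hypothetical helper)."""
    if not span_name or not suffix:
        raise ValueError("span_name and suffix are required")
    # First format: trace.SPAN_NAME.METRIC_SUFFIX
    name = f"trace.{span_name}.{suffix}"
    if second_primary_tag:
        # Second format: the second primary tag is embedded in the name.
        name += f".{second_primary_tag}_service"
    return name


if __name__ == "__main__":
    print(trace_metric_name("rails.request", "hits"))
    # "datacenter" is an assumed example of a second primary tag.
    print(trace_metric_name("redis.command", "duration", "datacenter"))
```

Span names may themselves contain dots (for example redis.command), so building names from known parts is more reliable than splitting a full metric name on dots.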

Metric Suffix

Hits

trace.<SPAN_NAME>.hits
Prerequisite: This metric exists for any APM service.
Description: Represents the count of hits for a given span.
Metric type: COUNT.
Tags: env, service, version, resource, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.hits.by_http_status
Prerequisite: This metric exists for HTTP/WEB APM services if http metadata exists.
Description: Represents the count of hits for a given span, broken down by HTTP status code.
Metric type: COUNT.
Tags: env, service, version, resource, http.status_class, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.

Percentile aggregation

trace.<SPAN_NAME>.duration.by.resource_<2ND_PRIM_TAG>_service.<PERCENTILE_AGGREGATION>
Prerequisite: This metric exists for any APM service.
Description: Measures the total time spent processing for each combination of resource, service, and second primary tag.
Metric type: GAUGE.
Percentile Aggregations: 50p, 75p, 90p, 95p, 99p, 100p
Tags: env, service, resource, and the second primary tag.
trace.<SPAN_NAME>.duration.by.resource_service.<PERCENTILE_AGGREGATION>
Prerequisite: This metric exists for any APM service.
Description: Measures the total time spent processing for each combination of resource and service.
Metric type: GAUGE.
Percentile Aggregations: 50p, 75p, 90p, 95p, 99p, 100p
Tags: env, service, and resource.
trace.<SPAN_NAME>.duration.by.<2ND_PRIM_TAG>_service.<PERCENTILE_AGGREGATION>
Prerequisite: This metric exists for any APM service.
Description: Measures the total time spent processing for each combination of second primary tag and service.
Metric type: GAUGE.
Percentile Aggregations: 50p, 75p, 90p, 95p, 99p, 100p
Tags: env, service, and the second primary tag.
trace.<SPAN_NAME>.duration.by.service.<PERCENTILE_AGGREGATION>
Prerequisite: This metric exists for any APM service.
Description: Represents the duration of an individual span. It is used to track latency and answer questions such as “what’s the median wait time a user experienced?” or “how long do the slowest 1% of users have to wait?”
Metric type: GAUGE.
Percentile Aggregations: 50p, 75p, 90p, 95p, 99p, 100p
Tags: env and service.
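
The sketch below lists the percentile metric names for one span and shows what a value such as 50p or 99p means on a small sample. The helper names are hypothetical, and the nearest-rank method is an assumption chosen for illustration; this page does not say how the percentiles are actually computed.

```python
import math
from typing import List, Sequence

# Percentile aggregations listed in this section.
PERCENTILES = ("50p", "75p", "90p", "95p", "99p", "100p")


def percentile_metric_names(span_name: str, grouping: str = "service") -> List[str]:
    """Return one metric name per percentile aggregation (hypothetical helper)."""
    return [f"trace.{span_name}.duration.by.{grouping}.{p}" for p in PERCENTILES]


def nearest_rank(durations: Sequence[float], percentile: int) -> float:
    """Nearest-rank percentile of a non-empty sample (illustration only)."""
    if not durations:
        raise ValueError("durations must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(durations)
    # Smallest rank whose share of the sample reaches the percentile.
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[rank - 1]


if __name__ == "__main__":
    for name in percentile_metric_names("mysql.query"):
        print(name)
    sample = [0.1, 0.2, 0.3, 0.4, 1.5]  # span durations in seconds
    print(nearest_rank(sample, 50), nearest_rank(sample, 99))
```

In this sample the median is unaffected by the single slow span, while the 99p value reflects it, which is why the higher percentiles are the ones to watch for tail latency.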

Errors

trace.<SPAN_NAME>.errors
Prerequisite: This metric exists for any APM service.
Description: Represents the count of errors for a given span.
Metric type: COUNT.
Tags: env, service, version, resource, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.errors.by_http_status
Prerequisite: This metric exists for any APM service.
Description: Represents the count of errors for a given span, broken down by HTTP status code.
Metric type: COUNT.
Tags: env, service, version, resource, http.status_class, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
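
A common use of the errors and hits counts together is an error rate. The following is a minimal sketch, assuming both counts were read for the same span name, tag scope, and time window; error_rate is a hypothetical helper, not part of any Datadog library.

```python
def error_rate(errors: float, hits: float) -> float:
    """Fraction of hits that ended in an error, in the range 0.0 to 1.0."""
    if errors < 0 or hits < 0:
        raise ValueError("counts must be non-negative")
    if hits == 0:
        # No traffic in the window: report 0.0 rather than dividing by zero.
        return 0.0
    return errors / hits


if __name__ == "__main__":
    # Assumed example values: 5 errors out of 200 hits in the same window.
    print(f"{error_rate(5, 200):.1%}")
```

Returning 0.0 for an empty window is a design choice for this sketch; a monitor might prefer to treat "no data" separately from "no errors".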

Span count

Note: This is a deprecated namespace.

trace.<SPAN_NAME>.span_count
Prerequisite: This metric exists for any APM service.
Description: Represents the number of spans collected over a given interval.
Metric type: COUNT.
Tags: env, service, resource, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.span_count.by_http_status
Prerequisite: This metric exists for HTTP/WEB APM services if http metadata exists.
Description: Represents the number of spans collected over a given interval, broken down by HTTP status.
Metric type: COUNT.
Tags: env, service, resource, http.status_class, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.

Duration

trace.<SPAN_NAME>.duration
Prerequisite: This metric exists for any APM service.
Description: Measures the total time for a collection of spans. Specifically, it is the total time spent by all spans over an interval, including time spent waiting on child processes.
Metric type: GAUGE.
Tags: env, service, resource, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
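
Taking the description above at face value (a total of span time per interval), dividing that total by the hit count for the same interval gives an average duration per span. This is a sketch under that assumption, with a hypothetical helper and made-up sample values; it is not a statement of how any dashboard computes latency.

```python
from typing import List, Optional, Sequence


def average_span_duration(duration_totals: Sequence[float],
                          hit_counts: Sequence[float]) -> List[Optional[float]]:
    """Per-interval average duration: total span time divided by hits.

    Both sequences are assumed to be aligned on the same intervals.
    Intervals with no hits yield None instead of a division by zero.
    """
    if len(duration_totals) != len(hit_counts):
        raise ValueError("series must have the same number of intervals")
    return [
        total / hits if hits > 0 else None
        for total, hits in zip(duration_totals, hit_counts)
    ]


if __name__ == "__main__":
    totals = [12.0, 0.0, 3.0]  # seconds of span time per interval (assumed)
    hits = [48, 0, 6]          # hit counts for the same intervals (assumed)
    print(average_span_duration(totals, hits))
```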

Duration by

trace.<SPAN_NAME>.duration.by_http_status
Prerequisite: This metric exists for HTTP/WEB APM services if http metadata exists.
Description: Measures the total time for a collection of spans for each HTTP status. Specifically, it is the relative share of time spent by all spans over an interval for a given HTTP status, including time spent waiting on child processes.
Metric type: GAUGE.
Tags: env, service, resource, http.status_class, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.duration.by_service
Prerequisite: This metric exists for any APM service.
Description: Measures the total time spent actually processing for each service (that is, it excludes time spent waiting on child processes).
Metric type: GAUGE.
Tags: env, service, resource, sublayer_service, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.duration.by_type
Prerequisite: This metric exists for any APM service.
Description: Measures the total time spent actually processing for each service type.
Metric type: GAUGE.
Tags: env, service, resource, sublayer_type, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.duration.by_type.by_http_status
Prerequisite: This metric exists for HTTP/WEB APM services if http metadata exists.
Description: Measures the total time spent actually processing for each service type and HTTP status.
Metric type: GAUGE.
Tags: env, service, resource, sublayer_type, http.status_class, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
trace.<SPAN_NAME>.duration.by_service.by_http_status
Prerequisite: This metric exists for HTTP/WEB APM services if http metadata exists.
Description: Measures the total time spent actually processing for each service and HTTP status.
Metric type: GAUGE.
Tags: env, service, resource, sublayer_service, http.status_class, http.status_code, all host tags from the Datadog Host Agent, and the second primary tag.
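
The difference between total time (including waits on children) and time spent actually processing can be illustrated on a small span tree. The sketch below is an assumption-laden simplification: the Span class is hypothetical, and child spans are assumed to run sequentially without overlapping, so a span's own processing time is its duration minus the durations of its direct children.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical, simplified span used only for this illustration."""
    service: str
    duration: float  # seconds, including time waiting on children
    children: List["Span"] = field(default_factory=list)


def exclusive_time_by_service(root: Span) -> Dict[str, float]:
    """Sum each span's own processing time per service.

    Assumes children do not overlap in time; with concurrent children the
    subtraction below would undercount the parent's own time.
    """
    totals: Dict[str, float] = {}
    stack = [root]
    while stack:
        span = stack.pop()
        own = span.duration - sum(child.duration for child in span.children)
        totals[span.service] = totals.get(span.service, 0.0) + max(own, 0.0)
        stack.extend(span.children)
    return totals


if __name__ == "__main__":
    trace = Span("web", 1.0, [Span("db", 0.25), Span("cache", 0.25)])
    print(exclusive_time_by_service(trace))
```

In this example the web span lasts 1.0 second in total but only 0.5 seconds is attributed to the web service; the rest is attributed to the services it waited on.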

Apdex

trace.<SPAN_NAME>.apdex.by.resource_<2ND_PRIM_TAG>_service
Prerequisite: This metric exists for any HTTP/WEB APM service.
Description: Represents the Apdex score for each combination of resource, second primary tag, and service.
Metric type: GAUGE.
Tags: env, service, resource, and the second primary tag.
trace.<SPAN_NAME>.apdex.by.resource_service
Prerequisite: This metric exists for any HTTP/WEB APM service.
Description: Measures the Apdex score for each combination of resource and web service.
Metric type: GAUGE.
Tags: env, service, and resource.
trace.<SPAN_NAME>.apdex.by.<2ND_PRIM_TAG>_service
Prerequisite: This metric exists for any HTTP/WEB APM service.
Description: Measures the Apdex score for each combination of second primary tag and web service.
Metric type: GAUGE.
Tags: env, service, and the second primary tag.
trace.<SPAN_NAME>.apdex.by.service
Prerequisite: This metric exists for any HTTP/WEB APM service.
Description: Measures the Apdex score for each web service.
Metric type: GAUGE.
Tags: env and service.
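
For readers unfamiliar with Apdex, here is a minimal sketch of the standard Apdex formula: requests at or below a threshold T count as satisfied, requests up to 4T count as tolerating (half weight), and slower requests count as frustrated. The function name and sample values are hypothetical, and this page does not specify how errored requests or the threshold are handled, so the sketch ignores errors.

```python
from typing import Sequence


def apdex(durations: Sequence[float], threshold: float) -> float:
    """Standard Apdex score in [0, 1] for request durations in seconds."""
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if not durations:
        raise ValueError("durations must not be empty")
    satisfied = sum(1 for d in durations if d <= threshold)
    tolerating = sum(1 for d in durations if threshold < d <= 4 * threshold)
    # Frustrated requests (slower than 4 * threshold) contribute nothing.
    return (satisfied + tolerating / 2) / len(durations)


if __name__ == "__main__":
    # Assumed sample: two fast requests, one tolerable, one too slow.
    print(apdex([0.1, 0.2, 0.6, 3.0], threshold=0.5))
```

A score of 1.0 means every request was satisfied, and a score of 0.0 means every request was frustrated.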
