Custom Metrics

Overview

If a metric is not submitted from one of the 350+ Datadog integrations, it’s considered a custom metric. Note: Some standard integrations also emit custom metrics.

A custom metric refers to a unique combination of metric name, host, and tag values. In general, any metric you send using StatsD, DogStatsD, or through extensions made to the Datadog Agent is a custom metric.
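As a rough illustration of what such a submission looks like on the wire, the sketch below formats a DogStatsD datagram in the `name:value|type|#tags` layout. The helper `format_datagram` is hypothetical and not part of any Datadog library; in practice a client library builds and sends these datagrams for you.

```python
# Hypothetical helper: builds the text of a DogStatsD datagram.
# It only formats a string; it does not send anything over the network.
METRIC_TYPES = {"count": "c", "gauge": "g", "histogram": "h", "distribution": "d"}

def format_datagram(name, value, metric_type, tags=()):
    """Return a datagram such as 'auth.exceptionCount:1|c|#method:X'."""
    datagram = f"{name}:{value}|{METRIC_TYPES[metric_type]}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram

print(format_datagram("auth.exceptionCount", 1, "count", ["method:X", "exception:A"]))
# auth.exceptionCount:1|c|#method:X,exception:A
```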

Allocation

You are allocated a certain number of custom metrics based on your Datadog pricing plan:

  • Pro: 100 custom metrics per host
  • Enterprise: 200 custom metrics per host

These allocations are counted across your entire infrastructure. For example, if you are on the Pro plan and licensed for three hosts, 300 custom metrics are allocated. The 300 custom metrics can be divided equally across each host, or all 300 metrics could be used by a single host.

Using this example, the graphic below shows scenarios that do not exceed the allocated custom metric count:

There are no enforced fixed rate limits on custom metric submission. If your default allotment is exceeded, you are billed according to Datadog’s billing policy for custom metrics.
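The pooled allocation described above can be sketched as a small calculation. The function names are hypothetical, and the last scenario (350 metrics in total) is an invented example of exceeding the allotment, not one taken from this page.

```python
# Per-host allocations taken from the plan list above.
ALLOCATION_PER_HOST = {"pro": 100, "enterprise": 200}

def allocated_custom_metrics(plan, licensed_hosts):
    """Account-wide allocation: the per-host figure times the licensed hosts."""
    return ALLOCATION_PER_HOST[plan] * licensed_hosts

def within_allocation(plan, licensed_hosts, metrics_per_host):
    """True if the total across all hosts fits the pooled allocation."""
    return sum(metrics_per_host) <= allocated_custom_metrics(plan, licensed_hosts)

print(allocated_custom_metrics("pro", 3))            # 300
print(within_allocation("pro", 3, [100, 100, 100]))  # True: split equally
print(within_allocation("pro", 3, [300, 0, 0]))      # True: all on one host
print(within_allocation("pro", 3, [200, 150, 0]))    # False: 350 exceeds 300
```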

Tracking

Administrative users can see the total custom metrics per hour and the top 500 custom metrics for their account on the usage details page. See the Usage Details documentation for more information.

Counting

Using tags on custom metrics can change the total number of unique tag combinations, which ultimately changes the total count of custom metrics created. The examples below show how custom metrics are counted.

Single host

This example assumes you are submitting a COUNT metric (auth.exceptionCount) on a single host.

  • Your code instrumentation submits the following possible tags associated with the metric: method:X, method:Y, exception:A, exception:B.
  • The logic behind your metric tagging is:

  • In this situation, you have 6 different metrics. The unique metrics for a single host are:

| Metric | Tags |
|---|---|
| auth.exceptionCount | method:X |
| auth.exceptionCount | method:Y |
| auth.exceptionCount | method:X, exception:A |
| auth.exceptionCount | method:X, exception:B |
| auth.exceptionCount | method:Y, exception:A |
| auth.exceptionCount | method:Y, exception:B |

Note: The ordering of tags does not affect the metric count. The following metrics are not unique:

  • auth.exceptionCount with tags method:X and exception:A
  • auth.exceptionCount with tags exception:A and method:X
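One way to picture this counting rule is to treat each timeseries as a metric name, a host, and an unordered set of tags. The sketch below is only a model of the rule, not Datadog's implementation; `timeseries_key` and the host name `host1` are made up for illustration.

```python
def timeseries_key(metric, host, tags):
    """Identity of a custom metric; frozenset makes tag order irrelevant."""
    return (metric, host, frozenset(tags))

# The six tag combinations from the example, plus one reordered duplicate.
submissions = [
    ("auth.exceptionCount", "host1", ["method:X"]),
    ("auth.exceptionCount", "host1", ["method:Y"]),
    ("auth.exceptionCount", "host1", ["method:X", "exception:A"]),
    ("auth.exceptionCount", "host1", ["method:X", "exception:B"]),
    ("auth.exceptionCount", "host1", ["method:Y", "exception:A"]),
    ("auth.exceptionCount", "host1", ["method:Y", "exception:B"]),
    ("auth.exceptionCount", "host1", ["exception:A", "method:X"]),  # same tags, reordered
]

unique = {timeseries_key(*submission) for submission in submissions}
print(len(unique))  # 6
```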

Multiple hosts

A custom metric is identified by a unique combination of metric name, host, and tag values. Therefore, reporting a custom metric from multiple hosts results in multiple unique tag value combinations.

For example, you create the metric service.request.count to gain insight from different services across your infrastructure.

  • To track successes and failures, you create the tag status with two tag values:
    • status:success
    • status:failure
  • You track the metric by each service running on your infrastructure:
    • service:database
    • service:api
    • service:webserver
  • The logic behind your metric is:

For this example, only a subset of services and statuses are reporting. You have three hosts:

  • host1 is reporting all possible configurations.
  • host2 is reporting only successes across all services.
  • host3 is reporting successes and failures for database and webserver services.

Across your three hosts, there are 13 distinct metrics (timeseries):

Note: If all services report both statuses, you have 1 x 2 x 3 = 6 custom metrics per host.
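The count of 13 can be reproduced by enumerating what each host reports. This is a minimal sketch of the arithmetic; the variable names are illustrative only.

```python
from itertools import product

SERVICES = ["database", "api", "webserver"]
STATUSES = ["success", "failure"]

# (service, status) pairs reported by each host, following the three bullets above.
reporting = {
    "host1": list(product(SERVICES, STATUSES)),                   # all configurations
    "host2": list(product(SERVICES, ["success"])),                # successes only
    "host3": list(product(["database", "webserver"], STATUSES)),  # two services
}

# Each distinct (name, host, tag values) combination is one timeseries.
timeseries = {
    ("service.request.count", host, f"service:{service}", f"status:{status}")
    for host, pairs in reporting.items()
    for service, status in pairs
}

print({host: len(pairs) for host, pairs in reporting.items()})
# {'host1': 6, 'host2': 3, 'host3': 4}
print(len(timeseries))  # 13
```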

Metric summary

The Metric Summary page displays the number of distinct metrics, which is equivalent to the number of distinct timeseries associated with a given metric name, host, and tag values. For example, service.request.count with 1 host, 2 statuses, and 3 services = 6 distinct metrics:

Adding a second host with 3 services and 1 status = 9 distinct metrics:

Adding a third host with 2 services and 2 statuses = 13 distinct metrics:

Query editor

You can count your custom metrics by using the count: aggregator in the query editor. To count the metrics from the previous example, use the query count:service.request.count{*}:

Gauges, counts, histograms, and rates

Suppose you are interested in measuring the average temperature in the state of Florida. temperature is stored as a GAUGE metric type in Datadog. You collect the following temperature measurements every 10 seconds during the past minute from Orlando, Miami, Boston, New York, and Seattle, each tagged with information about the city, state, region, and country.

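| Tags | 0-10 s | 10-20 s | 20-30 s | 30-40 s | 40-50 s | 50-60 s |
|---|---|---|---|---|---|---|
| Orlando, FL, Southeast, USA | 80 | 80 | 80 | 80 | 81 | 81 |
| Miami, FL, Southeast, USA | 82 | 82 | 82 | 82 | 82 | 82 |
| Boston, MA, Northeast, USA | 78 | 78 | 78 | 78 | 78 | 79 |
| New York, NY, Northeast, USA | 79 | 79 | 79 | 79 | 79 | 79 |
| Seattle, WA, Northwest, USA | 75 | 75 | 75 | 75 | 75 | 75 |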

The total number of custom metrics associated with the temperature metric is five. Each unique tag combination of city, state, region, and country represents a custom metric:

| Metric | Tags |
|---|---|
| temperature | city:orlando, state:fl, region:southeast, country:usa |
| temperature | city:miami, state:fl, region:southeast, country:usa |
| temperature | city:boston, state:ma, region:northeast, country:usa |
| temperature | city:new_york, state:ny, region:northeast, country:usa |
| temperature | city:seattle, state:wa, region:northwest, country:usa |

Using the five timeseries above, you can determine the average temperature in the US, Northeast, or Florida at query time.
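To make the query-time aggregation concrete, the sketch below averages the stored timeseries per interval for whichever tag value is requested. It models the idea only; `average_by_tag` is a hypothetical helper, not a Datadog query function.

```python
# Readings per 10-second interval, copied from the measurements table above.
SERIES = [
    ({"city": "orlando", "state": "fl", "region": "southeast", "country": "usa"},
     [80, 80, 80, 80, 81, 81]),
    ({"city": "miami", "state": "fl", "region": "southeast", "country": "usa"},
     [82, 82, 82, 82, 82, 82]),
    ({"city": "boston", "state": "ma", "region": "northeast", "country": "usa"},
     [78, 78, 78, 78, 78, 79]),
    ({"city": "new_york", "state": "ny", "region": "northeast", "country": "usa"},
     [79, 79, 79, 79, 79, 79]),
    ({"city": "seattle", "state": "wa", "region": "northwest", "country": "usa"},
     [75, 75, 75, 75, 75, 75]),
]

def average_by_tag(key, value):
    """Average, per interval, every timeseries whose tag `key` equals `value`."""
    matching = [readings for tags, readings in SERIES if tags[key] == value]
    return [sum(column) / len(column) for column in zip(*matching)]

print(average_by_tag("state", "fl"))  # [81.0, 81.0, 81.0, 81.0, 81.5, 81.5]
```

The same call with `("region", "northeast")` or `("country", "usa")` answers the other two questions from the same five stored timeseries.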

Note: The same scheme for counting custom metrics is applied to COUNT, HISTOGRAM, and RATE metric types.

Dropping tags

Suppose you drop the country tag from the temperature metric. There are still five unique tag value combinations of city, state, and region, so the total number of custom metrics emitted from the temperature metric is five:

| Metric | Tags |
|---|---|
| temperature | city:orlando, state:fl, region:southeast |
| temperature | city:miami, state:fl, region:southeast |
| temperature | city:boston, state:ma, region:northeast |
| temperature | city:new_york, state:ny, region:northeast |
| temperature | city:seattle, state:wa, region:northwest |

Suppose you drop the city tag from the temperature metric. This consolidates the data from Orlando and Miami:

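| Tags | 0-10 s | 10-20 s | 20-30 s | 30-40 s | 40-50 s | 50-60 s |
|---|---|---|---|---|---|---|
| FL, Southeast | 81 | 81 | 81 | 81 | 81.5 | 81.5 |
| MA, Northeast | 78 | 78 | 78 | 78 | 78 | 79 |
| NY, Northeast | 79 | 79 | 79 | 79 | 79 | 79 |
| WA, Northwest | 75 | 75 | 75 | 75 | 75 | 75 |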

Now there are four unique tag value combinations that appear in the temperature data. Therefore, the total number of custom metrics is four:

| Metric | Tags |
|---|---|
| temperature | state:fl, region:southeast |
| temperature | state:ma, region:northeast |
| temperature | state:ny, region:northeast |
| temperature | state:wa, region:northwest |
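The effect of dropping tags can be checked by counting the distinct tag value combinations that remain. A minimal sketch, with `count_custom_metrics` as a hypothetical helper:

```python
TAG_SETS = [
    {"city": "orlando", "state": "fl", "region": "southeast", "country": "usa"},
    {"city": "miami", "state": "fl", "region": "southeast", "country": "usa"},
    {"city": "boston", "state": "ma", "region": "northeast", "country": "usa"},
    {"city": "new_york", "state": "ny", "region": "northeast", "country": "usa"},
    {"city": "seattle", "state": "wa", "region": "northwest", "country": "usa"},
]

def count_custom_metrics(tag_sets, kept_keys):
    """Count distinct tag value combinations after keeping only `kept_keys`."""
    return len({tuple(tags[key] for key in kept_keys) for tags in tag_sets})

print(count_custom_metrics(TAG_SETS, ["city", "state", "region", "country"]))  # 5
print(count_custom_metrics(TAG_SETS, ["city", "state", "region"]))             # 5
print(count_custom_metrics(TAG_SETS, ["state", "region"]))                     # 4
```

Dropping country changes nothing because every row shares the same value, while dropping city merges Orlando and Miami into a single combination.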

Distributions

A distribution metric gathers all values across all hosts emitting metric values in ten-second flush intervals. Distributions emit a number of custom metrics that is proportional to the number of custom metrics emitted from GAUGE. Distributions generate five timeseries for each unique tag value combination that appears in the data: sum, count, min, max, and avg (calculated from the sum/count).

Suppose you are interested in measuring the maximum age in the state of New York. age is submitted as a distribution metric tagged with city and state:

| Tags | Values in 10s flush interval | Sum | Count | Minimum | Maximum | Average |
|---|---|---|---|---|---|---|
| Rochester, NY | 23,29,33,55,41,36,12,67 | 296 | 8 | 12 | 67 | 37 |
| New York, NY | 18,22,26,31,29,40,23,35 | 224 | 8 | 18 | 40 | 28 |

The total number of custom metrics or timeseries emitted from the age distribution metric is ten (5 x 2). For both unique tag value combinations above (Rochester, NY and New York, NY), Datadog stores five timeseries (sum, count, min, max, avg).

To obtain the maximum age in the state of New York, reaggregate the timeseries above: Maximum age in New York = max(max(Rochester, NY), max(New York, NY)) = 67.
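The five aggregations and the reaggregated maximum can be sketched as follows. This models the description above with plain Python; `summarize` is a hypothetical helper and says nothing about how Datadog stores the data internally.

```python
# Values received in one flush interval, keyed by (city, state).
AGES = {
    ("rochester", "ny"): [23, 29, 33, 55, 41, 36, 12, 67],
    ("new_york", "ny"): [18, 22, 26, 31, 29, 40, 23, 35],
}

def summarize(values):
    """The five aggregations kept for each tag value combination."""
    return {
        "sum": sum(values),
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": sum(values) / len(values),  # derived from sum / count
    }

summaries = {tags: summarize(values) for tags, values in AGES.items()}

# Five timeseries for each of the two tag value combinations.
print(sum(len(summary) for summary in summaries.values()))  # 10

# Maxima can be reaggregated: the state-wide max is the max of the per-city maxima.
print(max(summary["max"] for summary in summaries.values()))  # 67
```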

Distributions with percentile aggregations

After submitting a distribution metric to Datadog, you have the option to add percentile aggregations to a distribution with the Distributions UI. Distributions with percentile aggregations are counted differently compared to the metric types listed above since percentiles are not mathematically reaggregatable.

Suppose you are interested in measuring the median age in the state of New York where the age distribution metric is tagged with city and state.

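| Tags | Values in 10s flush interval | Sum | Count | Min | Max | Avg | p50 | p75 | p90 | p95 | p99 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Rochester, NY | 23,33,55,41,36,12,66 | 266 | 7 | 12 | 66 | 38 | 36 | 55 | 66 | 66 | 66 |
| New York, NY | 18,26,31,29,40,23,36 | 203 | 7 | 18 | 40 | 29 | 29 | 36 | 40 | 40 | 40 |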

Percentiles are NOT reaggregatable: you can’t combine them the same way the maximum ages were combined above. The median age in New York is not equal to median(median(Rochester, NY), median(New York, NY)).
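A quick check with the values from the table shows the mismatch. This sketch uses Python's standard `statistics.median` purely to illustrate the arithmetic.

```python
from statistics import median

rochester = [23, 33, 55, 41, 36, 12, 66]
new_york = [18, 26, 31, 29, 40, 23, 36]

# Reaggregating the per-city medians (36 and 29).
median_of_medians = median([median(rochester), median(new_york)])

# The median computed over all raw values for the state.
true_median = median(rochester + new_york)

print(median_of_medians)  # 32.5
print(true_median)        # 32.0
```

The two results differ, so the state-level percentile has to be computed from the underlying values rather than from the per-city percentiles.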

Therefore, to provide you with statistically valid percentile aggregations, Datadog needs to precalculate five timeseries (p50, p75, p90, p95, p99) for each potentially queryable tag value combination. In the New York example, you have the following potentially queryable tag value combinations for the city and state tags:

| city tag | state tag |
|---|---|
| Rochester | empty |
| Rochester | NY |
| New York | empty |
| New York | NY |
| empty | NY |
| empty | empty |

There are three potentially queryable values for the city tag: {Rochester, New York, empty} and two values for the state tag: {NY, empty}.
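The six combinations in the table are the Cartesian product of the two value sets, which the sketch below enumerates. The final figure is an inference from the statement above that five percentile timeseries are precalculated per combination; this page does not state that total itself, so treat it as an assumption.

```python
from itertools import product

# None stands for "empty", meaning the tag is not used in the query.
CITY_VALUES = ["Rochester", "New York", None]
STATE_VALUES = ["NY", None]
PERCENTILES = ["p50", "p75", "p90", "p95", "p99"]

combinations = list(product(CITY_VALUES, STATE_VALUES))
for city, state in combinations:
    print(city or "empty", "|", state or "empty")

print(len(combinations))                     # 6
print(len(combinations) * len(PERCENTILES))  # 30 (assumed: five per combination)
```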
