Custom Metrics

Datadog allows you to submit custom metrics in multiple ways to provide a comprehensive view of what is happening in your infrastructure.

This article explains:

  • What a custom metric is, and how you can submit it to Datadog.
  • How many custom metrics are allowed by default.
  • How to check your custom metric count over time.
  • Some best practices for using custom metrics.

How is a custom metric defined?

A custom metric refers to a single, unique combination of a metric name, host, and any tags.

Custom metrics generally refer to any metric that you send using statsd, DogStatsD, or through extensions made to the Datadog Agent. Some integrations can potentially emit an unlimited number of metrics that also count as custom; for further details, see the documentation on which standard integrations emit custom metrics.
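As an illustration of what a submission looks like on the wire, the sketch below builds a counter in the DogStatsD datagram format (name:value|type|#tag,tag) and sends it over UDP. The helper names format_datagram and send_metric are hypothetical, and the code assumes an Agent is listening on localhost port 8125 (the DogStatsD default). In practice you would normally use an official DogStatsD client library rather than hand-written socket code.

```python
import socket


def format_datagram(name, value, metric_type="c", tags=None):
    """Build a DogStatsD datagram: <name>:<value>|<type>[|#<tag>,<tag>...]."""
    datagram = f"{name}:{value}|{metric_type}"
    if tags:
        datagram += "|#" + ",".join(tags)
    return datagram


def send_metric(datagram, host="127.0.0.1", port=8125):
    """Send one datagram over UDP (assumes an Agent listens on host:port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(datagram.encode("utf-8"), (host, port))


if __name__ == "__main__":
    # Counter increment tagged with one method and one exception type.
    print(format_datagram("auth.exceptionCount", 1, tags=["method:X", "exception:A"]))
```

Passing the formatted string to send_metric would submit it to the local Agent. Each distinct set of tags sent this way is what later counts as a separate custom metric.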

To fully leverage Datadog’s scoping and alerting capabilities, you’ll probably use tags. As a consequence, one submitted metric name actually leads to multiple unique tag combinations, each of which counts toward your custom metric count.

For example:

  • You submit the following metric name: auth.exceptionCount
  • Your code instrumentation plans the following tags associated with that metric: method:X, method:Y, exception:A, exception:B.
  • The logic behind your metric is the following:
    custom_metric_1

The unique metrics on a given host are therefore:

  • auth.exceptionCount with tag method:X
  • auth.exceptionCount with tag method:Y
  • auth.exceptionCount with tags method:X and exception:A (unique because of the new tag exception:A)
  • auth.exceptionCount with tags method:X and exception:B
  • auth.exceptionCount with tags method:Y and exception:A
  • auth.exceptionCount with tags method:Y and exception:B

In this situation, you would end up with 6 different metrics.

Note that the ordering of tags does not matter, so the following two are considered the same metric:

  • auth.exceptionCount with tags method:X and exception:A
  • auth.exceptionCount with tags exception:A and method:X
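The counting rule above can be sketched in a few lines: a custom metric is identified by its name, host, and an unordered set of tags. The function name metric_key and the host name "host1" are placeholders chosen for illustration, not part of any Datadog API.

```python
def metric_key(name, host, tags):
    """Identity of a custom metric: name, host, and the unordered set of tags."""
    return (name, host, frozenset(tags))


# Tag combinations from the auth.exceptionCount example, all on one host
# (the host name "host1" is a placeholder).
submissions = [
    ["method:X"],
    ["method:Y"],
    ["method:X", "exception:A"],
    ["method:X", "exception:B"],
    ["method:Y", "exception:A"],
    ["method:Y", "exception:B"],
    ["exception:A", "method:X"],  # same tags as the third entry, different order
]

unique_metrics = {metric_key("auth.exceptionCount", "host1", tags) for tags in submissions}
print(len(unique_metrics))  # 6
```

Using a frozenset for the tags is what makes the reordered submission collapse into an existing entry instead of adding a seventh metric.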

How many custom metrics am I allowed?

Datadog offers two plans: Pro and Enterprise. Pro customers are allotted 100 custom metrics per host, and Enterprise customers are allotted 200 custom metrics per host. These are counted across your entire infrastructure rather than on a per-host basis. For example, if you are on the Pro plan and are licensed for 3 hosts, you have 300 custom metrics by default. These 300 metrics may be divided equally among the hosts, or all 300 could be sent from a single host.

Using the 3-host Pro example above, the following figure shows three scenarios, all of which are acceptable without exceeding the default metric count for three hosts:

Custom_Metrics_300
custom-metrics-1

We do not enforce any fixed rate limit on custom metric submission. If you exceed your default allotment, our teams will reach out to you.
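The allotment arithmetic in this section can be sketched as follows. The per-host figures come from the plans described above. The function names and the per-host metric lists are hypothetical and only illustrate that the total is compared account-wide, not host by host.

```python
# Per-host allotments stated in this article.
ALLOTMENT_PER_HOST = {"pro": 100, "enterprise": 200}


def default_allotment(plan, licensed_hosts):
    """Account-wide custom metric allotment: per-host allotment times host count."""
    return ALLOTMENT_PER_HOST[plan] * licensed_hosts


def within_allotment(plan, licensed_hosts, metrics_per_host):
    """Compare the account-wide total, not each host, against the allotment."""
    return sum(metrics_per_host) <= default_allotment(plan, licensed_hosts)


print(default_allotment("pro", 3))                  # 300
print(within_allotment("pro", 3, [100, 100, 100]))  # True
print(within_allotment("pro", 3, [300, 0, 0]))      # True
print(within_allotment("pro", 3, [200, 150, 0]))    # False
```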

How do I check my custom metrics count?

When you create a custom metric, all the host tags are automatically added to that metric as one unique tag combination, to which you then add the tags linked to the metric itself. The metric-level tags matter most, because they are what adds to the actual metric count.

Let’s say you want insight into the request count of different services across your infrastructure.

  • You create your metric service.request.count
  • You want to separate the requests that were successful from the failures. You create two tags to that effect:
    • status:success
    • status:failure
  • You want this metric to be reported by each service running on your infrastructure. Let’s say you have 3 services per host:
    • service:database
    • service:api
    • service:webserver

The logic behind your metric is the following:

logic_metric

From there, you can see that on each host reporting this metric, if all services report both successes and failures, you can have up to 1 metric name × 2 statuses × 3 services = 6 custom metrics.

Let’s say you have 3 hosts:

  • host1 is reporting all possible configurations
  • host2 is reporting only successes across all services
  • host3 is reporting success and failures, but only for database and webserver services

Across your 3 hosts, you’d have 13 distinct metrics. Here is why:

metric_count
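The count of 13 can be reproduced with a short sketch that enumerates what each host reports: 6 combinations on host1, 3 on host2, and 4 on host3. The variable names are illustrative. The metric name, tags, and host behavior are taken from the example above.

```python
from itertools import product

METRIC = "service.request.count"
STATUSES = ["success", "failure"]
SERVICES = ["database", "api", "webserver"]

# Which status/service combinations each host reports, per the example.
reported = {
    "host1": list(product(STATUSES, SERVICES)),                   # everything
    "host2": list(product(["success"], SERVICES)),                # successes only
    "host3": list(product(STATUSES, ["database", "webserver"])),  # no api service
}

# One custom metric per unique (name, host, tag set) combination.
distinct = {
    (METRIC, host, frozenset({f"status:{status}", f"service:{service}"}))
    for host, combos in reported.items()
    for status, service in combos
}

per_host = {host: len(combos) for host, combos in reported.items()}
print(per_host)       # {'host1': 6, 'host2': 3, 'host3': 4}
print(len(distinct))  # 13
```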

If you are an administrator, you can see your total custom metrics per hour, as well as the top 500 custom metrics by cardinality in your account, on the usage details page. You can also see this metric count on your metric summary page: click the service.request.count metric to see its exact number of unique tag combinations.

If only the first host from the example above were reporting, you’d have this:

metric_summary

Adding the second host:

metric_summary_2

Adding the third host as per the table above, you get your 13 distinct metrics:

metric_summary_3

Using the query editor, you can also find this number with the count: aggregator:

metric_aggregator

Ultimately, the following query returns 13 metrics: count:service.request.count{*}

count_of_metrics

Custom metrics best practices

  • For querying purposes, we encourage you to apply no more than 1,000 tags per metric. Going over this amount slows down the graphs in your dashboards because of the increase in cardinality.
  • You can check the number of “distinct metrics” on the metric summary page (click a metric name to see the number of distinct metrics associated with it). If you need a higher custom metric limit, email us.
  • For additional information, see the documentation about billing and custom metrics.