Generate Metrics from Processes
Incident Management is now generally available! Incident Management is now generally available!

Generate Metrics from Processes

Overview

While Live Processes data is stored for 36 hours, you can generate global and percentile distribution metrics from your processes to monitor your resource consumption long-term. Process-based metrics are stored for 15 months like any other Datadog metric, making it easy to:

  • Debug past and ongoing infrastructure issues
  • Identify trends in the resource consumption of your critical workloads
  • Assess the health of your system before and after load or stress tests
  • Track the effect of software deployments on the health of your underlying hosts or containers

Generating process-based metrics is currently in private beta.

Generate a process-based metric

You can generate a new process-based metric directly from queries in the Live Processes page, or in the Distribution Metrics tab, by selecting the Create Metric button.

Add a new process-based metric

  1. Select tags to filter your query: The query syntax is the same as for Live Processes. Only processes matching the scope of your filters are considered for aggregation. Textsearch filters are currently supported only in the Live Processes page.
  2. Select the measure you would like to track: Enter a measure (e.g., Total CPU %) to aggregate a numeric value and create its corresponding count, min, max, sum, and avg aggregated metrics.
  3. Add tags to group by: Select tags to be added as dimensions to your metrics, so they can be filtered, aggregated, and compared. By default, metrics generated from processes will not have any tags unless explicitly added. Any tag available for Live Processes queries can be used in this field. Process-based metrics are considered custom metrics. Avoid grouping by unbounded or extremely high cardinality tags like command and user to avert impacting your billing.
  4. Name your metric: Fill in the name of your metric. Process-based metrics always have the prefix proc. and suffix [measure_selection].
  5. Add percentile aggregations: Select the Include percentile aggregations checkbox to generate p50, p75, p90, p95, and p99 percentiles. Percentile metrics are also considered customer metrics, and billed accordingly.

You can create multiple metrics using the same query by selecting the Create Another checkbox at the bottom of the metric creation modal. When selected, the modal will remain open after your metric has been created, with the filters and aggregation groups already filled in.

Note: Data points for process-based metrics are generated at ten second intervals. There may be up to a 3-minute delay from the moment the metric is created or updated, to the moment the first data point is reported.

Update a process-based metric

After a metric is created, the following fields can be updated:

  • Filter query: Add or remove tags from the ‘Filter by’ field to change the set of matching processes for which metrics are generated.
  • Aggregation groups: Add or remove tags from the ‘Group by’ field to break down your metrics in different ways, or manage their cardinality.
  • Percentile selection: Check or uncheck the ‘Include percentile aggregations’ box to remove or generate percentile metrics.

To change the metric type or name, a new metric must be created.

Leverage process metrics across the Datadog platform

Once created, you can use process distribution aggregate and percentile metrics like any other in Datadog. For instance:

  • Graph process-based metrics in dashboards and notebooks to track the historical resource consumption of important workloads
  • Create threshold or anomaly-based monitors on top of process-based metrics to detect when CPU or RSS memory dips or spikes unexpectedly
  • Use Metric Correlations to contextualize changes in resource consumption against internal and third-party software performance

Further reading

Additional helpful documentation, links, and articles: