Monitor your log usage

This guide explains how to monitor your log usage with estimated usage metrics. It covers the following steps:

  • Alert on unexpected traffic spikes
  • Alert when you are getting close to a budget threshold on your indexed logs
  • Import the out-of-the-box Log Management usage dashboard

Alert on unexpected spikes

Logs usage metrics

By default, log usage metrics track the number of ingested logs, ingested bytes, and indexed logs. These metrics are free and retained for 15 months.

The following section shows how to use them in anomaly detection monitors.

Note: Datadog recommends setting the unit to Byte for the datadog.estimated_usage.logs.ingested_bytes metric on the metric summary page.

Anomaly detection monitors

To create an anomaly detection monitor that alerts you to unexpected spikes in indexed logs:

  1. Create a new Anomaly monitor
  2. Select the datadog.estimated_usage.logs.ingested_events metric
  3. Add datadog_is_excluded:false in the from section (to monitor indexed logs and not ingested ones)
  4. Add the tags service and datadog_index in group by (to be notified if a specific service spikes or stops sending logs in any index)
  5. Set the alert condition to match your use case (e.g., evaluation window, number of times outside the expected range, etc.)
  6. Set the notification message with actionable instructions:

Example of a notification with contextual links:

An unexpected amount of logs has been indexed in index {{datadog_index.name}}

1. [Check Log patterns for this service](https://app.datadoghq.com/logs/patterns?from_ts=1582549794112&live=true&to_ts=1582550694112&query=service%3A{{service.name}})
2. [Add an exclusion filter on the noisy pattern](https://app.datadoghq.com/logs/pipelines/indexes)
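
The steps above can also be expressed as a monitor definition. The sketch below only assembles a payload in the general shape used by the Datadog Monitors API; it does not send anything. The function name, the monitor name, the `agile` algorithm, the deviation count, and the `last_4h` window are illustrative assumptions, and a real anomaly monitor may need additional options, so check the result against the Monitors API reference before using it.

```python
import json

# Metric and tags are taken from the steps above.
METRIC = "datadog.estimated_usage.logs.ingested_events"


def build_anomaly_monitor(algorithm="agile", deviations=2, window="last_4h"):
    """Assemble a hypothetical anomaly monitor payload for indexed logs.

    The algorithm, deviations, and window defaults are assumptions chosen
    for illustration, not recommended values.
    """
    scope = "datadog_is_excluded:false"   # indexed logs only (step 3)
    group_by = "service,datadog_index"    # one alert per service and index (step 4)
    query = (
        f"avg({window}):anomalies("
        f"sum:{METRIC}{{{scope}}} by {{{group_by}}}.as_count(), "
        f"'{algorithm}', {deviations}) >= 1"
    )
    # Notification text from the example above, with template variables kept as-is.
    message = (
        "An unexpected amount of logs has been indexed in index "
        "{{datadog_index.name}}\n\n"
        "1. [Check Log patterns for this service]"
        "(https://app.datadoghq.com/logs/patterns?query=service%3A{{service.name}})\n"
        "2. [Add an exclusion filter on the noisy pattern]"
        "(https://app.datadoghq.com/logs/pipelines/indexes)"
    )
    return {
        "name": "Unexpected indexed log volume",  # hypothetical monitor name
        "type": "query alert",
        "query": query,
        "message": message,
    }


monitor = build_anomaly_monitor()
print(json.dumps(monitor, indent=2))
```

Keeping the query and message in one function makes it easier to review the scope and grouping tags before the definition is created in Datadog.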

Estimated usage dashboard

You can also build an estimated usage dashboard from the log usage metrics to monitor your Log Management usage across Datadog.

Reminder: The metrics used in this dashboard are estimates and might differ from official billing numbers.

To import this dashboard, copy the estimated usage dashboard JSON definition and paste it into a new dashboard. Alternatively, use the Import Dashboard JSON option in the settings cog menu in the upper right corner of a new dashboard.

Monitor indexed logs with fixed threshold

Get notified if the indexed log volume in any scope of your infrastructure (service, availability-zone, etc.) grows unexpectedly:

  1. Go to the Datadog Log Explorer view.
  2. Build a search query that represents the volume to monitor. Keep the query empty to monitor all the logs from that index.
  3. Click on Export to monitor.
  4. Define the rate you would like to set as warning or error.
  5. Define an explicit notification, for example: "The volume on this service just got too high. Define an additional exclusion filter or increase the sampling rate to get it back under control."
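
For teams that manage monitors as code, the steps above roughly correspond to a log monitor definition. The following is a minimal sketch that only builds the payload; the function name, the `web-store` service, the `main` index, the five-minute window, and the threshold values are all hypothetical, and the query string follows the general form of Datadog log monitor queries, which you should verify against the Monitors API reference.

```python
def build_log_volume_monitor(search_query="*", index="main",
                             critical=10000, warning=None, window="5m"):
    """Assemble a hypothetical fixed-threshold log monitor payload.

    search_query "*" stands in for an empty search (all logs in the index).
    The index name, window, and thresholds are placeholders to adapt.
    """
    query = (
        f'logs("{search_query}").index("{index}")'
        f'.rollup("count").last("{window}") > {critical}'
    )
    thresholds = {"critical": critical}
    if warning is not None:
        thresholds["warning"] = warning
    return {
        "name": f"Indexed log volume too high ({search_query})",
        "type": "log alert",
        "query": query,
        # Notification text from step 5.
        "message": (
            "The volume on this service just got too high. Define an "
            "additional exclusion filter or increase the sampling rate "
            "to get it back under control."
        ),
        "options": {"thresholds": thresholds},
    }


# Example: warn at 8,000 and alert at 10,000 indexed logs per 5 minutes
# for a hypothetical service.
monitor = build_log_volume_monitor("service:web-store", critical=10000, warning=8000)
print(monitor["query"])
```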

Alert on indexes reaching their daily quota

You can also set a daily quota on indexes to prevent indexing more than a given number of logs per day. In that case, Datadog recommends setting the monitor described above to alert when 80% of the quota is reached within the past 24 hours. An event is generated when the daily quota is reached; set up a monitor to be notified when this happens.
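
As a quick worked example of the 80% recommendation, the helper below derives the monitor threshold from a daily quota. The function name and the quota of 10 million logs are hypothetical; integer arithmetic is used so the threshold is an exact log count.

```python
def quota_alert_threshold(daily_quota, percent=80):
    """Return the log count at which the 24-hour monitor should alert.

    daily_quota is the index's daily quota in number of logs;
    percent is the share of that quota that triggers the alert.
    """
    if daily_quota <= 0:
        raise ValueError("daily_quota must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in the range (0, 100]")
    return daily_quota * percent // 100


# A hypothetical index capped at 10 million logs per day should alert
# once 8 million logs have been indexed in the past 24 hours.
print(quota_alert_threshold(10_000_000))
```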

Here is an example of what the notification would look like in Slack:
