Error Budget Alerts

This feature is in open beta. Email slo-help@datadoghq.com to ask questions or to provide feedback on this feature.

Overview

SLO error budget alerts are threshold-based and notify you when a specified percentage of your SLO’s error budget has been consumed. For example, you can alert when 75% of the error budget for a 7-day target is consumed, and optionally warn when 50% is consumed.

Note: Error budget alerts are only available for metric-based SLOs or for monitor-based SLOs that are only composed of Metric Monitor types (Metric, Integration, APM Metric, Anomaly, Forecast, or Outlier Monitors).

Monitor creation

  1. Navigate to the SLO status page.
  2. Create a new SLO or edit an existing one, then click the Save and Set Alert button. For existing SLOs, you can also click the Set up Alerts button in the SLO detail side panel to take you directly to the alert configuration.
  3. Select the Error Budget tab in Step 1: Setting alerting conditions.
  4. Set an alert to trigger when the percentage of the error budget consumed is above the threshold over the past target number of days.
  5. Add Notification information into the Say what’s happening and Notify your team sections.
  6. Click the Save and Set Alert button on the SLO configuration page.

API and Terraform

You can create SLO error budget alerts using the create-monitor API endpoint. Below is an example query for an SLO monitor that alerts when more than 75% of an SLO's error budget is consumed. Replace slo_id with the alphanumeric ID of the SLO you want to configure an error budget alert on, and replace time_window with 7d, 30d, or 90d, depending on which target is used to configure your SLO:

error_budget("slo_id").over("time_window") > 75
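As an illustration, the query can be wrapped in a request body and sent to the create-monitor endpoint. The sketch below assumes the public v1 monitor API (POST /api/v1/monitor, authenticated with the DD-API-KEY and DD-APPLICATION-KEY headers) and a payload with the threshold under options.thresholds; the helper names, default monitor name, and default message are hypothetical.

```python
import json
import urllib.request

VALID_WINDOWS = {"7d", "30d", "90d"}


def build_error_budget_monitor(slo_id, time_window, critical=75.0, warning=None,
                               name="SLO Error Budget Alert Example",
                               message="Example monitor message"):
    """Build a create-monitor request body for an SLO error budget alert."""
    if time_window not in VALID_WINDOWS:
        raise ValueError(f"time_window must be one of {sorted(VALID_WINDOWS)}")

    # Same query shape as the example above; :g renders 75.0 as "75".
    query = f'error_budget("{slo_id}").over("{time_window}") > {critical:g}'

    thresholds = {"critical": critical}
    if warning is not None:
        thresholds["warning"] = warning  # optional warn threshold

    return {
        "name": name,
        "type": "slo alert",
        "query": query,
        "message": message,
        "options": {"thresholds": thresholds},
    }


def create_monitor(payload, api_key, app_key, site="datadoghq.com"):
    """POST the payload to the monitor endpoint (assumed URL and headers)."""
    request = urllib.request.Request(
        f"https://api.{site}/api/v1/monitor",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Inspect the body before sending it; "abc123" is a placeholder SLO ID.
print(json.dumps(build_error_budget_monitor("abc123", "7d", warning=50), indent=2))
```

Keeping payload construction separate from the network call makes it easy to check the query and thresholds without credentials. Call create_monitor only once the printed body looks right.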

You can also create SLO error budget alerts using the datadog_monitor resource in Terraform. Below is an example .tf configuration for an error budget alert on a metric-based SLO, using the same example query as above.

Note: SLO error budget alerts are only supported in Terraform provider v2.7.0 or earlier and in provider v2.13.0 or later. Versions between v2.7.0 and v2.13.0 are not supported.

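resource "datadog_monitor" "metric-based-slo" {
    name = "SLO Error Budget Alert Example"
    type = "slo alert"

    # Replace slo_id and time_window as described above.
    query = <<EOT
    error_budget("slo_id").over("time_window") > 75
    EOT

    message = "Example monitor message"

    # monitor_thresholds is a nested block, so it takes no "=".
    # The critical value should match the threshold in the query.
    monitor_thresholds {
      critical = 75
    }

    tags = ["foo:bar", "baz"]
}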

Beta restrictions

  • Alerting is available only for metric-based SLOs or for monitor-based SLOs that are only composed of Metric Monitor types (Metric, Integration, APM Metric, Anomaly, Forecast, or Outlier Monitors).
  • The alert status of an SLO monitor is available in the Alerts tab of the SLO’s detail panel or on the Manage Monitors page.