Getting Started with Monitors
Overview
A metric monitor provides alerts and notifications if a specific metric is above or below a certain threshold. This page provides instructions for setting up a metric monitor to alert on low disk space.
Prerequisites
Before getting started, you need a Datadog account linked to a host with the Datadog Agent installed. To verify, check your Infrastructure List in Datadog.
Setup
To create a metric monitor in Datadog, use the main navigation: Monitors > New Monitor > Metric.
Choose the detection method
When you create a metric monitor, Threshold Alert is automatically selected as the detection method. A threshold alert compares metric values against user-defined thresholds. The goal for this monitor is to alert on a static threshold, so no change is necessary.
Define the metric
To get an alert on low disk space, use the system.disk.in_use metric from the Disk integration and average the metric over host and device.
After this is set, the monitor automatically updates to a Multi Alert that triggers a separate alert for each host and device pair reporting your metric.
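For readers who prefer to see the definition as text, the selection above corresponds roughly to a monitor query string. The following is a minimal sketch, assuming a five-minute evaluation window and a threshold of 0.9; neither value comes from this page, and the helper name build_disk_query is hypothetical.

```python
def build_disk_query(threshold: float, window: str = "last_5m") -> str:
    """Build a metric monitor query that averages system.disk.in_use
    over host and device and compares it to a static threshold.

    The window and threshold are illustrative assumptions.
    """
    # system.disk.in_use is a fraction, so a threshold outside [0, 1] is a mistake.
    if not 0 <= threshold <= 1:
        raise ValueError("threshold must be between 0 and 1")
    # Doubled braces produce literal braces in an f-string.
    return (
        f"avg({window}):avg:system.disk.in_use{{*}} "
        f"by {{host,device}} > {threshold}"
    )


query = build_disk_query(0.9)
print(query)
```

Grouping by host and device is what makes the monitor a Multi Alert: each host and device pair is evaluated separately.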
Set alert conditions
According to the Disk integration documentation, system.disk.in_use is the amount of disk space in use as a fraction of the total. So, when this metric is reporting a value of 0.7, the device is 70% full.
To alert on low disk space, the monitor should trigger when the metric is above the threshold. The threshold values are based on your preference. For this metric, values between 0 and 1 are appropriate.
For this example, the other settings in this section are left on the defaults. For more details, see the Metric Monitors documentation.
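To make the threshold logic concrete, here is a small sketch of how a reported value maps to a monitor state. The warning and alert thresholds (0.8 and 0.9) are assumptions chosen for illustration, and the classify function is hypothetical; it is not how Datadog evaluates monitors internally.

```python
def classify(value: float, warning: float = 0.8, alert: float = 0.9) -> str:
    """Map a system.disk.in_use value (a fraction of total space) to a state.

    The monitor triggers when the metric is strictly above a threshold.
    """
    if value > alert:
        return "ALERT"
    if value > warning:
        return "WARN"
    return "OK"


for value in (0.7, 0.85, 0.95):
    # Multiply by 100 to read the fraction as a percentage of disk in use.
    print(f"{value * 100:.0f}% full -> {classify(value)}")
```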
Say what’s happening
Before a monitor can be saved, it must have a title and message.
Title
The title must be unique for each monitor. Since this is a multi alert monitor, names are available for each group element (host and device) with message template variables:
Disk space is low on {{device.name}} / {{host.name}}
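The sketch below illustrates how such a title might read once the template variables are filled in for one triggering group. The render function is a simplified stand-in written for this example, not Datadog's template engine, and the host and device values are made up.

```python
import re
from typing import Mapping


def render(template: str, values: Mapping[str, str]) -> str:
    """Replace {{name}} placeholders with values; leave unknown names untouched."""
    def substitute(match: "re.Match[str]") -> str:
        return values.get(match.group(1), match.group(0))

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", substitute, template)


title = "Disk space is low on {{device.name}} / {{host.name}}"
# Hypothetical group that triggered the alert.
group = {"device.name": "/dev/sda1", "host.name": "web-01"}
print(render(title, group))  # Disk space is low on /dev/sda1 / web-01
```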
Message
Use the message to tell your team how to resolve the issue, for example:
Steps to free up disk space:
1. Remove unused packages
2. Clear APT cache
3. Uninstall unnecessary applications
4. Remove duplicate files
For different messages based on alert vs. warning thresholds, see the Notification documentation.
Notify your team
Use this section to send notifications to your team through Email, Slack, PagerDuty, etc. You can search for team members and connected accounts with the dropdown box. When an @notification is added to this box, the notification is automatically added to the message box.
Removing the @notification from either section removes it from both sections.
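The title, message, and notification configured in this guide can also be pictured as a single monitor definition. The following is a rough sketch of such a definition as a Python dictionary, loosely modeled on the fields of the Datadog monitor API (name, type, query, message, options.thresholds). The thresholds, the evaluation window, and the @slack-ops handle are assumptions for illustration; check the API documentation before sending anything like this to Datadog.

```python
import json

CRITICAL = 0.9  # assumed alert threshold (90% full)
WARNING = 0.8   # assumed warning threshold (80% full)


def build_monitor(handle: str) -> dict:
    """Assemble an illustrative monitor definition; no request is sent."""
    message = "\n".join([
        "Steps to free up disk space:",
        "1. Remove unused packages",
        "2. Clear APT cache",
        "3. Uninstall unnecessary applications",
        "4. Remove duplicate files",
        handle,  # the @notification lives in the message body
    ])
    return {
        "name": "Disk space is low on {{device.name}} / {{host.name}}",
        "type": "metric alert",
        "query": (
            "avg(last_5m):avg:system.disk.in_use{*} "
            f"by {{host,device}} > {CRITICAL}"
        ),
        "message": message,
        "options": {"thresholds": {"critical": CRITICAL, "warning": WARNING}},
    }


monitor = build_monitor("@slack-ops")  # hypothetical notification handle
print(json.dumps(monitor, indent=2))
```

Because the notification handle is part of the message text, removing it from the message also removes the notification, which mirrors the behavior described above.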
Permissions
Use this option to restrict the editing of your monitor to its creator and to specific roles in your org. For more information about roles, see Role Based Access Control.
View Monitors and Triage Alerts on Mobile
With the Datadog Mobile App, available on the Apple App Store and Google Play Store, you can view Monitor Saved Views from your mobile home screen and view or mute monitors. This helps with triaging when you are away from your laptop or desktop.