Overview
All events generated by your monitor appear on the monitor’s status page, showing the group name, event type, and timestamp. The Event timeline also includes downtime and audit trail events.
For each event, you can access quick actions and view related assets, like dashboards and logs.
Event details section
To view more information about an individual event, including its associated tags and actions:
- From the monitor status page, scroll down to the Event timeline.
- Click on an event in the timeline to view event details.
Use the event details to understand monitor alerts and identify root causes. This information supports responder workflows and helps you stay informed about ongoing situations.
With Quick Actions, you can take action without leaving the status page. Responders save time because the alert context is added automatically.
Action | Description |
---|---|
Mute | Create a downtime to mute monitor alerts. |
Resolve | Temporarily set the monitor status to OK until its next evaluation. |
Declare Incident | Escalate monitor alerts with Incident Management. |
Create Case | Create a case to keep track of this alert investigation without leaving Datadog. |
Run Workflow | Run Workflow Automation with predefined snippets to run mitigation actions. |
Resolve
You can resolve a monitor alert from the status page Header or Event details sections. Resolving from the Event details section only affects the group related to the selected event, while resolving from the Header resolves all groups in the alert and sets the monitor status to OK (all groups).
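The difference in scope can be sketched with a toy model. This is only an illustration, not Datadog's implementation or API: the group names and the `resolve_group` and `resolve_all` helpers are hypothetical.

```python
# Toy model of resolve scope. Group names and helper functions are
# hypothetical; this does not call or mirror any Datadog API.

def resolve_group(groups: dict, group: str) -> dict:
    """Resolve from Event details: only the selected group is set to OK."""
    updated = dict(groups)
    updated[group] = "OK"
    return updated


def resolve_all(groups: dict) -> dict:
    """Resolve from the Header: every group in the alert is set to OK."""
    return {name: "OK" for name in groups}


groups = {"host:web-1": "ALERT", "host:web-2": "ALERT", "host:web-3": "OK"}

from_event_details = resolve_group(groups, "host:web-1")
from_header = resolve_all(groups)

print(from_event_details)
print(from_header)
```

In this sketch, resolving from Event details leaves `host:web-2` in ALERT, while resolving from the Header clears every group.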
If a monitor is alerting because its current data still corresponds to the ALERT state, using Resolve causes the state to switch temporarily from ALERT to OK and then back to ALERT at the next evaluation. Resolve is therefore not meant for acknowledging an alert or instructing Datadog to ignore it.
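The following minimal sketch illustrates why the state flips back. It assumes a simple threshold monitor that recomputes its state on every evaluation. The `ToyMonitor` class and the threshold value are hypothetical and do not represent how Datadog evaluates monitors internally.

```python
from dataclasses import dataclass

ALERT, OK = "ALERT", "OK"


@dataclass
class ToyMonitor:
    """Hypothetical threshold monitor used only for illustration."""
    threshold: float
    state: str = OK

    def evaluate(self, value: float) -> str:
        # Each evaluation recomputes the state from the latest data.
        self.state = ALERT if value > self.threshold else OK
        return self.state

    def resolve(self) -> str:
        # Manual resolve forces OK, but only until the next evaluation.
        self.state = OK
        return self.state


monitor = ToyMonitor(threshold=100)
history = [
    monitor.evaluate(250),  # data breaches the threshold
    monitor.resolve(),      # responder clicks Resolve
    monitor.evaluate(250),  # data still breaches at the next evaluation
]
print(history)  # ['ALERT', 'OK', 'ALERT']
```

Because the underlying data has not changed, the resolve has no lasting effect in this model. To silence an alert that is still valid, the Mute action (a downtime) is the appropriate tool.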
Manually resolving a monitor is useful when data is reported intermittently. For example, after an alert is triggered, the monitor may stop receiving data, which prevents it from evaluating alert conditions and recovering to the OK state. In such cases, the Resolve action or the Automatically resolve monitor after X hours setting changes the monitor back to the OK state.
Typical use case: a monitor based on error metrics that are not generated when there are no errors, such as aws.elb.httpcode_elb_5xx or any DogStatsD counter in your code that reports only when an error occurs.
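The sparse-metric case can be sketched as a small simulation. It assumes hourly evaluations, a monitor that keeps its previous state when no data arrives, and an automatic resolve that fires after a set number of hours without data. These assumptions, along with the `next_state` and `run` helpers and the `auto_resolve_hours` parameter, are hypothetical simplifications rather than Datadog's actual evaluation logic.

```python
from typing import List, Optional, Tuple

ALERT, OK = "ALERT", "OK"


def next_state(state: str, value: Optional[float], silent_hours: int,
               threshold: float = 0,
               auto_resolve_hours: Optional[int] = None) -> Tuple[str, int]:
    """Return (new_state, silent_hours) after one hourly evaluation.

    value is None when the metric reported nothing for that hour.
    """
    if value is not None:
        # Data arrived: evaluate normally and reset the silence counter.
        return (ALERT if value > threshold else OK), 0
    silent_hours += 1
    if (state == ALERT and auto_resolve_hours is not None
            and silent_hours >= auto_resolve_hours):
        # No data for long enough: automatically resolve to OK.
        return OK, silent_hours
    # No data and no auto-resolve: the previous state is kept.
    return state, silent_hours


def run(samples: List[Optional[float]],
        auto_resolve_hours: Optional[int] = None) -> List[str]:
    state, silent, states = OK, 0, []
    for value in samples:
        state, silent = next_state(state, value, silent,
                                   auto_resolve_hours=auto_resolve_hours)
        states.append(state)
    return states


# One hour with 5 errors, then the counter stops reporting entirely.
samples = [5, None, None, None]
stuck = run(samples)
recovered = run(samples, auto_resolve_hours=2)
print(stuck)      # ['ALERT', 'ALERT', 'ALERT', 'ALERT']
print(recovered)  # ['ALERT', 'ALERT', 'OK', 'OK']
```

Without the automatic resolve, the simulated monitor never leaves ALERT because no data point arrives to trigger a recovery. A manual Resolve would have the same effect as the automatic one here.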
Event troubleshooting section
For each event, access troubleshooting information to help responders quickly understand the context of the alert.
Troubleshooting component | Description |
---|---|
Dependency Map | When a service tag is available, either as a monitor tag or in the group, you can access a dependency map showing the status of your dependencies. |