Overview
Watchdog is an algorithmic feature for APM, Infrastructure, and Logs. It automatically detects potential issues by continuously observing trends and patterns in metrics and logs, and looking for atypical behavior.
Monitor creation
To create a Watchdog monitor in Datadog, use the main navigation: Monitors > New Monitor > Watchdog.
Select alert type
In this section, choose between an APM, Infrastructure, or Logs alert:
An APM alert is created when Watchdog detects anomalous behavior on your system’s services or their child resources.
Select sources
Select the scope to be alerted on by configuring:
- The type of Watchdog anomaly: Error Rates, Latency, Hits, or any APM alert
- The value for APM primary tags (see the Set primary tags to scope page for instructions to configure APM primary and second primary tags)
- The APM service (choose Any services to monitor all services)
- The APM resource of a service (choose * to monitor all resources of a service)
- The dimensions you want to group notifications by
After your selections are made, the graph at the top of the monitor creation page displays the matching Watchdog events over the selected time frame.
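The service and resource choices above can be pictured as a small matcher. This is only an illustrative sketch in Python: the `ApmScope` class, its field names, and the shape of the event dictionary are hypothetical, and it does not reflect how Datadog stores or evaluates Watchdog monitors.

```python
from dataclasses import dataclass

ANY_SERVICE = "Any services"  # form choice that covers every service
ANY_RESOURCE = "*"            # form choice that covers every resource of a service


@dataclass(frozen=True)
class ApmScope:
    """Hypothetical model of the APM scope selected in the monitor form."""
    service: str = ANY_SERVICE
    resource: str = ANY_RESOURCE

    def matches(self, event: dict) -> bool:
        """Return True when a (hypothetical) Watchdog event falls inside the scope."""
        service_ok = self.service == ANY_SERVICE or event.get("service") == self.service
        resource_ok = self.resource == ANY_RESOURCE or event.get("resource") == self.resource
        return service_ok and resource_ok


# Scope limited to one service, all of its resources.
scope = ApmScope(service="checkout")
in_scope = scope.matches({"service": "checkout", "resource": "POST /pay"})
out_of_scope = scope.matches({"service": "search", "resource": "GET /q"})
```

With the defaults left in place, `ApmScope()` matches every event, which mirrors choosing Any services and `*` in the form.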
Select sources
Select the scope to be alerted on by configuring:
- The Infrastructure integration to cover (select Any Infrastructure alert to cover them all). See the Watchdog overview for a full list of integrations covered by Watchdog Infrastructure.
- The Tags available for the selected integration
- The dimensions you want to group notifications by
After your selections are made, the graph at the top of the monitor creation page displays the matching Watchdog events over the selected time frame.
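As a rough illustration of grouping notifications by dimensions, the sketch below buckets events by the selected dimensions. It assumes that grouping means one notification per distinct combination of dimension values. The `group_events` helper, the event dictionaries, and the `host` and `integration` keys are all hypothetical and are not part of any Datadog API.

```python
from collections import defaultdict


def group_events(events, dimensions):
    """Bucket events by the values of the chosen dimensions (hypothetical helper)."""
    groups = defaultdict(list)
    for event in events:
        # One bucket per distinct combination of dimension values.
        key = tuple(event.get(dim) for dim in dimensions)
        groups[key].append(event)
    return dict(groups)


# Made-up sample events, for illustration only.
events = [
    {"host": "web-1", "integration": "redis"},
    {"host": "web-1", "integration": "redis"},
    {"host": "web-2", "integration": "redis"},
]
by_host = group_events(events, ["host"])
```

Under that assumption, grouping these sample events by `host` gives two buckets, so two separate notifications rather than one.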
A logs alert indicates that either a new pattern of error logs has been detected or that there has been an increase in an existing pattern of error logs.
Select sources
Select the scope to be alerted on by configuring:
- The environment (leave empty to alert on all environments). These values are derived from the env tag in your logs
- The service (leave empty to alert on all services). These values are derived from the service reserved attribute in your logs
- The log source (leave empty to alert on all sources). These values are derived from the source reserved attribute in your logs
- The log status (leave empty to alert on all statuses). These values are derived from the status reserved attribute in your logs
- The log anomaly type (new Error or Spike in existing logs), which determines whether the anomaly describes a new pattern of error logs or an increase in an existing pattern of error logs
- The dimensions you want to group notifications by
After your selections are made, the graph at the top of the monitor creation page displays the matching Watchdog events over the selected time frame.
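The "leave empty to alert on all" rule above can be sketched as a filter builder that skips empty selections. The `build_log_scope` function is hypothetical, and the `key:value` string is only a convenient notation for this sketch; it is not presented as the query Datadog generates for the monitor.

```python
def build_log_scope(env="", service="", source="", status=""):
    """Join the non-empty selections into a key:value filter string.

    Empty selections are skipped, mirroring "leave empty to alert on all".
    """
    selections = {"env": env, "service": service, "source": source, "status": status}
    return " ".join(f"{key}:{value}" for key, value in selections.items() if value)


# Scope limited to one environment, one service, and one status; source left empty.
narrow = build_log_scope(env="prod", service="web-store", status="error")

# Nothing selected: no restriction at all.
everything = build_log_scope()
```

Here `narrow` carries three restrictions and no `source` term, while `everything` is an empty string, meaning no field narrows the scope.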
Notifications
For detailed instructions on the Say what’s happening and Notify your team sections, see the Notifications page.