Overview
Datadog Watchdog runs continuously in the background, scanning for anomalies in your organization’s entire data set. As you navigate the Datadog UI, Watchdog Insights displays a filtered list of anomalies that match your active search query, sorted by priority.
Investigating an incident requires trial and error. Engineers familiar with a particular area draw on their experience to know where to look first for potential problems. Watchdog Insights lets all engineers, including less experienced ones, focus on the most important data and accelerate their incident investigations.
Types of anomalies
Each insight highlights one outlier or anomaly affecting a subset of users. Depending on the product area, Watchdog Insights displays different types of anomalies. Examples include, but are not limited to, the following:
- Error and latency outliers in logs, traces, and RUM views
- Spike in error logs
- New error logs
- Deadlocked threads
- High percentage of unready Kubernetes pods
Prioritization
Watchdog sorts insights based on a combination of factors to place the most important insight at the beginning of the list. The factors that Watchdog takes into account can include the following:
- State (ongoing versus resolved)
- Status (warning, error, or critical)
- Start time
- Anomaly type
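The exact weighting Watchdog applies is not described here. As a rough illustration of how a multi-factor ordering like this can work, the following minimal sketch sorts a list of insights by state, then status, then start time. The `Insight` class, the rank tables, and the choice to put the most recent start time first are all assumptions made for this example, not Datadog's implementation.

```python
from dataclasses import dataclass
from datetime import datetime

# Hypothetical ranks: lower values sort first. These orderings are
# assumptions for illustration, not Watchdog's documented weights.
STATE_RANK = {"ongoing": 0, "resolved": 1}
STATUS_RANK = {"critical": 0, "error": 1, "warning": 2}


@dataclass
class Insight:
    title: str
    state: str          # "ongoing" or "resolved"
    status: str         # "warning", "error", or "critical"
    start: datetime     # when the anomaly began
    anomaly_type: str   # for example "log_spike" (hypothetical label)


def sort_key(insight):
    # Ongoing before resolved, then most severe status, then most
    # recent start time. Anomaly type is only an alphabetical tie-breaker.
    return (
        STATE_RANK[insight.state],
        STATUS_RANK[insight.status],
        -insight.start.timestamp(),
        insight.anomaly_type,
    )


def prioritize(insights):
    # Return a new list with the highest-priority insight first.
    return sorted(insights, key=sort_key)
```

Because the key is a tuple, each later factor only matters when all earlier factors tie, which matches the idea of placing the most important insight at the beginning of the list.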
Usage
The Watchdog Insights banner sits near the top of each page. Expand the banner for an overview. The highest priority insights appear on the left. If Watchdog cannot find any issues, the banner is gray.
Filter on Insight
To refine your current view to match a Watchdog Insight, hover over the top right corner of an insight summary card. Two icons appear. Click the inverted triangle icon with the tooltip Filter on Insight. The page refreshes to show a list of entries corresponding to the insight.
Side panel
Click View all to expand the panel. A side panel opens from the right, containing a vertical list of Watchdog Insights. Each entry shows a detailed view, with more information than the summary card.
Detailed view
For a detailed view of an insight, click the individual card. The full side panel opens from the right.
To share an insight, click the Copy Link button on the full side panel. The query that produced the insight is copied to your clipboard.
Explore Watchdog Insights
You can find Watchdog Insights in four product areas: Infrastructure, APM, Log Management, and RUM.
Infrastructure
Live Containers
Watchdog Insights appear in the Kubernetes Explorer tab in Live Containers.
- In the left navigation, hover over Infrastructure.
- Click Kubernetes.
- Select the Explorer tab at the top of the page.
- Choose one of the Kubernetes resource types in the Select Resources box.
A list of your Kubernetes resources appears, with the Watchdog Insights panel at the top.
Live Processes
Watchdog Insights appear in Live Processes.
- In the left navigation, hover over Infrastructure.
- Click Processes.
A list of your Processes and associated data appears, with the Watchdog Insights panel at the top.
Serverless
For serverless infrastructures, Watchdog surfaces the following insights:
- Cold Start Ratio Up/Down
- Error Invocation Ratio Up/Down
- Memory Usage Up/Down
- OOM Ratio Up/Down
- Estimated Cost Up/Down
- Init Duration Up/Down
- Runtime Duration Up/Down
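To make the "Up/Down" wording concrete, here is a minimal sketch of how a ratio such as the cold start ratio could be compared between two time windows. The function names, the 0.05 threshold, and the extra "flat" outcome are assumptions for illustration only. Watchdog's actual detection logic is not described in this document.

```python
def ratio(part, total):
    # Guard against division by zero when a window has no invocations.
    return part / total if total else 0.0


def ratio_direction(prev_part, prev_total, curr_part, curr_total, threshold=0.05):
    """Classify the change in a ratio between two windows.

    Returns "up", "down", or "flat". The threshold is a hypothetical
    minimum absolute change, not a Datadog setting.
    """
    delta = ratio(curr_part, curr_total) - ratio(prev_part, prev_total)
    if delta > threshold:
        return "up"
    if delta < -threshold:
        return "down"
    return "flat"
```

For example, if 5 of 100 invocations were cold starts in the previous window and 20 of 100 in the current one, `ratio_direction(5, 100, 20, 100)` classifies the cold start ratio as "up".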
APM
Watchdog Insights appear on several pages within APM.
Log Management
To locate Watchdog Insights in the Log Management UI, take the following steps:
- In the left navigation, hover over Logs.
- Click Search.
The pink Watchdog Insights banner appears in the middle of your screen, above your logs.
For more information, see Watchdog Insights for Logs.
RUM
To locate Watchdog Insights in the RUM UI, take the following steps:
- In the left navigation, hover over UX Monitoring.
- Click Sessions & Replays.
- At the top of the page, the In dropdown shows that you are in the Sessions level. Change the dropdown option to Views.
The pink Watchdog Insights banner appears in the middle of your screen, above your views.
For more information, see Watchdog Insights for RUM.