Watchdog is an algorithmic feature for APM that automatically detects potential application and infrastructure issues. Watchdog observes trends and patterns in application metrics, such as error rate, request rate, and latency, and looks for unexpected behavior. Watchdog evaluates all services and resources without the need to configure a monitor for each service.
Watchdog looks for irregularities in metrics, like a sudden spike in hit rate. For each irregularity, the Watchdog page displays a Watchdog story. Each story includes a graph of the detected metric irregularity and gives more information about the relevant timeframe and endpoint or endpoints. To avoid false alarms, Watchdog only reports issues after observing your data for a sufficient amount of time to establish a high degree of confidence.
Stories can be filtered by environment and availability zone, as well as by the type of service or resource. Typing in the “Filter stories” search box also lets you filter stories by service or resource name.
Clicking a story shows further details about requests, errors, and latency at the time of the detected irregularity.
Selecting Show expected bounds in the corner reveals upper and lower thresholds of expected behavior on the graph.
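To make the idea of expected bounds concrete, here is a minimal sketch of one common way to derive upper and lower thresholds from recent data. It is not Watchdog's actual algorithm, which this page does not describe. The function names, the window size, and the multiplier `k` are all assumptions chosen for illustration.

```python
from statistics import mean, stdev

def expected_bounds(history, k=3.0):
    """Return (lower, upper) as mean +/- k sample standard deviations.

    Illustrative only: this is a generic band, not Watchdog's method.
    """
    mu = mean(history)
    sigma = stdev(history)
    return mu - k * sigma, mu + k * sigma

def find_irregularities(series, window=5, k=3.0):
    """Return indexes whose value falls outside the bounds computed
    from the `window` points that immediately precede it."""
    flagged = []
    # The first `window` points are never flagged: there is not yet
    # enough history to establish what "expected" looks like.
    for i in range(window, len(series)):
        lower, upper = expected_bounds(series[i - window:i], k)
        if not lower <= series[i] <= upper:
            flagged.append(i)
    return flagged

# Hypothetical error-rate samples (percent) with one sudden spike.
error_rate = [1.0, 1.2, 0.9, 1.1, 1.0, 1.1, 9.5, 1.0]
print(find_irregularities(error_rate))  # prints [6]
```

The sketch skips the first few points for the same reason Watchdog waits before reporting: without enough observed data, the bounds are not trustworthy.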
Use the date picker in the upper right to view stories detected in a specific time range. You can view any story that happened in the last 13 months, going back to March 2019.
Use the eye icon in the upper-right corner of a story to archive it. Archiving hides the story from the feed, as well as from other places in the app, like the home page. If a story is archived, the yellow Watchdog binoculars icon does not show up next to the relevant service or resource.
To see archived stories, select the checkbox option to “Show N archived stories” in the top left. You can also see who archived each story and when, and restore archived stories to your feed.
Archiving does not prevent Watchdog from flagging future issues related to the service or resource.
Facets are listed in the left panel. Use them to filter Watchdog stories by category (for example, availability zone) and to see the number of stories in each facet category.
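As a rough mental model of how facets behave, the sketch below counts stories per facet value and narrows a list of stories by the selected facets. The story records, field names, and helper functions are hypothetical and exist only for illustration. They do not represent a Datadog API or the way Watchdog stores stories.

```python
from collections import Counter

# Hypothetical story records; the field names are assumptions.
stories = [
    {"service": "web-store", "env": "prod", "availability_zone": "us-east-1a"},
    {"service": "checkout", "env": "prod", "availability_zone": "us-east-1b"},
    {"service": "web-store", "env": "staging", "availability_zone": "us-east-1a"},
]

def facet_counts(stories, facet):
    """Count how many stories fall under each value of one facet."""
    return Counter(story[facet] for story in stories)

def filter_stories(stories, **selected):
    """Keep only the stories that match every selected facet value."""
    return [
        story for story in stories
        if all(story.get(facet) == value for facet, value in selected.items())
    ]

print(facet_counts(stories, "env"))
print(filter_stories(stories, env="prod", availability_zone="us-east-1a"))
```

Selecting more than one facet narrows the result further, because a story must match all of the selected values to remain in the list.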
When an irregularity in a metric is detected, the yellow Watchdog binoculars icon appears next to the affected service in the APM Services List. The number next to the binoculars indicates the number of issues Watchdog has noticed within that service.
If Watchdog has discovered something out of the ordinary in a specific service, the corresponding Service page shows a dedicated Watchdog section in the middle of the page, between the application performance graphs and the latency distribution section. The Watchdog section displays any relevant Watchdog stories.