Error Tracking Monitor

Overview

Datadog Error Tracking automatically groups all your errors into issues across your web, mobile, and backend applications. Viewing errors grouped into issues helps you prioritize and find the most impactful problems, making it easier to minimize service downtime and reduce user frustration.

With Error Tracking enabled for your organization, you can create an Error Tracking monitor to alert you when an issue in your web or mobile application, backend service, or logs is new, has a high impact, or starts regressing.

Create an Error Tracking monitor

To create an Error Tracking monitor in Datadog, navigate to Monitors > New Monitor > Error Tracking.

There is a default limit of 1000 Error Tracking monitors per account. Contact Support to increase this limit for your account.

Select the alerting condition

You can configure an Error Tracking monitor with one of two alerting conditions:

Alerting condition	Description
New Issue	Alert when an issue occurs for the first time or a regression occurs. For example, alert for your service whenever more than 2 users are impacted by a new error.
High Impact	Alert on issues with a high number of impacted end users. For example, alert for your service whenever more than 500 users are impacted by this error.

Define alert conditions

New Issue

Issues to alert on

New Issue monitors alert on issues that are in the For Review state and meet your alerting conditions. Regressions are automatically transitioned to the For Review state, so they are monitored by default with New Issue monitors. For more information on states, see Issue States.

Select All, Browser, Mobile, or Backend issues and construct a search query using the same logic as the Error Tracking Explorer search for the issues’ error occurrences.

New Issue monitors only consider issues that were created or regressed after the monitor was created or last edited. These monitors have a 24-hour lookback period.

Define alert threshold

Choose one of the following options:

Alert on all new issues

The monitor triggers when any new issue is detected (the number of errors is greater than 0 over the past day).

Define your alert metric

Choose the metric you want to monitor. There are three suggested filter options to access the most frequently used facets:
- Error Occurrences: Triggers when the error count is above the threshold.
- Impacted Users: Triggers when the number of impacted user emails is above the threshold.
- Impacted Sessions: Triggers when the number of impacted session IDs is above the threshold.
If you select All or Backend issues, only the Error Occurrences option is available.
You can also specify a custom measure to monitor. If you select a custom measure, the monitor alerts when the count of unique values of the facet is above the threshold.
Receive a notification for each issue that matches your query, and group the results by any other attribute you require (for example, receive a notification for each issue matching the query in each environment).
Query data over the last day (by default) or any other time window at each evaluation.
Choose a threshold for the monitor to trigger. The default is 0, which triggers at the first occurrence.
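The metric options above differ only in what is counted per issue. The following is a minimal sketch of that idea, not Datadog's implementation: the occurrence records and their field names (issue_id, user_email, session_id) are hypothetical, and the evaluate function is introduced here purely for illustration.

```python
from collections import defaultdict

# Hypothetical error occurrences; the field names are illustrative only.
occurrences = [
    {"issue_id": "a1", "user_email": "ann@example.com", "session_id": "s1"},
    {"issue_id": "a1", "user_email": "ann@example.com", "session_id": "s2"},
    {"issue_id": "a1", "user_email": "bob@example.com", "session_id": "s3"},
    {"issue_id": "b2", "user_email": "ann@example.com", "session_id": "s1"},
]


def evaluate(occurrences, measure=None, threshold=0):
    """Return {issue_id: value} for issues whose value is above the threshold.

    measure=None counts error occurrences per issue. Otherwise the value is
    the count of unique values of the given field per issue.
    """
    counts = defaultdict(int)
    uniques = defaultdict(set)
    for occ in occurrences:
        issue = occ["issue_id"]
        counts[issue] += 1
        if measure is not None:
            uniques[issue].add(occ[measure])
    if measure is None:
        values = dict(counts)
    else:
        values = {issue: len(vals) for issue, vals in uniques.items()}
    # Strictly "above": a value equal to the threshold does not trigger.
    return {issue: v for issue, v in values.items() if v > threshold}
```

With the default threshold of 0, every issue with at least one occurrence is returned, which matches the "triggers at the first occurrence" behavior described above.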

Programmatic management

If you manage your monitors with Terraform or with custom scripts that call the Datadog public APIs, you must include the following clauses in the monitor query:

Specify the source to target: All, Browser, Mobile, or Backend issues. Add the .source() clause with "all", "browser", "mobile", or "backend" immediately after your filter. Note: You can use only one source at a time.
Use the .new() clause for New Issue monitors.

Example:

error-tracking("{filter}").source("backend").new().rollup("count").by("@issue.id").last("1d") > 0
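If you generate these queries from a script, a small helper can assemble the clauses in the order shown in the example. This is a minimal sketch: the function name new_issue_query and the filter value service:checkout are hypothetical, and the helper does not escape quotes inside the filter.

```python
VALID_SOURCES = {"all", "browser", "mobile", "backend"}


def new_issue_query(search_filter, source, window="1d", threshold=0):
    """Build a New Issue monitor query string in the documented clause order."""
    if source not in VALID_SOURCES:
        # Only one source is allowed, and it must be one of the four values.
        raise ValueError(f"source must be one of {sorted(VALID_SOURCES)}")
    return (
        f'error-tracking("{search_filter}")'
        f'.source("{source}").new()'
        f'.rollup("count").by("@issue.id")'
        f'.last("{window}") > {threshold}'
    )


# Hypothetical filter; replace it with your own search query.
query = new_issue_query("service:checkout", "backend")
```

The resulting string can be used as the query value wherever you define the monitor, for example in a Terraform resource or an API request body.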

High Impact

Issues to alert on

High Impact monitors alert on issues that are For Review or Reviewed and that meet your alerting conditions. Read more about Issue States.

Select All, Browser, Mobile, or Backend issues and construct a search query using the same logic as the Error Tracking Explorer search for the issues’ error occurrences.

Define alert threshold

Choose the metric you want to monitor. There are three suggested filter options to access the most frequently used facets:
- Error Occurrences: Triggers when the error count is above the threshold.
- Impacted Users: Triggers when the number of impacted user emails is above the threshold.
- Impacted Sessions: Triggers when the number of impacted session IDs is above the threshold.
If you select All or Backend issues, only the Error Occurrences option is available.
You can also specify a custom measure to monitor. If you select a custom measure, the monitor alerts when the count of unique values of the facet is above the threshold.
Receive a notification for each issue that matches your query, and group the results by any other attribute you require (for example, receive a notification for each issue matching the query in each environment).
Query data over the last day (by default) or any other time window at each evaluation.
Choose a threshold for the monitor to trigger. The default is 0, which triggers at the first occurrence.

Programmatic management

If you manage your monitors with Terraform or with custom scripts that call the Datadog public APIs, you must include the following clauses in the monitor query:

Specify the source to target: All, Browser, Mobile, or Backend issues. Add the .source() clause with "all", "browser", "mobile", or "backend" immediately after your filter. Note: You can use only one source at a time.
Use the .impact() clause for High Impact monitors.

Example:

error-tracking("{filter}").source("browser").impact().rollup("count").by("@issue.id").last("1d") > 0
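For scripts that call the API, the query is one field of a larger monitor definition. The sketch below builds such a definition as a plain dictionary without sending it anywhere. It rests on several assumptions to verify against the Monitors API reference before use: the type value "error-tracking alert", the placement of the threshold under options.thresholds.critical, and the hypothetical name, message, and filter values.

```python
import json


def high_impact_monitor_payload(search_filter, source, threshold):
    """Build a monitor definition for a High Impact Error Tracking monitor."""
    query = (
        f'error-tracking("{search_filter}")'
        f'.source("{source}").impact()'
        f'.rollup("count").by("@issue.id")'
        f'.last("1d") > {threshold}'
    )
    return {
        "name": f"High impact errors ({search_filter})",  # hypothetical name
        "type": "error-tracking alert",  # assumption: verify in the API reference
        "query": query,
        "message": "A high impact issue needs attention.",  # hypothetical text
        "options": {"thresholds": {"critical": threshold}},
    }


# Hypothetical filter and threshold: more than 500 error occurrences per issue.
payload = high_impact_monitor_payload("service:web-store", "browser", 500)
body = json.dumps(payload, indent=2)  # request body, ready to send
```

Keeping the threshold in one argument avoids a mismatch between the number in the query string and the number in the options.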

Notifications

To display triggering tags in the notification title, click Include triggering tags in notification title.

In addition to matching attribute variables, the following Error Tracking-specific variables are available for alert message notifications:

{{issue.attributes.error.type}}
{{issue.attributes.error.message}}
{{issue.attributes.error.stack}}
{{issue.attributes.error.file}}
{{issue.attributes.error.is_crash}}
{{issue.attributes.error.category}}
{{issue.attributes.error.handling}}

For more information about the Configure notifications and automations section, see Notifications.

Select multi alert to receive a notification per issue. This is the intended experience for Error Tracking monitors.
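When notification messages are kept in scripts, a quick check can catch misspelled Error Tracking variables before the monitor is saved. The following is a minimal sketch: the helper unknown_error_variables and the sample message are hypothetical, and the check covers only the issue.attributes.error.* variables listed above, not other template variables.

```python
import re

# The Error Tracking variables listed in this section.
KNOWN_ERROR_VARIABLES = {
    "issue.attributes.error.type",
    "issue.attributes.error.message",
    "issue.attributes.error.stack",
    "issue.attributes.error.file",
    "issue.attributes.error.is_crash",
    "issue.attributes.error.category",
    "issue.attributes.error.handling",
}

_PATTERN = re.compile(r"\{\{\s*(issue\.attributes\.error\.[\w.]+)\s*\}\}")


def unknown_error_variables(message):
    """Return referenced issue.attributes.error.* variables that are not listed."""
    found = set(_PATTERN.findall(message))
    return sorted(found - KNOWN_ERROR_VARIABLES)


# Hypothetical notification message using some of the listed variables.
MESSAGE = "\n".join([
    "Error type: {{issue.attributes.error.type}}",
    "Message: {{issue.attributes.error.message}}",
    "File: {{issue.attributes.error.file}}",
    "Crash: {{issue.attributes.error.is_crash}}",
])
```

An empty result means every Error Tracking variable in the message appears in the list above.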

Muting monitors

Error Tracking monitors use Issue States to ensure that your alerts stay focused on high-priority matters, reducing distractions from non-critical issues.

Ignored issues are errors requiring no additional investigation or action. Issues marked as Ignored are automatically muted from monitor notifications.

Troubleshooting

New Issue monitors do not take into account issue age

issue.age and issue.regression.age are not added to the monitor query by default because they can cause missed alerts. For instance, if an issue first appears in env:staging and then appears in env:prod for the first time a week later, the issue is considered a week old and does not trigger an alert when it first appears in env:prod.

As a result, Datadog does not recommend using issue.age and issue.regression.age. However, if state-based monitor behavior is not suitable for you, you can still use these filters by specifying them manually.

Note: If you plan to use issue.age and issue.regression.age in your monitor, be aware that these filter keys are not consistent across products. For example, the key could be @issue.age or issue.age.
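The staging and production scenario described above can be sketched with plain date arithmetic. This is an illustration only, not how Datadog evaluates the filter: the dates, the one-day age limit, and the helper passes_age_filter are hypothetical.

```python
from datetime import datetime, timedelta

# Hypothetical timeline: the issue is first seen in env:staging,
# then reaches env:prod for the first time one week later.
first_seen = datetime(2024, 1, 1)
prod_first_seen = first_seen + timedelta(days=7)


def passes_age_filter(first_seen, now, max_age):
    """True if the issue is young enough to match an age-based filter."""
    return now - first_seen <= max_age


# Issue age is measured from the first occurrence anywhere, so the issue
# is already a week old when it first reaches env:prod.
alerts_in_prod = passes_age_filter(first_seen, prod_first_seen, timedelta(days=1))
```

Here alerts_in_prod is False: the age-based filter excludes the issue at exactly the moment it becomes relevant in production.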

New Issue monitors are generating too much noise

New Issue monitors trigger alerts on issues marked For Review that meet your alerting criteria. If issues are not properly triaged (marked as Reviewed, Ignored, or Resolved), a New Issue monitor may trigger more than once for the same issue if the issue fluctuates between OK and ALERT states.

If your monitors are generating too much noise, consider the following adjustments:

Triage your alerts: Set issues to Reviewed, Ignored, or Resolved when appropriate.
Expand the evaluation time window: The default evaluation window is 1 day. If errors occur infrequently (for example, every other day), the monitor may switch between OK and ALERT states. Expanding the window helps prevent re-triggering and keeps the monitor in the ALERT state.
Increase the alerting threshold: The default threshold is 0, meaning alerts fire on the first occurrence of a new issue. To reduce noise from one-off or sporadic errors, increase the threshold to alert only after multiple occurrences of an error.
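The effect of the window and threshold adjustments can be seen in a small simulation. This is a simplified sketch that assumes one evaluation per day and whole-day windows. The function monitor_states and the error schedule are hypothetical and do not reflect Datadog's actual evaluation engine.

```python
def monitor_states(error_days, window_days, threshold, total_days):
    """Return the monitor state for each day index in [0, total_days)."""
    states = []
    for day in range(total_days):
        # Count errors in the trailing window that ends on `day` (inclusive).
        count = sum(1 for d in error_days if day - window_days < d <= day)
        states.append("ALERT" if count > threshold else "OK")
    return states


error_days = [0, 2, 4, 6]  # hypothetical: one error every other day

# A 1-day window flips between ALERT and OK, re-triggering each time.
one_day = monitor_states(error_days, window_days=1, threshold=0, total_days=7)

# A 3-day window always contains at least one error, so the state stays ALERT.
three_day = monitor_states(error_days, window_days=3, threshold=0, total_days=7)

# A single one-off error never exceeds a threshold of 1.
one_off = monitor_states([3], window_days=1, threshold=1, total_days=7)
```

In this sketch the 1-day window alternates between states, the 3-day window holds a steady ALERT, and the raised threshold suppresses the one-off error entirely.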