To deploy a Datadog monitor, you can use the Datadog Operator and the DatadogMonitor custom resource definition (CRD).
Create a file with the spec of your DatadogMonitor deployment configuration.
Example:
The following spec creates a metric monitor that alerts on the query avg(last_10m):avg:system.disk.in_use{*} by {host} > 0.5.
datadog-metric-monitor.yaml
apiVersion: datadoghq.com/v1alpha1
kind: DatadogMonitor
metadata:
  name: datadog-monitor-test
  namespace: datadog
spec:
  query: "avg(last_10m):avg:system.disk.in_use{*} by {host} > 0.5"
  type: "metric alert"
  name: "Test monitor made from DatadogMonitor"
  message: "1-2-3 testing"
  tags:
    - "test:datadog"
  priority: 5
  controllerOptions:
    disableRequiredTags: false
  options:
    evaluationDelay: 300
    includeTags: true
    locked: false
    newGroupDelay: 300
    notifyNoData: true
    noDataTimeframe: 30
    renotifyInterval: 1440
    thresholds:
      critical: "0.5"
      warning: "0.28"
The following table lists all available configuration fields for the DatadogMonitor custom resource.
message
required - string A message to include with notifications for this monitor.
name
required - string The monitor name.
query
required - string The monitor query.
type
required - enum The type of the monitor. Allowed enum values: metric alert, query alert, service check, event alert, log alert, process alert, rum alert, trace-analytics alert, slo alert, event-v2 alert, audit alert, composite
controllerOptions.disableRequiredTags
boolean Disables the automatic addition of required tags to monitors.
priority
int64 An integer from 1 (high) to 5 (low) indicating alert severity.
restrictedRoles
[string] A list of unique role identifiers to define which roles are allowed to edit the monitor. The unique identifiers for all roles can be pulled from the Roles API and are located in the data.id field.
tags
[string] Tags associated with your monitor.
options
object List of options associated with your monitor. See Options.
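As a minimal sketch of how the optional top-level fields fit together, the manifest below adds priority, tags, and restrictedRoles to the metric monitor from the example at the top of this page. The resource name, monitor name, tag value, and role ID are placeholders chosen for illustration, not values from a real account.

```yaml
apiVersion: datadoghq.com/v1alpha1
kind: DatadogMonitor
metadata:
  name: datadog-monitor-restricted   # hypothetical resource name
  namespace: datadog
spec:
  # Required fields
  query: "avg(last_10m):avg:system.disk.in_use{*} by {host} > 0.5"
  type: "metric alert"
  name: "Disk usage above 50%"
  message: "Disk usage is above the configured threshold."
  # Optional fields
  priority: 2                        # 1 (high) to 5 (low)
  tags:
    - "team:platform"                # illustrative tag
  restrictedRoles:
    - "00000000-0000-0000-0000-000000000000"   # placeholder; use a data.id value from the Roles API
  controllerOptions:
    disableRequiredTags: false
```

Only message, name, query, and type are required; the remaining fields can be omitted.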
Options
The following fields are set in the options property.
For example:
apiVersion: datadoghq.com/v1alpha1
kind: DatadogMonitor
metadata:
  name: datadog-monitor-test
  namespace: datadog
spec:
  query: "avg(last_10m):avg:system.disk.in_use{*} by {host} > 0.5"
  type: "metric alert"
  name: "Test monitor made from DatadogMonitor"
  message: "1-2-3 testing"
  options:
    enableLogsSample: true
    thresholds:
      critical: "0.5"
      warning: "0.28"
enableLogsSample
boolean Whether or not to send a log sample when the log monitor triggers.
escalationMessage
string A message to include with a re-notification.
evaluationDelay
int64 Time (in seconds) to delay evaluation, as a non-negative integer. For example: if the value is set to 300 (5min), the timeframe is set to last_5m, and the time is 7:00, then the monitor evaluates data from 6:50 to 6:55. This is useful for AWS CloudWatch and other backfilled metrics to ensure the monitor always has data during evaluation.
groupRetentionDuration
string The time span after which groups with missing data are dropped from the monitor state. The minimum value is one hour, and the maximum value is 72 hours. Example values are: 60m, 1h, and 2d. This option is only available for APM Trace Analytics, Audit Trail, CI, Error Tracking, Event, Logs, and RUM monitors.
groupbySimpleMonitor
boolean DEPRECATED: Whether the log alert monitor triggers a single alert or multiple alerts when any group breaches a threshold. Use notifyBy instead.
includeTags
boolean A Boolean indicating whether notifications from this monitor automatically insert its triggering tags into the title.
locked
boolean DEPRECATED: Whether or not the monitor is locked (only editable by creator and admins). Use restrictedRoles instead.
newGroupDelay
int64 Time (in seconds) to allow a host to boot and applications to fully start before starting the evaluation of monitor results. Should be a non-negative integer.
noDataTimeframe
int64 The number of minutes before a monitor notifies after data stops reporting. Datadog recommends at least 2x the monitor timeframe for metric alerts or 2 minutes for service checks. If omitted, 2x the evaluation timeframe is used for metric alerts, and 24 hours is used for service checks.
notificationPresetName
enum Toggles the display of additional content sent in the monitor notification. Allowed enum values: show_all, hide_query, hide_handles, hide_all. Default: show_all.
notifyAudit
boolean A Boolean indicating whether tagged users are notified on changes to this monitor.
notifyBy
[string] A list of tags indicating the granularity a monitor alerts on. Only available for monitors with groupings. For example, if you have a monitor grouped by cluster, namespace, and pod, and you set notifyBy to ["cluster"], then your monitor only notifies on each new cluster violating the alert conditions. Tags mentioned in notifyBy must be a subset of the grouping tags in the query. For example, a query grouped by cluster and namespace cannot notify on region. Setting notifyBy to [*] configures the monitor to notify as a simple alert.
notifyNoData
boolean A Boolean indicating whether this monitor notifies when data stops reporting. Default: false.
onMissingData
enum Controls how groups or monitors are treated if an evaluation does not return any data points. The default option results in different behavior depending on the monitor query type. For monitors using Count queries, an empty monitor evaluation is treated as 0 and is compared to the threshold conditions. For monitors using any query type other than Count, for example Gauge, Measure, or Rate, the monitor shows the last known status. This option is only available for APM Trace Analytics, Audit Trail, CI, Error Tracking, Event, Logs, and RUM monitors. Allowed enum values: default, show_no_data, show_and_notify_no_data, resolve
renotifyInterval
int64 The number of minutes after the last notification before a monitor re-notifies on the current status. It only re-notifies if it’s not resolved.
renotifyOccurrences
int64 The number of times re-notification messages should be sent on the current status at the provided re-notification interval.
renotifyStatuses
[string] The types of monitor statuses for which re-notification messages are sent. If renotifyInterval is null, defaults to null. If renotifyInterval is not null, defaults to ["Alert", "No Data"]. Values for monitor status: Alert, No Data, Warn.
requireFullWindow
boolean A Boolean indicating whether this monitor needs a full window of data before it’s evaluated. Datadog highly recommends you set this to false for sparse metrics, otherwise some evaluations are skipped. Default: false.
schedulingOptions
object Configuration options for scheduling:
schedulingOptions.customSchedule
object Configuration options for the custom schedule:
schedulingOptions.customSchedule.recurrence
[object] Array of custom schedule recurrences.
schedulingOptions.customSchedule.recurrence.rrule
string The recurrence rule in iCalendar format. For example, FREQ=MONTHLY;BYMONTHDAY=28,29,30,31;BYSETPOS=-1.
schedulingOptions.customSchedule.recurrence.start
string The start date of the recurrence rule defined in YYYY-MM-DDThh:mm:ss format. If omitted, the monitor creation time is used.
schedulingOptions.customSchedule.recurrence.timezone
string The timezone in tz database format, in which the recurrence rule is defined. For example, America/New_York or UTC.
schedulingOptions.evaluationWindow
object Configuration options for the evaluation window. If hourStarts is set, no other fields may be set. Otherwise, dayStarts and monthStarts must be set together.
schedulingOptions.evaluationWindow.dayStarts
string The time of the day at which a one day cumulative evaluation window starts. Must be defined in UTC time in HH:mm format.
schedulingOptions.evaluationWindow.hourStarts
integer The minute of the hour at which a one hour cumulative evaluation window starts.
schedulingOptions.evaluationWindow.monthStarts
integer The day of the month at which a one month cumulative evaluation window starts.
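The nesting of the scheduling fields is easier to see in YAML. The two fragments below are sketches of the options property only, written as separate YAML documents. The first sets a custom schedule recurrence using the rrule example above; the second sets a cumulative evaluation window with the day and month start fields together. The start date, timezone choice, and start times are illustrative assumptions.

```yaml
# Sketch 1: custom schedule with a single recurrence (values are illustrative)
options:
  schedulingOptions:
    customSchedule:
      recurrence:
        - rrule: "FREQ=MONTHLY;BYMONTHDAY=28,29,30,31;BYSETPOS=-1"
          start: "2024-01-01T09:00:00"     # YYYY-MM-DDThh:mm:ss; assumed date
          timezone: "America/New_York"
---
# Sketch 2: one-month cumulative window starting at 04:00 UTC on day 1
options:
  schedulingOptions:
    evaluationWindow:
      dayStarts: "04:00"    # HH:mm, in UTC
      monthStarts: 1        # day of the month
```

For a one-hour cumulative window, the evaluation window would instead contain only the hour start field, since it cannot be combined with the other two.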
thresholdWindows
object Alerting time window options:
thresholdWindows.recoveryWindow
string Describes how long an anomalous metric must be normal before the alert recovers.
thresholdWindows.triggerWindow
string Describes how long a metric must be anomalous before an alert triggers.
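A minimal sketch of the alerting time window fields inside the options property follows. The last_15m values assume the same window notation that the Datadog Monitors API uses for anomaly monitors; that notation is not stated on this page, so treat the values as an assumption to verify.

```yaml
options:
  thresholdWindows:
    # The metric must be anomalous for this long before an alert triggers
    triggerWindow: "last_15m"     # assumed notation
    # The metric must be normal for this long before the alert recovers
    recoveryWindow: "last_15m"    # assumed notation
```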
thresholds
object List of the different monitor thresholds available:
thresholds.critical
string The monitor CRITICAL threshold.
thresholds.criticalRecovery
string The monitor CRITICAL recovery threshold.
thresholds.ok
string The monitor OK threshold.
thresholds.unknown
string The monitor UNKNOWN threshold.
thresholds.warning
string The monitor WARNING threshold.
thresholds.warningRecovery
string The monitor WARNING recovery threshold.
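As a sketch, the fragment below extends the thresholds from the earlier examples with recovery thresholds. The recovery values 0.4 and 0.2 are assumptions chosen for illustration; because the example query alerts when the value rises above 0.5, each recovery threshold sits below its alert threshold. Threshold values are written as quoted strings, matching the string type listed above.

```yaml
options:
  thresholds:
    critical: "0.5"            # matches the "> 0.5" condition in the query
    criticalRecovery: "0.4"    # illustrative: recover once the value drops below 0.4
    warning: "0.28"
    warningRecovery: "0.2"     # illustrative: recover once the value drops below 0.2
```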
timeoutH
int64 The number of hours the monitor can go without reporting data before it automatically resolves from a triggered state.
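To show how several of the notification options combine, here is a hedged sketch of a complete manifest. It groups the example disk query by cluster and host so that notifyBy can name a subset of the grouping tags. The resource name, the cluster tag, the messages, and all numeric values are assumptions for illustration.

```yaml
apiVersion: datadoghq.com/v1alpha1
kind: DatadogMonitor
metadata:
  name: datadog-monitor-notify-by   # hypothetical resource name
  namespace: datadog
spec:
  # Grouped by cluster and host; assumes hosts carry a "cluster" tag
  query: "avg(last_10m):avg:system.disk.in_use{*} by {cluster,host} > 0.5"
  type: "metric alert"
  name: "Disk usage by cluster"
  message: "Disk usage is above 50% on at least one host in this cluster."
  options:
    notifyBy:
      - "cluster"            # notify per cluster instead of per host
    notifyNoData: true
    noDataTimeframe: 20      # minutes; 2x the 10-minute evaluation timeframe
    renotifyInterval: 60     # minutes between re-notifications while unresolved
    renotifyOccurrences: 3   # number of re-notification messages to send
    escalationMessage: "Disk usage is still above the threshold."
    timeoutH: 24             # hours without data before auto-resolving
    thresholds:
      critical: "0.5"
```

Because notifyBy must be a subset of the grouping tags, replacing "cluster" with a tag that is not in the by clause, such as region, would not be valid for this query.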
Further reading
Additional helpful documentation, links, and articles: