Data Observability Monitor

Overview

Data Observability monitors use anomaly detection that learns from seasonality, trends, and user feedback to catch delayed data, incomplete loads, and unexpected value changes before they affect downstream dashboards, AI applications, or business decisions. Combined with end-to-end data and code lineage, these monitors help teams detect issues early, assess downstream impact, and route to the right owner.

Data Observability monitors support the following metric types:

Table-level metric types:

  • Freshness: Tracks the time elapsed since a table was last updated.
  • Row Count: Tracks the number of rows in a table or view.
  • Custom SQL: Tracks a custom metric value returned by a SQL query.

Column-level metric types:

  • Freshness: Tracks the most recent date seen in a datetime column.
  • Uniqueness: Tracks the percentage of unique values.
  • Nullness: Tracks the percentage of null values.
  • Cardinality: Tracks the number of distinct values.
  • Percent Zero: Tracks the percentage of values equal to zero.
  • Percent Negative: Tracks the percentage of negative values.
  • Min / Max / Mean / Sum / Standard Deviation: Tracks statistical measures across column values.

Datadog collects metrics such as row count and freshness from warehouse system metadata (for example, INFORMATION_SCHEMA) when available. This avoids running a query against your warehouse and reduces compute costs. Not all warehouses expose system metadata. For metrics that cannot be collected from system metadata, the monitor runs a query directly against your warehouse to compute the value.
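As an illustration (not Datadog's internal query), a warehouse such as Snowflake exposes both of these metrics through its INFORMATION_SCHEMA, so row count and table freshness can be read from metadata without scanning the table itself:

```sql
-- Hypothetical sketch: reading row count and last-update time from
-- Snowflake's INFORMATION_SCHEMA instead of scanning the table.
-- The database, schema, and table names are illustrative.
SELECT
    TABLE_NAME,
    ROW_COUNT,      -- row count maintained by the warehouse
    LAST_ALTERED    -- timestamp of the most recent modification
FROM ANALYTICS_DB.INFORMATION_SCHEMA.TABLES
WHERE TABLE_SCHEMA = 'PROD'
  AND TABLE_NAME   = 'ORDERS';
```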

Data Observability monitors require Quality Monitoring to be set up with at least one supported data warehouse (for example, Snowflake, Databricks, or BigQuery).

Monitor creation

To create a Data Observability monitor in Datadog, navigate to Data Observability > Monitors > New Monitor or Monitors > New Monitor > Data Observability. To view all existing Data Observability monitors, see the Data Observability Monitors page.

Choose data to monitor

First, select whether to monitor the Table or Column level:

Input field for selecting entity type and inputting a query

Then, use the Edit tab to search for tables, views, or columns by typing key:value filters into the search field. The following attributes are available:

  • Name (for example, name:USERS*): Match by name. Supports * wildcards.
  • Schema (for example, schema:PROD): Match by schema.
  • Database (for example, database:ANALYTICS_DB): Match by database.
  • Account (for example, account:my_account): Match by account.

Combine filters with AND or OR, use parentheses to group conditions, and prefix with - to exclude.

Examples:

  • All tables in the PROD schema, excluding temp tables: schema:PROD AND -name:TEMP*
  • All timestamp columns: name:*_AT OR name:*_TIMESTAMP
  • Tables in either PROD or STAGING for a specific database: database:ANALYTICS_DB AND (schema:PROD OR schema:STAGING)

A single monitor can track up to 5,000 tables, views, or columns. This limit cannot be increased. If your query matches more, split them across multiple monitors.

Switch to the Source tab to see the backing query generated from your selections. The query follows this format:

search for [ENTITY_TYPE] where `[FILTER_CONDITIONS]`
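For instance, assuming the table-level entity type renders as tables, the first filter example above would produce a backing query along these lines (a sketch of the format, not verified output):

```
search for tables where `schema:PROD AND -name:TEMP*`
```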

Select your metric type

Choose a metric type based on the data quality signal you want to track. Each monitor tracks one metric type.

The Freshness metric type detects when data has not been updated within an expected time window. Use it to catch stale data before it affects downstream reports or models.

  • Table freshness tracks the time elapsed since the table was last updated. Table freshness is not available for views or for data warehouses that do not provide updated timestamps for tables in system metadata. Use column-level freshness instead.
  • Column freshness tracks the most recent date seen in a datetime column.

The Row Count metric type tracks row count changes in your tables. Use it to detect unexpected drops or spikes in data that could indicate pipeline failures or upstream issues.

Column metric types track column-level metrics to detect data drift or quality degradation. Select from the following:

  • Uniqueness: The percentage of values in a column that are unique.
  • Nullness: The percentage of values in a column that are null.
  • Cardinality: The number of distinct values in a column.
  • Percent Zero: The percentage of values in a column that are equal to zero.
  • Percent Negative: The percentage of values in a column that are negative.
  • Min: The minimum of all values in a column.
  • Max: The maximum of all values in a column.
  • Mean: The average of all values in a column.
  • Standard Deviation: The measure of variation within values in a column.
  • Sum: The sum of all values in a column.

Some column metrics are only available for specific column types. Numeric metrics (Percent Zero, Percent Negative, Min, Max, Mean, Standard Deviation, Sum) require numeric columns.
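As a conceptual illustration of how these column metrics are defined (this is not Datadog code, and the exact definitions, such as whether nulls are excluded from the percentage metrics, may differ):

```python
# Illustrative definitions of several column metrics over a sample of values.
# Assumption: null values are excluded from all metrics except Nullness.
values = [10, 0, None, -5, 0, 30, None, 10]

non_null = [v for v in values if v is not None]
nullness = 100 * (len(values) - len(non_null)) / len(values)   # % null: 25.0
cardinality = len(set(non_null))                               # distinct values: 4
uniqueness = 100 * len(set(non_null)) / len(non_null)          # % unique
percent_zero = 100 * sum(v == 0 for v in non_null) / len(non_null)
percent_negative = 100 * sum(v < 0 for v in non_null) / len(non_null)

print(nullness, cardinality, round(percent_zero, 1))  # prints: 25.0 4 33.3
```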

The Custom SQL metric type tracks a custom metric value returned by a SQL query that you define. Use it when built-in metric types do not cover your use case, such as monitoring business-specific data quality rules.

  1. Select a model type that describes the value returned by your query:
    • Default: The query returns a scalar value. Use this in most cases.
    • Freshness: The query returns the difference (in seconds) between the current time and the last time an event occurred.
    • Percentage: The query returns a percentage value between 0 and 100.
  2. Write a SQL query that returns a single value aliased as dd_value, for example: SELECT COUNT(*) as dd_value FROM ANALYTICS_DB.PROD.ORDERS WHERE STATUS = 'FAILED'
  3. Click Validate to verify your query syntax.

If your SQL query includes a GROUP BY clause, list the grouped columns as a comma-separated list in the Group by field (for example, column_a, column_b). Each group is evaluated independently.
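For example, a grouped variant of the failed-orders query above might look like the following sketch (table and column names are illustrative), with REGION listed in the Group by field:

```sql
-- Hypothetical custom metric: count of failed orders per region.
-- Each REGION group is evaluated as its own independent series.
SELECT
    REGION,
    COUNT(*) AS dd_value
FROM ANALYTICS_DB.PROD.ORDERS
WHERE STATUS = 'FAILED'
GROUP BY REGION;
```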

Note: Each Custom SQL monitor counts as an individual monitored table for billing purposes.

Input field for custom SQL monitor creation.

Configure monitor

Detection method

Select a detection method:

  • Anomaly: Alert when the metric deviates from an expected pattern. Threshold values are not required. The anomaly model requires 3 to 7 days to train (including a weekend), depending on how frequently the underlying data updates. During the training period, the monitor does not trigger alerts and is visualized in blue. After training completes, the monitor is shown in green in a normal state and red in an outlier state.
  • Threshold: Alert when the metric crosses a fixed value. Set the comparison operator (above, above or equal to, below, below or equal to, equal to, or not equal to) and define a Critical threshold (required) and optionally a Warning threshold. For more details, see Configure Monitors.

WHERE clause

Add a WHERE clause to filter the data evaluated by the monitor. This is useful for monitoring specific segments of data or only recent records. For example:

  • created_at >= DATEADD(day, -7, CURRENT_TIMESTAMP()) — only monitor rows from the past week.
  • region = 'US' — only monitor data for a specific region.

Group by

You can add a Group by clause to split a single monitor into multiple groups, each evaluated independently. For example, grouping a row count monitor by a REGION column produces a separate alert for each region.

Input field for selecting GROUP BY dimensions.

The default limit is 100 groups per monitor. To increase this limit, contact Support.

Monitor schedule

Set how often the monitor evaluates your data:

  • Hourly: The monitor runs every hour.
  • Daily: The monitor runs once per day.

Set alert conditions

Choose an aggregation type:

  • Simple Alert: Send a single notification when any monitored table or column meets the condition.
  • Multi Alert: Send a notification for each group meeting the condition. Customize which dimensions to group by (for example, table, schema, database) to control alert granularity. For example, grouping by schema only sends one alert per schema, bundling all affected tables together to reduce noise.

Example notifications

Threshold monitor:

{{#is_alert}}
Data quality issue detected on {{database.name}}.{{schema.name}}.{{table.name}}:
current value {{value}} has breached the threshold of {{threshold}}.
{{/is_alert}}

{{#is_recovery}}
Data quality issue on {{database.name}}.{{schema.name}}.{{table.name}} has recovered.
Current value {{value}} is within the threshold of {{threshold}}.
{{/is_recovery}}

Anomaly monitor:

{{#is_alert}}
Anomaly detected on {{database.name}}.{{schema.name}}.{{table.name}}:
observed value {{observed}} is outside the expected range of {{lower_bound}} to {{upper_bound}}
(predicted: {{predicted}}).
{{/is_alert}}

{{#is_recovery}}
{{database.name}}.{{schema.name}}.{{table.name}} has recovered.
Observed value {{observed}} is within the expected range.
{{/is_recovery}}

Example monitors

Detect a significant decrease in row count that could indicate a pipeline failure or missing data.

  1. Select Table > Row Count and choose the target table (for example, ANALYTICS_DB.PROD.EVENTS).
  2. Select Anomaly as the detection method. The monitor triggers when the row count deviates from its historical baseline.

Alert when a critical table has not been updated within the expected time window.

  1. Select Table > Freshness and choose the target table (for example, ANALYTICS_DB.PROD.ORDERS).
  2. Select Threshold as the detection method.
  3. Set the Critical threshold to 6 hours and, optionally, a Warning threshold at 4 hours.

Detect when a column’s null percentage exceeds normal levels, which may indicate data ingestion issues.

  1. Select Column > Nullness and choose the target column (for example, ANALYTICS_DB.PROD.USERS.EMAIL).
  2. Select Anomaly as the detection method.

Annotate bounds

For monitors using the Anomaly detection method, you can annotate bound ranges to provide feedback and improve the model over time. Unlike infrastructure metrics, data quality metrics are often business-specific, so use annotations to teach the model what behavior is normal for your data.

Hover menu for annotating a monitor bound.

On a monitor’s status page, click Annotate Bounds, select a time range on the chart, and choose one of the following annotations:

  • Expected: Expand bounds to include the marked behavior permanently.
  • Reset for now: Mark behavior as OK, but alert if it happens again.
  • Missed alert: Contract bounds to alert on this behavior.
  • Ignore: Exclude annotated data when modeling bounds.

Further Reading