Data Observability is in Preview.
Data Observability helps data teams detect, resolve, and prevent issues that impact data quality, performance, and cost. It enables teams to monitor anomalies, troubleshoot faster, and maintain trust in the data powering downstream systems.
Datadog makes this possible by monitoring key signals across your data stack, including metrics, metadata, lineage, and logs. These signals surface issues early and help you keep data reliable and high quality.
## Key capabilities
With Data Observability, you can:
- Detect anomalies in volume, freshness, null rates, and distributions
- Analyze lineage to trace data dependencies from source to dashboard
- Integrate with pipelines to correlate issues with job runs, data streams, and infrastructure events
## Monitor data quality
Datadog continuously tracks metrics and metadata, including:
- Data metrics such as null count, null percentage, uniqueness, mean, and standard deviation
- Metadata such as schema, row count, and freshness
You can configure static thresholds or rely on automatic anomaly detection to identify unexpected changes, including:
- Missing or delayed updates
- Unexpected row count changes
- Outliers in key metrics
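The two detection modes above can be illustrated with a short sketch. This is not Datadog's detection algorithm or API, just a minimal stand-in: a fixed-range check for static thresholds, and a z-score test on historical values as a simple proxy for automatic anomaly detection. The table metric and its values are hypothetical.

```python
from statistics import mean, stdev

# Hypothetical daily row counts for one table (illustrative values only).
row_counts = [1000, 1020, 980, 1010, 990, 1005, 400]  # last value drops sharply

def static_threshold_check(value, low, high):
    """Static threshold: flag a metric outside a fixed acceptable range."""
    return low <= value <= high

def zscore_anomaly(history, value, threshold=3.0):
    """Simple anomaly proxy: flag a value more than `threshold` standard
    deviations from the historical mean."""
    mu, sigma = mean(history), stdev(history)
    return abs(value - mu) > threshold * sigma

history, latest = row_counts[:-1], row_counts[-1]
print(static_threshold_check(latest, low=900, high=1100))  # False: out of range
print(zscore_anomaly(history, latest))                     # True: sudden drop
```

Static thresholds suit metrics with known acceptable bounds; the statistical check adapts to each table's own history, which is closer in spirit to automatic detection of unexpected row count changes and outliers.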
## Trace lineage and understand impact
Data Observability provides end-to-end lineage, helping you:
- Visualize dependencies between tables, columns, and dashboards
- Identify upstream root causes and assess downstream impact
- Debug faster and make safer schema or transformation changes
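Conceptually, lineage is a dependency graph: walking it downstream gives the blast radius of an issue, and walking it upstream gives root-cause candidates. The sketch below shows that idea on a toy graph; the table and dashboard names are invented for illustration and this is not how Datadog stores lineage internally.

```python
# Toy lineage graph: each asset maps to the assets that consume it
# (downstream edges). All names are illustrative.
lineage = {
    "raw.orders": ["staging.orders"],
    "staging.orders": ["analytics.daily_revenue"],
    "analytics.daily_revenue": ["dashboard.revenue"],
    "dashboard.revenue": [],
}

def downstream(node, graph):
    """All assets affected by an issue in `node` (depth-first walk)."""
    impacted, stack = set(), [node]
    while stack:
        for child in graph[stack.pop()]:
            if child not in impacted:
                impacted.add(child)
                stack.append(child)
    return impacted

def upstream(node, graph):
    """All assets `node` depends on -- candidates for the root cause."""
    reversed_graph = {n: [] for n in graph}
    for parent, children in graph.items():
        for child in children:
            reversed_graph[child].append(parent)
    return downstream(node, reversed_graph)

print(sorted(downstream("staging.orders", lineage)))
# ['analytics.daily_revenue', 'dashboard.revenue']
print(sorted(upstream("dashboard.revenue", lineage)))
# ['analytics.daily_revenue', 'raw.orders', 'staging.orders']
```

The same traversal supports safer schema changes: before altering `staging.orders`, the downstream walk tells you exactly which analytics tables and dashboards to verify.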
## Correlate with pipeline and infrastructure activity
Understand how pipeline activity and infrastructure events impact your data. Datadog ingests logs and metadata from pipeline tools and user interactions to provide context for data quality issues, including:
- Job failures or delays (for example, Spark, Airflow)
- Query activity and dashboard usage (for example, Tableau)
This operational context helps you trace the source of data incidents and respond faster.
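The core of this correlation can be sketched as a time-window join between pipeline events and a data incident. This is an assumption-laden toy, not Datadog's correlation logic: the job names, statuses, and timestamps are all invented, and the "correlation" is simply "failed runs shortly before the incident".

```python
from datetime import datetime, timedelta

# Illustrative pipeline events and a detected data incident (made-up data).
job_runs = [
    {"job": "airflow.load_orders", "status": "failed",
     "time": datetime(2024, 5, 1, 2, 15)},
    {"job": "spark.enrich_orders", "status": "success",
     "time": datetime(2024, 5, 1, 3, 0)},
]
incident_time = datetime(2024, 5, 1, 2, 45)  # e.g. a freshness anomaly

def correlated_runs(runs, incident, window=timedelta(hours=1)):
    """Return failed job runs within `window` before the incident --
    a simple way to surface likely operational causes."""
    return [r for r in runs
            if r["status"] == "failed"
            and timedelta(0) <= incident - r["time"] <= window]

print([r["job"] for r in correlated_runs(job_runs, incident_time)])
# ['airflow.load_orders']
```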
## Further reading
Additional helpful documentation, links, and articles: