---
title: Agent Observability
description: >-
  Overview of Agent Observability, a platform for monitoring, troubleshooting,
  and improving LLM applications and AI agents.
breadcrumbs: Docs > Agent Observability
---

> For the complete documentation index, see [llms.txt](https://docs.datadoghq.com/llms.txt).

# Agent Observability

{% callout %}
# Important note for users on the following Datadog sites: app.ddog-gov.com, us2.ddog-gov.com

{% alert level="danger" %}
This product is not supported for your selected [Datadog site](https://docs.datadoghq.com/getting_started/site.md). ({% placeholder "user-datadog-site-name" /%}).
{% /alert %}

{% /callout %}

{% callout %}
##### Try Getting Started with Agent Observability in the Learning Center

Learn how to monitor your LLM application's performance, costs, traces, token usage, and errors to identify and resolve issues.

[ENROLL NOW](https://learn.datadoghq.com/courses/llm-obs-getting-started)
{% /callout %}

## Overview{% #overview %}

With Agent Observability, you can monitor, troubleshoot, and evaluate your LLM-powered applications, such as chatbots. You can investigate the root cause of issues, monitor operational performance, and evaluate the quality, privacy, and safety of your LLM applications.

Each request fulfilled by your application is represented as a trace on the [**Agent Observability** page](https://app.datadoghq.com/llm/traces) in Datadog.

{% image
   source="https://docs.dd-static.net/images/llm_observability/traces.03f0666d1900ffe55652104872112023.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/traces.03f0666d1900ffe55652104872112023.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="A list of prompt-response pair traces on the Agent Observability page" /%}

A trace can represent:

- An individual LLM inference, including tokens, error information, and latency
- A predetermined LLM workflow, which is a grouping of LLM calls and their contextual operations, such as tool calls or preprocessing steps
- A dynamic LLM workflow executed by an LLM agent

Each trace contains spans representing each choice made by an agent or each step of a given workflow. A given trace can also include input and output, latency, privacy issues, errors, and more. For more information, see [Terms and Concepts](https://docs.datadoghq.com/llm_observability/terms.md).

## Troubleshoot with end-to-end tracing{% #troubleshoot-with-end-to-end-tracing %}

View every step of your LLM application chains and calls to pinpoint problematic requests and identify the root cause of errors.

{% image
   source="https://docs.dd-static.net/images/llm_observability/errors.f7f6a86f8ea4da499fde4a9d410a56ed.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/errors.f7f6a86f8ea4da499fde4a9d410a56ed.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="Errors that occurred in a trace on the Errors tab in a trace side panel" /%}

## Monitor operational metrics and optimize cost{% #monitor-operational-metrics-and-optimize-cost %}

Monitor the cost, latency, performance, and usage trends for all your LLM applications with [out-of-the-box dashboards](https://app.datadoghq.com/dash/integration/llm_operational_insights).

{% image
   source="https://docs.dd-static.net/images/llm_observability/dashboard_1.03b09c80d39dcf44341e6418dc7acd13.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/dashboard_1.03b09c80d39dcf44341e6418dc7acd13.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="The out-of-the-box Agent Observability Operational Insights dashboard in Datadog" /%}

## Evaluate the quality and effectiveness of your LLM applications{% #evaluate-the-quality-and-effectiveness-of-your-llm-applications %}

Understand what users are asking your LLM application, identify coverage gaps, and monitor the quality of responses over time with [Patterns](https://docs.datadoghq.com/llm_observability/monitoring/patterns.md) — automated hierarchical topic clustering of your production traffic.

{% image
   source="https://docs.dd-static.net/images/llm_observability/patterns_topic_details.0fb4583b27f22b295bf8a91b615cc0a4.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/patterns_topic_details.0fb4583b27f22b295bf8a91b615cc0a4.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="The topic detail view showing a summary of the topic, the total interaction count, and a table of interactions with child topic label, input text, and timestamp." /%}

## Safeguard sensitive data and identify malicious users{% #safeguard-sensitive-data-and-identify-malicious-users %}

Automatically scan and redact any sensitive data in your AI applications and identify prompt injections, among other evaluations.

{% image
   source="https://docs.dd-static.net/images/llm_observability/prompt_injection.f4560358ad3e0f4c563c50aee3483b7e.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/prompt_injection.f4560358ad3e0f4c563c50aee3483b7e.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="An example of a prompt-injection attempt detected by Agent Observability" /%}

## See anomalies highlighted as insights{% #see-anomalies-highlighted-as-insights %}

Agent Observability Insights provides a monitoring experience that helps users identify anomalies in their operational metrics—such as duration and error rate—and their [out-of-the-box (OOTB) evaluations](https://docs.datadoghq.com/llm_observability/evaluations/managed_evaluations.md).

Outlier detection is performed across key dimensions:

- Span name
- Workflow type
- [Patterns input/output topics](https://docs.datadoghq.com/llm_observability/monitoring/patterns.md)

These outliers are analyzed over the past week and automatically surfaced in the corresponding time window selected by the user. This enables teams to proactively detect regressions, performance drifts, or unexpected behavior in their LLM applications.

{% image
   source="https://docs.dd-static.net/images/llm_observability/Overview_LLMO.6e45fefe74cdda6a96f0d2bdaeb91555.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/Overview_LLMO.6e45fefe74cdda6a96f0d2bdaeb91555.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="An 'Insights' banner across the top of the Agent Observability Monitor page. The banner displays 8 insights and has a View Insights button that leads to a side panel with further details." /%}

## Use integrations with Agent Observability{% #use-integrations-with-agent-observability %}

The [Agent Observability SDK for Python](https://docs.datadoghq.com/llm_observability/setup/sdk.md) integrates with frameworks such as OpenAI, LangChain, AWS Bedrock, and Anthropic. It automatically traces and annotate LLM calls, capturing latency, errors, and token usage metrics—without code changes.

{% alert level="info" %}
Datadog offers a variety of artificial intelligence (AI) and machine learning (ML) capabilities. The [AI/ML integrations on the Integrations page and the Datadog Marketplace](https://docs.datadoghq.com/integrations.md#cat-aiml) are platform-wide Datadog functionalities.  For example, APM offers a native integration with OpenAI for monitoring your OpenAI usage, while Infrastructure Monitoring offers an integration with NVIDIA DCGM Exporter for monitoring compute-intensive AI workloads. These integrations are different from the Agent Observability offering.
{% /alert %}

For more information, see the [Auto Instrumentation documentation](https://docs.datadoghq.com/llm_observability/setup/auto_instrumentation.md).

## Ready to start?{% #ready-to-start %}

See the [Setup documentation](https://docs.datadoghq.com/llm_observability/setup.md) for instructions on instrumenting your LLM application or follow the [Trace an LLM Application guide](https://docs.datadoghq.com/llm_observability/quickstart.md) to generate a trace using the [Agent Observability SDK for Python](https://docs.datadoghq.com/llm_observability/setup/sdk.md).

## Further Reading{% #further-reading %}

- [Tracing LLM Applications](https://learn.datadoghq.com/courses/llm-obs-tracing-llm-applications)
- [Investigate with LLM Observability](https://learn.datadoghq.com/courses/llm-obs-investigations)
- [Evaluate, optimize, and secure your Google Cloud AI stack with Datadog](https://www.datadoghq.com/blog/datadog-google-cloud-ai-stack/)
- [Offline evaluation for AI agents: Best practices](https://www.datadoghq.com/blog/offline-llm-evaluations/)
- [How we built a real-world evaluation platform for autonomous SRE agents at scale](https://www.datadoghq.com/blog/engineering/bits-ai-eval-platform/)
- [Building reliable dashboard agents with Datadog LLM Observability](https://www.datadoghq.com/blog/llm-observability-at-datadog-dashboards)
- [Driving AI ROI: How Datadog connects cost, performance, and infrastructure so you can scale responsibly](https://www.datadoghq.com/blog/manage-ai-cost-and-performance-with-datadog/)
- [Datadog LLM Observability natively supports OpenTelemetry GenAI Semantic Conventions](https://www.datadoghq.com/blog/llm-otel-semantic-convention)
- [Gain visibility into Strands Agents workflows with Datadog LLM Observability](https://www.datadoghq.com/blog/llm-aws-strands)
- [Monitor your Anthropic applications with Datadog LLM Observability](https://www.datadoghq.com/blog/anthropic-integration-datadog-llm-observability/)
- [Best practices for monitoring LLM prompt injection attacks to protect sensitive data](https://www.datadoghq.com/blog/monitor-llm-prompt-injection-attacks/)
- [Optimize LLM application performance with Datadog's vLLM integration](https://www.datadoghq.com/blog/vllm-integration/)
- [Optimize and troubleshoot AI infrastructure with Datadog GPU Monitoring](https://www.datadoghq.com/blog/datadog-gpu-monitoring/)
- [Monitor agents built on Amazon Bedrock with Datadog LLM Observability](https://www.datadoghq.com/blog/llm-observability-bedrock-agents/)
- [Identify common security risks in MCP servers](https://www.datadoghq.com/blog/monitor-mcp-servers/)
- [Abusing AI infrastructure: How mismanaged credentials and resources expose LLM applications](https://www.datadoghq.com/blog/detect-abuse-ai-infrastructure/)
- [How we cut our NLQ agent debugging time from hours to minutes with LLM Observability](https://www.datadoghq.com/blog/llm-observability-at-datadog-nlq)