---
title: LLM Observability
description: Datadog, the leading service for cloud-scale monitoring.
---

# LLM Observability

This dataset represents LLM Observability events collected by Datadog. It provides per-span visibility into LLM applications, including request/response payloads, token usage, costs, evaluation outcomes, and experiment metadata. This enables analysis of LLM performance, quality, and cost across projects, experiments, models, and applications.

```
dd.llm_observability
```

Related public documentation: LLM Observability, Monitor LLM, and LLM Observability Experiments.

## Query Parameters

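This dataset is exposed as a **polymorphic table function**, so you query it by calling `dd.llm_observability(...)` with named parameters rather than by selecting from a plain table. The `columns` parameter is required, and each example on this page also declares the output column names and types in an `AS (...)` clause.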

| Parameter        | Type            | Required | Description                                                                                                                      |
| ---------------- | --------------- | -------- | -------------------------------------------------------------------------------------------------------------------------------- |
| `columns`        | `array<string>` | Yes      | List of fields to return for each LLM span (e.g., 'timestamp', '@ml_app', '@metrics.total_tokens').                              |
| `scope`          | `string`        | No       | Optional scope filter to constrain which LLM telemetry scope is queried (e.g., scope => 'experiments' or scope => 'production'). |
| `event_type`     | `string`        | No       | Optional event type selector for the underlying telemetry (e.g., event_type => 'span' or event_type => 'evaluation').            |
| `filter`         | `string`        | No       | Optional EVP search string. For example: filter => '@ml_app:some_app_name AND @status:error'.                                    |
| `from_timestamp` | `string`        | No       | Lower time bound for the query; defaults to query context if omitted.                                                            |
| `to_timestamp`   | `string`        | No       | Upper time bound for the query; defaults to query context if omitted.                                                            |
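
As a minimal sketch of how these parameters fit together, the query below passes only the required `columns` parameter plus the optional `filter`, reusing the sample values from the table above. The aliases in the `AS (...)` clause (`ts`, `ml_app`, `total_tokens`) and the app name `some_app_name` are illustrative placeholders. The assumption that declared columns map positionally onto the `columns` array is inferred from the examples on this page, not stated by it.

```sql
-- Minimal sketch: required `columns` plus an optional `filter`.
-- Assumption: the AS (...) list maps positionally onto the `columns` array.
SELECT * FROM dd.llm_observability(
  columns => ARRAY[
    'timestamp',
    '@ml_app',
    '@metrics.total_tokens'
  ],
  -- Placeholder app name; replace with your own.
  filter => '@ml_app:some_app_name AND @status:error'
) AS (
  ts TIMESTAMP,
  ml_app VARCHAR,
  total_tokens BIGINT
);
```

Because `from_timestamp` and `to_timestamp` are omitted here, the time range falls back to the query context, as described in the table.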

## Example Queries

```sql
-- Analyze token usage and duration for Anthropic spans
SELECT * FROM dd.llm_observability(
  columns => ARRAY[
    'discovery_timestamp',
    '@ml_app',
    '@name',
    '@status',
    '@meta.model_name',
    '@meta.model_provider',
    '@metrics.input_tokens',
    '@metrics.output_tokens',
    '@metrics.total_tokens',
    '@duration'
  ],
  event_type => 'span',
  filter => '@meta.model_provider:anthropic'
) AS (
  ts TIMESTAMP,
  ml_app VARCHAR,
  span_name VARCHAR,
  span_status VARCHAR,
  model_name VARCHAR,
  model_provider VARCHAR,
  input_tokens BIGINT,
  output_tokens BIGINT,
  total_tokens BIGINT,
  duration BIGINT
);
```
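
The same table function call can, in principle, feed a regular aggregation. The sketch below assumes that the result of `dd.llm_observability(...)` behaves like any other relation in DDSQL, so that `GROUP BY` and aggregate functions apply to it. The output names (`span_count`, `sum_total_tokens`, `avg_duration`) are hypothetical.

```sql
-- Sketch: summarize token usage and duration per model for Anthropic spans.
-- Assumption: the table function result can be aggregated like a normal table.
SELECT
  model_name,
  COUNT(*) AS span_count,
  SUM(total_tokens) AS sum_total_tokens,
  AVG(duration) AS avg_duration
FROM dd.llm_observability(
  columns => ARRAY[
    '@meta.model_name',
    '@metrics.total_tokens',
    '@duration'
  ],
  event_type => 'span',
  filter => '@meta.model_provider:anthropic'
) AS (
  model_name VARCHAR,
  total_tokens BIGINT,
  duration BIGINT
)
GROUP BY model_name
ORDER BY sum_total_tokens DESC;
```

Requesting only the columns the aggregation needs keeps the `AS (...)` declaration short and easy to check against the `columns` array.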

```sql
-- Analyze evaluation metrics for an experiment
SELECT * FROM dd.llm_observability(
  columns => ARRAY[
    'discovery_timestamp',
    '@span_id',
    '@meta.input.job_title',
    '@meta.output.persona',
    'dataset_record_id',
    '@evaluation.external.exact_match.value',
    'experiment_id'
  ],
  scope => 'experiments',
  event_type => 'span',
  filter => 'experiment_id:some_experiment_id',
  from_timestamp => TIMESTAMP '2026-01-01 00:00:00.000+00:00',
  to_timestamp   => TIMESTAMP '2026-01-05 00:00:00.000+00:00'
) AS (
  ts TIMESTAMP,
  span_id VARCHAR,
  job_title VARCHAR,
  persona VARCHAR,
  dataset_record_id VARCHAR,
  exact_match BOOLEAN,
  experiment_id VARCHAR
);
```
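
To turn per-record evaluation results into a summary, a query along the following lines may help. It is a sketch only: it assumes the table function result supports standard aggregation, it keeps `some_experiment_id` as a placeholder, and the output names `record_count` and `exact_match_count` are hypothetical.

```sql
-- Sketch: count how many experiment records passed the exact_match evaluation.
-- Assumption: standard aggregates and CASE expressions apply to the result.
SELECT
  experiment_id,
  COUNT(*) AS record_count,
  -- Rows where exact_match is NULL fall into the ELSE branch and count as 0.
  SUM(CASE WHEN exact_match THEN 1 ELSE 0 END) AS exact_match_count
FROM dd.llm_observability(
  columns => ARRAY[
    'experiment_id',
    '@evaluation.external.exact_match.value'
  ],
  scope => 'experiments',
  event_type => 'span',
  filter => 'experiment_id:some_experiment_id'
) AS (
  experiment_id VARCHAR,
  exact_match BOOLEAN
)
GROUP BY experiment_id;
```

Dividing `exact_match_count` by `record_count` in the consuming tool gives a pass rate without depending on how the engine handles integer division.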

## Fields

| Title                                 | ID                                                  | Type            | Data Type     | Description                                                                                                                                                       |
| ------------------------------------- | --------------------------------------------------- | --------------- | ------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Timestamp                             | timestamp                                           | core            | timestamp     | The time when the event occurred, as reported by the source (milliseconds since Unix epoch).                                                                      |
| Source                                | source                                              | core            | string        | Source of the event (e.g., integration, k9-saist). Applies to: span, evaluation.                                                                                  |
| Event Status                          | status                                              | core            | string        | Top-level event status (e.g., info, ok). Applies to: span, evaluation.                                                                                            |
| Environment                           | env                                                 | core            | string        | Environment associated with the event (e.g., staging, prod). Applies to: span.                                                                                    |
| Service                               | service                                             | core            | string        | Service associated with the event (e.g., nlq_translation, hallucination_demo). Applies to: span.                                                                  |
| Org ID                                | org_id                                              | core            | int64         | Organization identifier associated with this event when tagged (e.g., 2). Applies to: span, evaluation.                                                           |
| Trace ID                              | @trace_id                                           | event_attribute | string        | Unique identifier of the trace this span belongs to (e.g., '6045057188986015289'). Applies to: span, evaluation.                                                  |
| Span ID                               | @span_id                                            | event_attribute | string        | Unique identifier for the LLM span (for span events: the span itself; for evaluation events: the evaluated span). Applies to: span, evaluation.                   |
| Parent Span ID                        | parent_id                                           | core            | string        | Identifier of the parent span when present (e.g., 'undefined'). Applies to: span.                                                                                 |
| Event Type                            | @event_type                                         | event_attribute | string        | Type of event within LLM Observability ('span' or 'evaluation').                                                                                                  |
| ML App                                | @ml_app                                             | event_attribute | string        | Name of the LLM/ML application emitting the span (e.g., assistant_evaluation). Applies to: span.                                                                  |
| Span Name                             | @name                                               | event_attribute | string        | Logical span name, often mapped to an experiment or evaluation ID (e.g., 'fetch_one'). Applies to: span.                                                          |
| Span Start Time (ns)                  | @start_ns                                           | event_attribute | int64         | Start time of the LLM span in nanoseconds since epoch (e.g., 1762595357539309600). Applies to: span.                                                              |
| Span Duration                         | @duration                                           | event_attribute | int64         | Duration of the LLM span in nanoseconds (e.g., 106438139). Applies to: span.                                                                                      |
| Metric Source                         | @metric_source                                      | event_attribute | string        | Origin of the metric or signal within LLM Observability (e.g., custom, summary). Applies to: evaluation.                                                          |
| Label                                 | @label                                              | event_attribute | string        | Label or name of the evaluation metric (e.g., 'recall_at_k', 'theoretical_row_recall_average_pairs'). Applies to: evaluation.                                     |
| Language                              | language                                            | core            | string        | Language associated with the span (e.g., 'python', 'go'). Applies to: span.                                                                                       |
| Input Tokens                          | @metrics.input_tokens                               | event_attribute | int64         | Number of input tokens for the LLM request (e.g., 24600). Applies to: span.                                                                                       |
| Non-Cached Input Tokens               | @metrics.non_cached_input_tokens                    | event_attribute | int64         | Number of input tokens that were not served from cache (e.g., 19000). Applies to: span.                                                                           |
| Output Tokens                         | @metrics.output_tokens                              | event_attribute | int64         | Number of tokens in the LLM response (e.g., 150). Applies to: span.                                                                                               |
| Total Tokens                          | @metrics.total_tokens                               | event_attribute | int64         | Total tokens accounted for the span. Applies to: span.                                                                                                            |
| Estimated Input Cost                  | @metrics.estimated_input_cost                       | event_attribute | int64         | Estimated cost attributed to input tokens (e.g., 12637000). Applies to: span.                                                                                     |
| Estimated Output Cost                 | @metrics.estimated_output_cost                      | event_attribute | int64         | Estimated cost attributed to output tokens. Applies to: span.                                                                                                     |
| Estimated Non-Cached Input Cost       | @metrics.estimated_non_cached_input_cost            | event_attribute | int64         | Estimated cost attributed to non-cached input tokens. Applies to: span.                                                                                           |
| Estimated Total Cost                  | @metrics.estimated_total_cost                       | event_attribute | int64         | Estimated total LLM cost for this span. Applies to: span.                                                                                                         |
| Cache Read Input Tokens               | @metrics.cache_read_input_tokens                    | event_attribute | int64         | Number of input tokens read from cache. Applies to: span.                                                                                                         |
| Cache Write Input Tokens              | @metrics.cache_write_input_tokens                   | event_attribute | int64         | Number of input tokens written to cache. Applies to: span.                                                                                                        |
| Estimated Cache Read Input Cost       | @metrics.estimated_cache_read_input_cost            | event_attribute | int64         | Estimated cost attributed to cached input token reads. Applies to: span.                                                                                          |
| Estimated Cache Write Input Cost      | @metrics.estimated_cache_write_input_cost           | event_attribute | int64         | Estimated cost attributed to cached input token writes. Applies to: span.                                                                                         |
| Num Evaluations Failed                | @metrics.num_evaluations_failed                     | event_attribute | int64         | Number of managed/custom evaluations that failed on this span. Applies to: span.                                                                                  |
| Num Evaluations Passed                | @metrics.num_evaluations_passed                     | event_attribute | int64         | Number of managed/custom evaluations that passed on this span. Applies to: span.                                                                                  |
| Num Evaluations Without Assessment    | @metrics.num_evaluations_without_assessment         | event_attribute | int64         | Number of evaluations executed without an assessment result. Applies to: span.                                                                                    |
| Model Name                            | @meta.model_name                                    | event_attribute | string        | Name of the model used for this span (e.g., gpt-5, gpt-4.1). Applies to: span.                                                                                    |
| Model Provider                        | @meta.model_provider                                | event_attribute | string        | Provider of the model (e.g., openai, anthropic). Applies to: span.                                                                                                |
| Span Kind                             | @meta.span.kind                                     | event_attribute | string        | Kind of LLM span (e.g., llm, embedding, agent, workflow). Applies to: span.                                                                                       |
| UI Title                              | @meta.metadata.ui_title                             | event_attribute | string        | Human-readable title used for UI rendering (e.g., 'Generated widget'). Applies to: span.                                                                          |
| UI Content                            | @meta.metadata.ui_content                           | event_attribute | string        | UI-renderable content associated with this span (e.g., Data source: `metrics` Query: `sum:dd.services.pods{*}`). Applies to: span.                                |
| Input Value                           | @meta.input.value                                   | event_attribute | string        | Primary input value to the LLM or evaluation (e.g., 'Hello. I need help'). Applies to: span.                                                                      |
| Output Value                          | @meta.output.value                                  | event_attribute | string        | Primary output value from the LLM or evaluation (e.g., Hello! How can I help...). Applies to: span.                                                               |
| Prompt ID                             | @meta.input.prompt.id                               | event_attribute | string        | Identifier of the prompt used (e.g., generate_answer_prompt). Applies to: span.                                                                                   |
| Prompt Template                       | @meta.input.prompt.template                         | event_attribute | string        | Prompt template string used for rendering the final prompt. Applies to: span.                                                                                     |
| Prompt Template ID                    | @meta.input.prompt.template_id                      | event_attribute | string        | Stable identifier/hash for the template. Applies to: span.                                                                                                        |
| Prompt Version ID                     | @meta.input.prompt.version_id                       | event_attribute | string        | Version identifier/hash for the prompt instance. Applies to: span.                                                                                                |
| Evaluation Source Type                | @eval_source_type                                   | event_attribute | string        | Source of the evaluation (e.g., external). Applies to: evaluation.                                                                                                |
| Evaluation Metric Type                | @eval_metric_type                                   | event_attribute | string        | Type of evaluation metric used (e.g., 'score', 'categorical'). Applies to: evaluation.                                                                            |
| Score Value                           | @score_value                                        | event_attribute | float64       | Numeric score value for 'score' metrics (float; e.g., 0.79). Applies to: evaluation.                                                                              |
| Failure to Answer Error Message       | @evaluation.managed.failure_to_answer.error.message | event_attribute | string        | Error message emitted by the managed 'failure_to_answer' evaluation (e.g., 'Not a root span - This happens in case the eval is configured...'). Applies to: span. |
| Failure to Answer Error Type          | @evaluation.managed.failure_to_answer.error.type    | event_attribute | string        | Error type emitted by the managed 'failure_to_answer' evaluation (e.g., Azure OpenAI - Received HTTP 429). Applies to: span.                                      |
| Failure to Answer Status              | @evaluation.managed.failure_to_answer.status        | event_attribute | string        | Status emitted by the managed 'failure_to_answer' evaluation (e.g., WARN). Applies to: span.                                                                      |
| Goal Completeness Error Message       | @evaluation.managed.goal_completeness.error.message | event_attribute | string        | Error message emitted by the managed 'goal_completeness' evaluation. Applies to: span.                                                                            |
| Goal Completeness Error Type          | @evaluation.managed.goal_completeness.error.type    | event_attribute | string        | Error type emitted by the managed 'goal_completeness' evaluation. Applies to: span.                                                                               |
| Goal Completeness Status              | @evaluation.managed.goal_completeness.status        | event_attribute | string        | Status emitted by the managed 'goal_completeness' evaluation. Applies to: span.                                                                                   |
| Hallucination Status                  | @evaluation.managed.hallucination.status            | event_attribute | string        | Status emitted by the managed hallucination/faithfulness evaluation. Applies to: span.                                                                            |
| Hallucination Value                   | @evaluation.managed.hallucination.value             | event_attribute | string        | Value emitted by the managed hallucination/faithfulness evaluation (e.g., 'hallucination found'). Applies to: span.                                               |
| Hallucination Score Value             | @evaluation.managed.hallucination.score_value       | event_attribute | float64       | Score emitted by the managed hallucination/faithfulness evaluation. Applies to: span.                                                                             |
| Hallucination Eval Metric Type        | @evaluation.managed.hallucination.eval_metric_type  | event_attribute | string        | Metric type emitted by the managed hallucination/faithfulness evaluation (e.g., categorical). Applies to: span.                                                   |
| Hallucination Categorical Value       | @evaluation.managed.hallucination.categorical_value | event_attribute | string        | Categorical value emitted by the managed hallucination/faithfulness evaluation. Applies to: span.                                                                 |
| External Harmfulness Eval Metric Type | @evaluation.external.harmfulness.eval_metric_type   | event_attribute | string        | Metric type for an external evaluation (e.g., score). Applies to: span.                                                                                           |
| External Harmfulness Value            | @evaluation.external.harmfulness.value              | event_attribute | float64       | Value for an external evaluation (often equals score_value for score types). Applies to: span.                                                                    |
| Evaluation Metric ID                  | @id                                                 | event_attribute | string        | ID for the evaluation record (UUID). Applies to: evaluation.                                                                                                      |
| Session ID                            | @session_id                                         | event_attribute | string        | Identifier used to correlate spans belonging to the same user or agent session. Applies to: span.                                                                 |
| Billing Usage Attribution Tags        | billing_header.usage_attribution_tags               | core            | array<string> | Tags used for cost attribution and billing purposes (e.g., ml_app:curated_dataset). Applies to: span.                                                             |
| Billable                              | billing_header.billable                             | core            | bool          | Whether the event is billable. Applies to: span.                                                                                                                  |
| Quality Evaluation Results            | @meta.evaluations.quality                           | event_attribute | array<string> | List of quality evaluations that matched content in this span (e.g., ['Hallucination']). Applies to: span.                                                        |
| Security Evaluation Results           | @meta.evaluations.security                          | event_attribute | array<string> | List of security scanners that matched content in this span (e.g., ['Standard Email Address','Standard Email Address']). Applies to: span.                        |
| Evaluation Timestamp                  | @eval_timestamp                                     | event_attribute | int64         | Timestamp (in milliseconds since epoch) at which the evaluation was executed (e.g., 1766403898176). Applies to: evaluation.                                       |
| Metric Type                           | @metric_type                                        | event_attribute | string        | Metric value type for the evaluation (e.g., 'score', 'boolean'). Applies to: evaluation.                                                                          |
| Error Message                         | @error.message                                      | event_attribute | string        | Error message for failed evaluations (e.g., "'NoneType' object has no attribute 'response'"). Applies to: evaluation.                                             |
| Error Type                            | @error.type                                         | event_attribute | string        | Error type for failed evaluations (e.g., 'AttributeError'). Applies to: evaluation.                                                                               |
| Error Stack Trace                     | @error.stack                                        | event_attribute | string        | Stack trace for failed evaluations. Applies to: evaluation.                                                                                                       |
| Experiment ID                         | experiment_id                                       | core            | string        | Identifier of the experiment associated with this evaluation metric. Applies to: span, evaluation.                                                                |
| Dataset ID                            | dataset_id                                          | core            | string        | Identifier of the dataset associated with this evaluation metric. Applies to: evaluation.                                                                         |
| Project ID                            | project_id                                          | core            | string        | Identifier of the project associated with this evaluation metric (typically from tags). Applies to: evaluation.                                                   |
| Event ID                              | id                                                  | core            | string        | A unique identifier for the event.                                                                                                                                |
| Discovery Timestamp                   | discovery_timestamp                                 | core            | int64         | The time when Datadog first received the event (milliseconds since Unix epoch). May differ from timestamp if there was an ingestion delay.                        |
| Tiebreaker                            | tiebreaker                                          | core            | int64         | A value used to establish deterministic ordering among events that share the same timestamp.                                                                      |
| Ingest Size                           | ingest_size_in_bytes                                | core            | int64         | The size of the event payload in bytes at the time of ingestion, before any processing.                                                                           |
| Random Draw                           | random_draw                                         | core            | float64       | A random value between 0.0 and 1.0 assigned at ingestion, useful for consistent sampling across queries.                                                          |
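
Several fields in the table above apply only to evaluation events. As a hedged illustration of querying them, the sketch below selects `event_type => 'evaluation'` and counts evaluation records per label and metric type. It assumes these fields are requested with the same `columns` syntax as span fields and that the result can be grouped like a normal table. The alias `evaluation_count` is hypothetical.

```sql
-- Sketch: count evaluation events by label and metric type.
-- Assumption: evaluation-only fields use the same `columns` syntax as span fields.
SELECT
  label,
  eval_metric_type,
  COUNT(*) AS evaluation_count
FROM dd.llm_observability(
  columns => ARRAY[
    '@label',
    '@eval_metric_type',
    '@span_id'
  ],
  event_type => 'evaluation'
) AS (
  label VARCHAR,
  eval_metric_type VARCHAR,
  span_id VARCHAR
)
GROUP BY label, eval_metric_type
ORDER BY evaluation_count DESC;
```

The `@span_id` column is included so the same call can be reused without aggregation to trace each evaluation back to the span it scored.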
