---
title: HTTP API Reference
description: Datadog, the leading service for cloud-scale monitoring.
breadcrumbs: >-
  Docs > LLM Observability > LLM Observability Instrumentation > HTTP API
  Reference
---

# HTTP API Reference

{% callout %}
# Important note for users on the following Datadog sites: app.ddog-gov.com

{% alert level="danger" %}
This product is not supported for your selected [Datadog site](https://docs.datadoghq.com/getting_started/site.md).
{% /alert %}

{% /callout %}

## Overview{% #overview %}

The LLM Observability HTTP API provides an interface for developers to send LLM-related traces and spans to Datadog. If your application is written in Python, Node.js, or Java, you can use the [LLM Observability SDKs](https://docs.datadoghq.com/llm_observability/setup/sdk.md) instead.

The API accepts spans with timestamps no more than 24 hours old, allowing limited backfill of delayed data.

## Spans API{% #spans-api %}

Use this endpoint to send spans to Datadog. For details on the available kinds of spans, see [Span Kinds](https://docs.datadoghq.com/llm_observability/terms.md).

{% dl %}

{% dt %}
Endpoint
{% /dt %}

{% dd %}
`https://api.<YOUR_DATADOG_SITE>/api/intake/llm-obs/v1/trace/spans`
{% /dd %}

{% dt %}
Method
{% /dt %}

{% dd %}
`POST`
{% /dd %}

{% /dl %}

### Request{% #request %}

#### Headers (required){% #headers-required %}

- `DD-API-KEY=<YOUR_DATADOG_API_KEY>`
- `Content-Type="application/json"`

#### Body data (required){% #body-data-required %}

{% tab title="Model" %}

| Field             | Type             | Description                        |
| ----------------- | ---------------- | ---------------------------------- |
| data [*required*] | SpansRequestData | Entry point into the request body. |

{% /tab %}

{% tab title="Example" %}

```json
{
  "data": {
    "type": "span",
    "attributes": {
      "ml_app": "weather-bot",
      "session_id": "1",
      "tags": [
        "service:weather-bot",
        "env:staging",
        "user_handle:example-user@example.com",
        "user_id:1234"
      ],
      "spans": [
        {
          "parent_id": "undefined",
          "trace_id": "<TEST_TRACE_ID>",
          "span_id": "<AGENT_SPAN_ID>",
          "name": "health_coach_agent",
          "meta": {
            "kind": "agent",
            "input": {
              "value": "What is the weather like today and do i wear a jacket?"
            },
            "output": {
              "value": "It's very hot and sunny, there is no need for a jacket"
            }
          },
          "start_ns": 1713889389104152000,
          "duration": 10000000000
        },
        {
          "parent_id": "<AGENT_SPAN_ID>",
          "trace_id": "<TEST_TRACE_ID>",
          "span_id": "<WORKFLOW_SPAN_ID>",
          "name": "qa_workflow",
          "meta": {
            "kind": "workflow",
            "input": {
              "value": "What is the weather like today and do i wear a jacket?"
            },
            "output": {
              "value": "It's very hot and sunny, there is no need for a jacket"
            }
          },
          "start_ns": 1713889389104152000,
          "duration": 5000000000
        },
        {
          "parent_id": "<WORKFLOW_SPAN_ID>",
          "trace_id": "<TEST_TRACE_ID>",
          "span_id": "<LLM_SPAN_ID>",
          "name": "generate_response",
          "meta": {
            "kind": "llm",
            "input": {
              "messages": [
                {
                  "role": "system",
                  "content": "Your role is to ..."
                },
                {
                  "role": "user",
                  "content": "What is the weather like today and do i wear a jacket?"
                }
              ]
            },
            "output": {
              "messages": [
                {
                  "content": "It's very hot and sunny, there is no need for a jacket",
                  "role": "assistant"
                }
              ]
            }
          },
          "start_ns": 1713889389104152000,
          "duration": 2000000000
        }
      ]
    }
  }
}
```

{% /tab %}
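The request body above can be assembled programmatically. The following is a minimal Python sketch of building and serializing a single-span payload; the helper name and random ID scheme are illustrative, not part of the API:

```python
import json
import time
import uuid

def build_span_payload(ml_app, name, kind, input_value, output_value):
    """Assemble a minimal single-span request body (illustrative sketch)."""
    # IDs must be strings unique to the trace and span; any scheme works.
    trace_id = str(uuid.uuid4().int % 10**19)
    span_id = str(uuid.uuid4().int % 10**19)
    return {
        "data": {
            "type": "span",
            "attributes": {
                "ml_app": ml_app,
                "spans": [{
                    "parent_id": "undefined",  # root spans use "undefined"
                    "trace_id": trace_id,
                    "span_id": span_id,
                    "name": name,
                    "meta": {
                        "kind": kind,
                        "input": {"value": input_value},
                        "output": {"value": output_value},
                    },
                    "start_ns": time.time_ns() - 2_000_000_000,
                    "duration": 2_000_000_000,  # 2 seconds, in nanoseconds
                }],
            },
        }
    }

payload = build_span_payload(
    "weather-bot", "generate_response", "llm",
    "What is the weather like today?", "It's hot and sunny.",
)
body = json.dumps(payload)
# POST `body` to the spans endpoint with the DD-API-KEY and
# Content-Type headers described above (for example, via urllib.request).
```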

### Response{% #response %}

If the request is successful, the API responds with a `202` status code and an empty body.

### API standards{% #api-standards %}

#### Error{% #error %}

| Field   | Type   | Description        |
| ------- | ------ | ------------------ |
| message | string | The error message. |
| stack   | string | The stack trace.   |
| type    | string | The error type.    |

#### IO{% #io %}

| Field      | Type                      | Description                                                                                                                                     |
| ---------- | ------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------- |
| value      | string                    | Input or output value. If not set, this value is inferred from messages or documents.                                                           |
| messages   | [Message]                 | List of messages. Use only for LLM spans.                                                                                                       |
| documents  | [Document]                | List of documents. Use only as output for retrieval spans.                                                                                      |
| prompt     | Prompt                    | Structured prompt metadata that includes the template and variables used for the LLM input. This should only be used for input IO on LLM spans. |
| embedding  | [float]                   | List of embedding values.                                                                                                                       |
| parameters | Dict[key (string), value] | Additional parameters for the input or output.                                                                                                  |

**Note**: When only `input.messages` is set for an LLM span, Datadog infers `input.value` from `input.messages` and uses the following inference logic:

1. If a message with `role=user` exists, the content of the last message is used as `input.value`.
1. If a `user` role message is not present, `input.value` is inferred by concatenating the content fields of all messages, regardless of their roles.
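The inference logic above can be sketched in Python. Note that the exact separator used when concatenating message contents is not specified here; this sketch joins with a space:

```python
def infer_input_value(messages):
    """Mirror of the documented input.value inference (illustrative helper)."""
    user_messages = [m for m in messages if m.get("role") == "user"]
    if user_messages:
        # Rule 1: use the content of the last user-role message.
        return user_messages[-1]["content"]
    # Rule 2: concatenate all message contents, regardless of role.
    return " ".join(m["content"] for m in messages)

messages = [
    {"role": "system", "content": "Your role is to ..."},
    {"role": "user", "content": "What is the weather like today?"},
]
```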

#### Message{% #message %}

| Field                | Type         | Description                                     |
| -------------------- | ------------ | ----------------------------------------------- |
| content [*required*] | string       | The body of the message.                        |
| role                 | string       | The role of the entity.                         |
| tool_calls           | [ToolCall]   | List of tool calls made in this message.        |
| tool_results         | [ToolResult] | List of tool execution results in this message. |

#### Document{% #document %}

| Field    | Type                      | Description                              |
| -------- | ------------------------- | ---------------------------------------- |
| text     | string                    | The text of the document.                |
| name     | string                    | The name of the document.                |
| score    | float                     | The score associated with this document. |
| id       | string                    | The id of this document.                 |
| ranking  | integer                   | The ranking of this document.            |
| metadata | Dict[key (string), value] | Additional metadata for this document.   |

#### ToolCall{% #toolcall %}

| Field     | Type                      | Description                           |
| --------- | ------------------------- | ------------------------------------- |
| name      | string                    | The name of the tool being called.    |
| arguments | Dict[key (string), value] | The arguments passed to the tool.     |
| tool_id   | string                    | Unique identifier for this tool call. |
| type      | string                    | The type of tool call.                |

#### ToolResult{% #toolresult %}

| Field   | Type   | Description                                             |
| ------- | ------ | ------------------------------------------------------- |
| name    | string | The name of the tool that was called.                   |
| result  | string | The result returned by the tool.                        |
| tool_id | string | Unique identifier matching the corresponding tool call. |
| type    | string | The type of tool result.                                |

#### ToolDefinition{% #tooldefinition %}

| Field       | Type                      | Description                                |
| ----------- | ------------------------- | ------------------------------------------ |
| name        | string                    | The name of the tool.                      |
| description | string                    | A description of what the tool does.       |
| schema      | Dict[key (string), value] | The schema defining the tool's parameters. |

#### SpanField{% #spanfield %}

| Field | Type   | Description             |
| ----- | ------ | ----------------------- |
| kind  | string | The kind of span field. |

#### Prompt{% #prompt %}

{% alert level="info" %}
LLM Observability registers new versions of templates when the `template` or `chat_template` value is updated. If the input is expected to change between invocations, extract the dynamic parts into a variable.
{% /alert %}

{% tab title="Model" %}

| Field                 | Type                       | Description                                                                                                                                                          |
| --------------------- | -------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| id                    | string                     | Logical identifier for this prompt template. Should be unique per `ml_app`.                                                                                          |
| name                  | string                     | Human-readable name for the prompt.                                                                                                                                  |
| version               | string                     | Version tag for the prompt (for example, "1.0.0"). If not provided, LLM Observability automatically generates a version by computing a hash of the template content. |
| template              | string                     | Single string template form. Use placeholder syntax (like `{{variable_name}}`) to embed variables. This should not be set with `chat_template`.                      |
| chat_template         | [Message]                  | Multi-message template form. Use placeholder syntax (like `{{variable_name}}`) to embed variables in message content. This should not be set with `template`.        |
| variables             | Dict[key (string), string] | Variables used to render the template. Keys correspond to placeholder names in the template.                                                                         |
| query_variable_keys   | [string]                   | Variable keys that contain the user query. Used for hallucination detection.                                                                                         |
| context_variable_keys | [string]                   | Variable keys that contain ground-truth or context content. Used for hallucination detection.                                                                        |
| tags                  | Dict[key (string), string] | Tags to attach to the prompt run.                                                                                                                                    |

{% /tab %}

{% tab title="Example" %}

```json
{
  "id": "translation-prompt",
  "chat_template": [
    {
      "role": "system",
      "content": "You are a translation service. You translate to {{language}}."
    }, {
      "role": "user",
      "content": "{{user_input}}"
    }
  ],
  "variables": {
    "language": "french",
    "user_input": "<USER_INPUT_TEXT>"
  }
}
```

{% /tab %}

#### Meta{% #meta %}

| Field                    | Type                                                                  | Description                                                                                                                                                              |
| ------------------------ | --------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| kind [*required*]        | string                                                                | The [span kind](https://docs.datadoghq.com/llm_observability/terms.md): `"agent"`, `"workflow"`, `"llm"`, `"tool"`, `"task"`, `"embedding"`, or `"retrieval"`.           |
| error                    | Error                                                                 | Error information on the span.                                                                                                                                           |
| input                    | IO                                                                    | The span's input information.                                                                                                                                            |
| output                   | IO                                                                    | The span's output information.                                                                                                                                           |
| metadata                 | Dict[key (string), value] where the value is a float, bool, or string | Data about the span that is not input or output related. Use the following metadata keys for LLM spans: `temperature`, `max_tokens`, `model_name`, and `model_provider`. |
| model_name               | string                                                                | The name of the model used for LLM spans.                                                                                                                                |
| model_provider           | string                                                                | The provider of the model used for LLM spans.                                                                                                                            |
| model_version            | string                                                                | The version of the model used for LLM spans.                                                                                                                             |
| embedding_for_prompt_idx | integer                                                               | The prompt index for which embeddings were computed.                                                                                                                     |
| span                     | SpanField                                                             | Span field information.                                                                                                                                                  |
| tool_definitions         | [ToolDefinition]                                                      | List of available tool definitions.                                                                                                                                      |
| expected_output          | IO                                                                    | The expected output information.                                                                                                                                         |
| intent                   | string                                                                | The intent of the span.                                                                                                                                                  |

#### Metrics{% #metrics %}

A dictionary of metrics to collect for the span. The keys are metric names (strings) and the values are metric values (float64). Common metrics include:

- `input_tokens` - The number of input tokens (LLM spans)
- `output_tokens` - The number of output tokens (LLM spans)
- `total_tokens` - The total number of tokens (LLM spans)
- `non_cached_input_tokens` - The number of non-cached input tokens (LLM spans)
- `cache_read_input_tokens` - The number of cache read input tokens (LLM spans)
- `cache_write_input_tokens` - The number of cache write input tokens (LLM spans)
- `reasoning_output_tokens` - The number of reasoning tokens (LLM spans)
- `time_to_first_token` - Time in seconds for first output token (streaming LLM, root spans)
- `time_per_output_token` - Time in seconds per output token (streaming LLM, root spans)
- `input_cost` - Input cost in dollars (LLM and embedding spans)
- `output_cost` - Output cost in dollars (LLM spans)
- `total_cost` - Total cost in dollars (LLM spans)
- `non_cached_input_cost` - Non-cached input cost in dollars (LLM spans)
- `cache_read_input_cost` - Cache read input cost in dollars (LLM spans)
- `cache_write_input_cost` - Cache write input cost in dollars (LLM spans)
- `reasoning_output_cost` - Reasoning output cost in dollars (LLM spans)

Type: `Dict[key (string), float64]`
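For example, a metrics dictionary attached to an LLM span might look like the following (the values are made up for illustration):

```python
# Illustrative metrics dict for an LLM span, keyed by the metric names above.
# total_tokens is typically the sum of input_tokens and output_tokens.
metrics = {
    "input_tokens": 52.0,
    "output_tokens": 13.0,
    "total_tokens": 65.0,
    "time_to_first_token": 0.42,  # seconds
}
```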

#### Span{% #span %}

| Field                  | Type                        | Description                                                                                      |
| ---------------------- | --------------------------- | ------------------------------------------------------------------------------------------------ |
| name [*required*]      | string                      | The name of the span.                                                                            |
| span_id [*required*]   | string                      | An ID unique to the span.                                                                        |
| trace_id [*required*]  | string                      | A unique ID shared by all spans in the same trace.                                               |
| parent_id [*required*] | string                      | ID of the span's direct parent. If the span is a root span, the `parent_id` must be `undefined`. |
| start_ns [*required*]  | uint64                      | The span's start time in nanoseconds.                                                            |
| duration [*required*]  | float64                     | The span's duration in nanoseconds.                                                              |
| meta [*required*]      | Meta                        | The core content relative to the span.                                                           |
| status                 | string                      | Error status (`"ok"` or `"error"`). Defaults to `"ok"`.                                          |
| apm_trace_id           | string                      | The ID of the associated APM trace. Defaults to match the `trace_id` field.                      |
| metrics                | Dict[key (string), float64] | Datadog metrics to collect. See Metrics for common metric names.                                 |
| session_id             | string                      | The span's `session_id`. Overrides the top-level `session_id` field.                             |
| tags                   | [Tag]                       | A list of tags to apply to this particular span.                                                 |
| service                | string                      | The service name.                                                                                |
| ml_app                 | string                      | The LLM application name for this span. Overrides the top-level `ml_app` field.                  |

#### SpansRequestData{% #spansrequestdata %}

| Field                   | Type         | Description                                |
| ----------------------- | ------------ | ------------------------------------------ |
| type [*required*]       | string       | Identifier for the request. Set to `span`. |
| attributes [*required*] | SpansPayload | The body of the request.                   |

#### SpansPayload{% #spanspayload %}

| Field               | Type   | Description                                                                                     |
| ------------------- | ------ | ----------------------------------------------------------------------------------------------- |
| ml_app [*required*] | string | The name of your LLM application. See Application naming guidelines.                            |
| spans [*required*]  | [Span] | A list of spans.                                                                                |
| tags                | [Tag]  | A list of top-level tags to apply to each span.                                                 |
| session_id          | string | The session the list of spans belongs to. Can be overridden or set on individual spans as well. |

#### Tag{% #tag %}

Tags should be formatted as a list of strings (for example, `["user_handle:dog@gmail.com", "app_version:1.0.0"]`). They are meant to store contextual information surrounding the span.

For more information about tags, see [Getting Started with Tags](https://docs.datadoghq.com/getting_started/tagging.md).

#### Application naming guidelines{% #application-naming-guidelines %}

Your application name (the value of `DD_LLMOBS_ML_APP`) must be a lowercase Unicode string. It may contain the characters listed below:

- Alphanumerics
- Underscores
- Minuses
- Colons
- Periods
- Slashes

The name can be up to 193 characters long and may not contain contiguous or trailing underscores.
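A client-side check for these rules could be sketched as follows. The regex is an assumption assembled from the guidelines above (ASCII-only for simplicity), not an official validator:

```python
import re

# Allowed characters: lowercase alphanumerics, underscores, minuses,
# colons, periods, and slashes; length 1-193.
_NAME_RE = re.compile(r"[a-z0-9_\-:./]{1,193}")

def is_valid_ml_app_name(name: str) -> bool:
    """Rough validation of an ml_app name (illustrative, ASCII-only sketch)."""
    return (
        bool(_NAME_RE.fullmatch(name))
        and "__" not in name          # no contiguous underscores
        and not name.endswith("_")    # no trailing underscore
    )
```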

## Evaluations API{% #evaluations-api %}

{% alert level="info" %}
For comprehensive examples and guidance on building custom evaluators, see the [Evaluation Developer Guide](https://docs.datadoghq.com/llm_observability/guide/evaluation_developer_guide.md).
{% /alert %}

Use this endpoint to send evaluations associated with a given span to Datadog.

{% dl %}

{% dt %}
Endpoint
{% /dt %}

{% dd %}
`https://api.<YOUR_DATADOG_SITE>/api/intake/llm-obs/v2/eval-metric`
{% /dd %}

{% dt %}
Method
{% /dt %}

{% dd %}
`POST`
{% /dd %}

{% /dl %}

Evaluations must be joined to a unique span. You can identify the target span using either of these two methods:

1. Tag-based joining - Join an evaluation using a custom tag key-value pair that uniquely identifies a single span.
1. Direct span reference - Join an evaluation using the span's unique trace ID and span ID combination.
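The two join methods produce different `join_on` objects in the request body. A small sketch (the helper names are illustrative):

```python
def join_on_span(span_id: str, trace_id: str) -> dict:
    """Direct span reference: join via the span's trace ID and span ID."""
    return {"span": {"span_id": span_id, "trace_id": trace_id}}

def join_on_tag(key: str, value: str) -> dict:
    """Tag-based joining: the key-value pair must match exactly one span."""
    return {"tag": {"key": key, "value": value}}
```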

### Request{% #request-1 %}

#### Headers (required){% #headers-required-1 %}

- `DD-API-KEY=<YOUR_DATADOG_API_KEY>`
- `Content-Type="application/json"`

#### Body data (required){% #body-data-required-1 %}

{% tab title="Model" %}

| Field             | Type                   | Description                        |
| ----------------- | ---------------------- | ---------------------------------- |
| data [*required*] | EvalMetricsRequestData | Entry point into the request body. |

{% /tab %}

{% tab title="Example" %}

```json
{
  "data": {
    "type": "evaluation_metric",
    "attributes": {
      "metrics": [
        {
          "join_on": {
            "span": {
              "span_id": "20245611112024561111",
              "trace_id": "13932955089405749200"
            }
          },
          "ml_app": "weather-bot",
          "timestamp_ms": 1609459200,
          "metric_type": "categorical",
          "label": "Sentiment",
          "categorical_value": "Positive"
        },
        {
          "join_on": {
            "tag": {
              "key": "msg_id",
              "value": "1123132"
            }
          },
          "ml_app": "weather-bot",
          "timestamp_ms": 1609479200,
          "metric_type": "score",
          "label": "Accuracy",
          "score_value": 3,
          "assessment": "fail",
          "reasoning": "The response provided incorrect information about the weather forecast."
        },
        {
          "join_on": {
            "tag": {
              "key": "msg_id",
              "value": "1123132"
            }
          },
          "ml_app": "weather-bot",
          "timestamp_ms": 1609479200,
          "metric_type": "boolean",
          "label": "Topic Relevancy",
          "boolean_value": true
        },
        {
          "join_on": {
            "tag": {
              "key": "msg_id",
              "value": "1123132"
            }
          },
          "ml_app": "weather-bot",
          "timestamp_ms": 1609479200,
          "metric_type": "json",
          "label": "Custom Evaluation",
          "json_value": {
            "verdict": "pass",
            "confidence": 0.95,
            "is_valid": true,
            "metrics": {
              "accuracy": 0.92,
              "precision": 0.88
            },
            "passed_checks": ["coherence", "relevance", "factuality"]
          }
        }
      ]
    }
  }
}
```

{% /tab %}

### Response{% #response-1 %}

{% tab title="Model" %}

| Field   | Type         | Description                              | Guaranteed |
| ------- | ------------ | ---------------------------------------- | ---------- |
| ID      | string       | Response UUID generated upon submission. | Yes        |
| metrics | [EvalMetric] | A list of evaluations.                   | Yes        |

{% /tab %}

{% tab title="Example" %}

```json
{
  "data": {
    "type": "evaluation_metric",
    "id": "456f4567-e89b-12d3-a456-426655440000",
    "attributes": {
      "metrics": [
        {
          "id": "d4f36434-f0cd-47fc-884d-6996cee26da4",
          "join_on": {
            "span": {
              "span_id": "20245611112024561111",
              "trace_id": "13932955089405749200"
            }
          },
          "ml_app": "weather-bot",
          "timestamp_ms": 1609459200,
          "metric_type": "categorical",
          "label": "Sentiment",
          "categorical_value": "Positive"
        },
        {
          "id": "cdfc4fc7-e2f6-4149-9c35-edc4bbf7b525",
          "join_on": {
            "tag": {
              "key": "msg_id",
              "value": "1123132"
            }
          },
          "span_id": "20245611112024561111",
          "trace_id": "13932955089405749200",
          "ml_app": "weather-bot",
          "timestamp_ms": 1609479200,
          "metric_type": "score",
          "label": "Accuracy",
          "score_value": 3,
          "assessment": "fail",
          "reasoning": "The response provided incorrect information about the weather forecast."
        },
        {
          "id": "haz3fc7-g3p2-1s37-8m12-ndk4hbf7a522",
          "join_on": {
            "tag": {
              "key": "msg_id",
              "value": "1123132"
            }
          },
          "span_id": "20245611112024561111",
          "trace_id": "13932955089405749200",
          "ml_app": "weather-bot",
          "timestamp_ms": 1609479200,
          "metric_type": "boolean",
          "label": "Topic Relevancy",
          "boolean_value": true
        },
        {
          "id": "abc1234-h4i5-6j78-9k01-lmn2opq3rst4",
          "join_on": {
            "tag": {
              "key": "msg_id",
              "value": "1123132"
            }
          },
          "span_id": "20245611112024561111",
          "trace_id": "13932955089405749200",
          "ml_app": "weather-bot",
          "timestamp_ms": 1609479200,
          "metric_type": "json",
          "label": "Custom Evaluation",
          "json_value": {
            "verdict": "pass",
            "confidence": 0.95,
            "is_valid": true,
            "metrics": {
              "accuracy": 0.92,
              "precision": 0.88
            },
            "passed_checks": ["coherence", "relevance", "factuality"]
          }
        }
      ]
    }
  }
}
```

{% /tab %}

### API standards{% #api-standards-1 %}

#### Attributes{% #attributes %}

| Field                | Type         | Description                                                    |
| -------------------- | ------------ | -------------------------------------------------------------- |
| metrics [*required*] | [EvalMetric] | A list of evaluations each associated with a span.             |
| tags                 | [Tag]        | A list of tags to apply to all the evaluations in the payload. |

#### EvalMetric{% #evalmetric %}

| Field                                                               | Type                      | Description                                                                      |
| ------------------------------------------------------------------- | ------------------------- | -------------------------------------------------------------------------------- |
| ID                                                                  | string                    | Evaluation metric UUID (generated upon submission).                              |
| join_on [*required*]                                                | [JoinOn]                  | How the evaluation is joined to a span.                                          |
| timestamp_ms [*required*]                                           | int64                     | A UTC UNIX timestamp in milliseconds representing the time the request was sent. |
| ml_app [*required*]                                                 | string                    | The name of your LLM application. See Application naming guidelines.             |
| metric_type [*required*]                                            | string                    | The type of evaluation: `"categorical"`, `"score"`, `"boolean"`, or `"json"`.    |
| label [*required*]                                                  | string                    | The unique name or label for the provided evaluation.                            |
| categorical_value [*required if the metric\_type is "categorical"*] | string                    | A string representing the category that the evaluation belongs to.               |
| score_value [*required if the metric\_type is "score"*]             | number                    | A score value of the evaluation.                                                 |
| boolean_value [*required if the metric\_type is "boolean"*]         | boolean                   | A boolean value of the evaluation.                                               |
| json_value [*required if the metric\_type is "json"*]               | Dict[key (string), value] | A JSON object value of the evaluation.                                           |
| assessment                                                          | string                    | An assessment of this evaluation. Accepted values are `pass` and `fail`.         |
| reasoning                                                           | string                    | A text explanation of the evaluation result.                                     |
| tags                                                                | [Tag]                     | A list of tags to apply to this particular evaluation metric.                    |

#### JoinOn{% #joinon %}

| Field | Type          | Description                                                                              |
| ----- | ------------- | ---------------------------------------------------------------------------------------- |
| span  | SpanContext   | Uniquely identifies the span associated with this evaluation using span ID & trace ID.   |
| tag   | TagContext    | Uniquely identifies the span associated with this evaluation using a tag key-value pair. |

#### SpanContext{% #spancontext %}

| Field                 | Type   | Description                                                                                                                                                                                                                                                  |
| --------------------- | ------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| span_id [*required*]  | string | The span ID of the span that this evaluation is associated with. Must be a decimal string (for example, `"20245611112024561111"`). If your instrumentation produces hexadecimal span IDs (such as OpenTelemetry), convert them to decimal before submitting. |
| trace_id [*required*] | string | The trace ID of the span that this evaluation is associated with. Must be a decimal string (for example, `"13932955089405749200"`) or a 32-character lowercase hexadecimal string for 128-bit trace IDs.                                                     |
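Converting a hexadecimal OpenTelemetry span ID to the required decimal string is a single base conversion. A minimal sketch (the helper name is illustrative); per the table above, 128-bit trace IDs may instead be submitted as 32-character lowercase hex:

```python
def otel_id_to_decimal(hex_id: str) -> str:
    """Convert a hexadecimal OTel span/trace ID to a decimal string."""
    return str(int(hex_id, 16))

# e.g. otel_id_to_decimal("ff") == "255"
```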

#### TagContext{% #tagcontext %}

| Field              | Type   | Description                                                                                  |
| ------------------ | ------ | -------------------------------------------------------------------------------------------- |
| key [*required*]   | string | The tag key name. This must be the same key used when setting the tag on the span.           |
| value [*required*] | string | The tag value. This value must match exactly one span with the specified tag key/value pair. |

#### EvalMetricsRequestData{% #evalmetricsrequestdata %}

| Field                   | Type         | Description                                             |
| ----------------------- | ------------ | ------------------------------------------------------- |
| type [*required*]       | string       | Identifier for the request. Set to `evaluation_metric`. |
| attributes [*required*] | Attributes   | The body of the request.                                |

## Further Reading{% #further-reading %}

- [Datadog LLM Observability natively supports OpenTelemetry GenAI Semantic Conventions](https://www.datadoghq.com/blog/llm-otel-semantic-convention)
- [Track, compare, and optimize your LLM prompts with Datadog LLM Observability](https://www.datadoghq.com/blog/llm-prompt-tracking)
