---
title: Cost
description: Monitor your LLM tokens and costs.
---

# Cost

{% callout %}
# Important note for users on the following Datadog sites: app.ddog-gov.com, us2.ddog-gov.com

{% alert level="danger" %}
This product is not supported for your selected [Datadog site](https://docs.datadoghq.com/getting_started/site.md). ({% placeholder "user-datadog-site-name" /%}).
{% /alert %}

{% /callout %}

{% image
   source="https://docs.dd-static.net/images/llm_observability/Cost_LLMO.642ee5a44d5f9d78be42ca7c253848bc.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/Cost_LLMO.642ee5a44d5f9d78be42ca7c253848bc.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="Cost view for an app in Agent Observability." /%}

Agent Observability automatically calculates an estimated cost for each LLM request, using providers' public pricing models and token counts annotated on LLM/embedding spans.

By aggregating this information across traces and applications, you can gain insight into the usage patterns of your LLMs and their impact on overall spending.

Use cases:

- See and understand where LLM spend is coming from, at the model, request, and application levels
- Track changes in token usage and cost over time to proactively guard against higher bills in the future
- Correlate LLM cost with overall application performance, model versions, model providers, and prompt details in a single view

## Setting up cost monitoring{% #setting-up-cost-monitoring %}

Datadog provides two ways to monitor your LLM costs:

- Automatic: Use supported LLM providers' public pricing rates.
- Manual: For custom pricing rates, self-hosted models, or unsupported providers, manually supply your own cost values.

### Automatic{% #automatic %}

If your LLM requests involve any of the [supported providers](#supported-providers), Datadog automatically calculates the cost of each request based on the following:

- Token counts attached to the LLM/embedding span, provided by either [auto-instrumentation](https://docs.datadoghq.com/llm_observability/instrumentation/auto_instrumentation.md) or manual user annotation
- The model provider's public pricing rates

### Manual{% #manual %}

To manually supply cost information, follow the instrumentation steps described in the [SDK Reference](https://docs.datadoghq.com/llm_observability/instrumentation/sdk.md?tab=python#monitoring-costs) or [API](https://docs.datadoghq.com/llm_observability/instrumentation/api.md#metrics). When you set costs manually (for example, by setting `total_cost`), provide the values in dollars. Datadog stores them as nanodollars.
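As a quick illustration of the unit relationship, one dollar equals 1,000,000,000 nanodollars. The helper names in this sketch are hypothetical and are not part of the Datadog SDK; they only show the arithmetic between the dollar values you supply and the nanodollar values that cost metrics report.

```python
from decimal import Decimal

NANODOLLARS_PER_DOLLAR = 1_000_000_000  # 1 dollar = 10^9 nanodollars


def dollars_to_nanodollars(dollars: str) -> int:
    """Convert a dollar amount (passed as a decimal string) to nanodollars.

    Any fraction smaller than one nanodollar is truncated.
    """
    return int(Decimal(dollars) * NANODOLLARS_PER_DOLLAR)


def nanodollars_to_dollars(nanodollars: int) -> Decimal:
    """Convert a nanodollar value, such as a cost metric value, to dollars."""
    return Decimal(nanodollars) / NANODOLLARS_PER_DOLLAR


# A request that cost $0.0025 corresponds to 2,500,000 nanodollars.
print(dollars_to_nanodollars("0.0025"))
print(nanodollars_to_dollars(2_500_000))
```

`Decimal` is used here instead of `float` so that small dollar amounts convert without binary rounding error.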

{% alert level="info" %}
If you provide partial cost information, Datadog tries to estimate missing information. For example, if you supply a total cost but not input/output cost values, Datadog uses provider pricing and token values annotated on your span to compute the input/output cost values. This can cause a mismatch between your manually provided total cost and the sum of Datadog's computed input/output costs. Datadog always displays your provided total cost as-is, even if these values differ.
{% /alert %}

#### How token counts are calculated{% #how-token-counts-are-calculated %}

The following relationships apply to token usage fields:

{% image
   source="https://docs.dd-static.net/images/llm_observability/llm_token_relationships.e96c861b2cbf856a124bc386db569795.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/llm_token_relationships.e96c861b2cbf856a124bc386db569795.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="Diagram showing how Agent Observability computes token counts: total_tokens is the sum of input_tokens and output_tokens; input_tokens is the sum of non_cached_input_tokens, cache_read_input_tokens, and cache_write_input_tokens; reasoning_output_tokens is a subset of output_tokens." /%}



When all sub-components of a token field are provided, Datadog automatically calculates the parent token field, so it does not need to be sent separately.

For example:

- `input_tokens` can be inferred from `non_cached_input_tokens`, `cache_read_input_tokens`, and `cache_write_input_tokens`
- `total_tokens` can be inferred from `input_tokens` and `output_tokens`

Datadog recommends providing the full cache token breakdown whenever cache usage information is available. Datadog automatically applies provider-specific cache read and cache write pricing rates when calculating cached input costs.

If only `input_tokens` is provided, Datadog treats all input tokens as non-cached input tokens. Providing only a partial cache breakdown may result in inaccurate token counts or cost discrepancies.
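The relationships above can be sketched in a few lines of Python. This is an illustrative model only, not Datadog's implementation: the function name `infer_token_fields` is hypothetical, and the sketch deliberately leaves partial cache breakdowns untouched because the resulting values are ambiguous.

```python
from typing import Dict

CACHE_PARTS = (
    "non_cached_input_tokens",
    "cache_read_input_tokens",
    "cache_write_input_tokens",
)


def infer_token_fields(metrics: Dict[str, int]) -> Dict[str, int]:
    """Fill in parent token fields from their sub-components when possible."""
    result = dict(metrics)
    has_all_parts = all(part in result for part in CACHE_PARTS)
    has_any_part = any(part in result for part in CACHE_PARTS)

    if "input_tokens" not in result and has_all_parts:
        # input_tokens is the sum of the three cache-related components.
        result["input_tokens"] = sum(result[part] for part in CACHE_PARTS)
    elif "input_tokens" in result and not has_any_part:
        # Only the aggregate was sent: treat every input token as non-cached.
        result["non_cached_input_tokens"] = result["input_tokens"]

    if (
        "total_tokens" not in result
        and "input_tokens" in result
        and "output_tokens" in result
    ):
        # total_tokens is the sum of input_tokens and output_tokens.
        result["total_tokens"] = result["input_tokens"] + result["output_tokens"]
    return result


full_breakdown = infer_token_fields({
    "non_cached_input_tokens": 100,
    "cache_read_input_tokens": 400,
    "cache_write_input_tokens": 50,
    "output_tokens": 200,
})
print(full_breakdown["input_tokens"], full_breakdown["total_tokens"])
```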

## Supported providers{% #supported-providers %}

Datadog automatically calculates the cost of LLM requests made to the following supported providers using the publicly available pricing information from their official documentation.

{% alert level="info" %}
Datadog only supports monitoring costs for text-based models.
{% /alert %}

Datadog supports estimated costs for [800+ models](https://github.com/pydantic/genai-prices/tree/main?tab=readme-ov-file#providers), including models from OpenAI, Hugging Face, Gemini, and Anthropic, as well as models served by OpenRouter.

## Metrics{% #metrics %}

You can find cost metrics in [Agent Observability Metrics](https://docs.datadoghq.com/llm_observability/monitoring/metrics.md#llm-cost-metrics). The unit for Agent Observability estimated cost metrics is **nanodollars**.

The cost metrics include a `source` tag to indicate where the value originated:

- `source:auto` — automatically calculated
- `source:user` — manually provided

## Custom tags on cost and tokens metrics{% #custom-tags-on-cost-and-tokens-metrics %}

Datadog's LLM cost and token metrics ship with out-of-the-box (OOTB) tags such as `model_name`, `model_provider`, and `ml_app`. To break down LLM spend by attributes specific to your application (for example, team, customer tier, or feature), you can promote a subset of your span tags to custom tags on the generated cost and token metrics.

{% alert level="info" %}
Custom tags apply only to LLM **cost** (`ml_obs.span.*.cost`) and **tokens** (`ml_obs.span.*.tokens`) metrics. Other span metrics (such as `ml_obs.span.duration`) are not affected.
{% /alert %}

### How it works{% #how-it-works %}

1. Tag your LLM spans with the attributes you want to break spend down by, using the SDK's `tags` parameter (or auto-instrumentation).
1. In the same annotation, mark which of those tag keys should propagate to cost metrics by passing the keys to the `cost_tags` (Python) or `costTags` (Node.js) parameter. See the [SDK Reference](https://docs.datadoghq.com/llm_observability/instrumentation/sdk.md#adding-custom-tags-to-cost-and-tokens-metrics) for the full instrumentation syntax and timing rules.
1. Once propagated, those tag keys appear as tags on the LLM cost and token metrics. You can use them anywhere in Datadog that consumes these metrics, including dashboards, monitors, notebooks, and trace search.
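To make the promotion step concrete, the following sketch models its effect in plain Python. It does not call the Datadog SDK: `promote_cost_tags` is a hypothetical helper, and the tag values are invented for illustration. Refer to the SDK Reference for the real `tags` and `cost_tags` syntax.

```python
from typing import Dict, Iterable


def promote_cost_tags(span_tags: Dict[str, str], cost_tags: Iterable[str]) -> Dict[str, str]:
    """Return the subset of span tags whose keys were marked for promotion."""
    wanted = set(cost_tags)
    return {key: value for key, value in span_tags.items() if key in wanted}


# All three tags are set on the span, but only two keys are promoted.
span_tags = {
    "team": "search",
    "customer_tier": "enterprise",
    "request_id": "req-8f3a",  # high cardinality, so it is left off the metrics
}
metric_tags = promote_cost_tags(span_tags, cost_tags=["team", "customer_tier"])
print(metric_tags)
```

In this model, `request_id` stays on the span for trace search but never reaches the cost and token metrics, which keeps metric cardinality bounded.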

{% alert level="warning" %}
Only configure tags with bounded values (such as `team`, `customer_tier`, or `feature`). Tags with unbounded or high-cardinality values (such as user IDs or request IDs) are not fully supported and may be truncated or omitted from the resulting metrics.
{% /alert %}

### Use case: Filter and group spend by an application attribute{% #use-case-filter-and-group-spend-by-an-application-attribute %}

Use custom tags to break down spend by attributes that aren't part of the OOTB metric tags. Build a dashboard widget of `ml_obs.span.llm.total.cost` grouped by `team` to see which teams' LLM usage is driving consumption over time, and configure a metric monitor on the same query to alert independently for each team (or customer tier, feature, environment) when usage crosses a threshold.
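The grouping and per-group alerting described above can be approximated locally as follows. This is a sketch of the logic only, assuming cost points that carry a `team` tag value and a cost in nanodollars; the function names, sample values, and threshold are hypothetical, and in practice the dashboard widget and monitor perform this aggregation for you.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def cost_by_tag(points: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Sum cost (in nanodollars) for each tag value, like a group-by query."""
    totals: Dict[str, int] = defaultdict(int)
    for tag_value, cost_nanodollars in points:
        totals[tag_value] += cost_nanodollars
    return dict(totals)


def over_threshold(totals: Dict[str, int], threshold_nanodollars: int) -> List[str]:
    """Return the tag values whose summed cost exceeds the threshold."""
    return sorted(tag for tag, cost in totals.items() if cost > threshold_nanodollars)


# (team, cost in nanodollars) pairs for three LLM calls.
points = [("search", 4_000_000), ("support", 1_500_000), ("search", 2_500_000)]
totals = cost_by_tag(points)
print(totals)
print(over_threshold(totals, threshold_nanodollars=5_000_000))
```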

### Use case: Track prompt caching effectiveness{% #use-case-track-prompt-caching-effectiveness %}

Many LLM providers cache reused prompt prefixes to reduce repeated processing. Tagging spans with a prompt identifier such as `prompt_version` or `prompt_template_id`, and breaking down token metrics by that tag, allows you to:

- Compare prompt variants by how effectively they reuse provider caching. For example, track the cache hit rate with the formula `ml_obs.span.llm.input.cache_read.tokens / ml_obs.span.llm.input.cache_write.tokens`.
- Identify prompts that populate the cache but rarely reuse it, which is usually a sign of unstable prefixes or aggressive invalidation.
- Spot prompts that miss caching entirely because the prefix is too short to qualify or changes with every call.
- Alert on caching regressions in token metrics before they result in increased costs.
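As a rough illustration of the formula above, the sketch below computes the read-to-write ratio per prompt version. The token counts and version names are invented, the function name is hypothetical, and returning `None` when there are no cache writes is an assumption made here to avoid dividing by zero.

```python
from typing import Dict, Optional


def cache_read_write_ratio(cache_read_tokens: int, cache_write_tokens: int) -> Optional[float]:
    """Return cache read tokens divided by cache write tokens.

    Returns None when nothing was written to the cache, because the
    ratio is undefined in that case.
    """
    if cache_write_tokens == 0:
        return None
    return cache_read_tokens / cache_write_tokens


# Summed token counts per prompt_version tag value (sample data).
usage = {
    "v1": {"cache_read": 9000, "cache_write": 1000},  # cache is reused often
    "v2": {"cache_read": 500, "cache_write": 1000},   # populated, rarely reused
    "v3": {"cache_read": 0, "cache_write": 0},        # never qualifies for caching
}

ratios: Dict[str, Optional[float]] = {
    version: cache_read_write_ratio(counts["cache_read"], counts["cache_write"])
    for version, counts in usage.items()
}
print(ratios)
```

A low ratio such as the one for `v2` matches the "populates the cache but rarely reuses it" pattern, while `v3` matches a prompt that misses caching entirely.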

{% image
   source="https://docs.dd-static.net/images/llm_observability/cost_tags_cache.e3917451a2e817a7435e5477c1764b35.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/cost_tags_cache.e3917451a2e817a7435e5477c1764b35.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="Dashboard comparing cache read, cache write, and non-cached input tokens grouped by a custom prompt_version tag." /%}

## View costs in Agent Observability{% #view-costs-in-agent-observability %}

View your app in Agent Observability and select Cost on the left. The *Cost view* features:

- A high-level overview of your LLM usage over time including Total Cost, Cost Change, Total Tokens, and Token Change
- Breakdown by Token Type: A breakdown of token usage, along with associated costs
- Breakdown by Provider/Model or Prompt ID/Version: Cost and token usage broken down by LLM provider and model, or by individual prompts or prompt versions (powered by [Prompt Tracking](https://docs.datadoghq.com/llm_observability/monitoring/prompt_tracking.md))
- Most Expensive LLM Calls: A list of your most expensive requests

{% image
   source="https://docs.dd-static.net/images/llm_observability/cost_tracking_trace.4e2dbd697bee43eca51ea91d3e35a5ba.png?auto=format&fit=max&w=850 1x, https://docs.dd-static.net/images/llm_observability/cost_tracking_trace.4e2dbd697bee43eca51ea91d3e35a5ba.png?auto=format&fit=max&w=850&dpr=2 2x"
   alt="Cost data in trace detail." /%}

Cost data is also available within your application's traces and spans, allowing you to understand cost at both the request (trace) and operation (span) levels. Click any trace or span to open a detailed side panel that includes cost metrics for the full trace and for each individual LLM call. At the top of the trace view, the banner shows aggregated cost information for the full trace, including estimated cost and total tokens. Hover over these values to see a breakdown of input and output tokens and costs.

Selecting an individual LLM span shows similar cost metrics specific to that LLM request.

To query cost-related data on the Traces page, use the Cost facets in the left sidebar.

Alternatively, query the following span attributes directly:

- `@metrics.input_tokens` / `@metrics.estimated_input_cost`
- `@metrics.output_tokens` / `@metrics.estimated_output_cost`
- `@metrics.total_tokens` / `@metrics.estimated_total_cost`
- `@metrics.non_cached_input_tokens` / `@metrics.estimated_non_cached_input_cost`
- `@metrics.cache_read_input_tokens` / `@metrics.estimated_cache_read_input_cost`
- `@metrics.cache_write_input_tokens` / `@metrics.estimated_cache_write_input_cost`
- `@metrics.reasoning_output_tokens` / `@metrics.estimated_reasoning_output_cost`

## Troubleshoot LLM cost monitoring{% #troubleshoot-llm-cost-monitoring %}

If LLM cost values are missing from a trace or appear inaccurate, see the following sections for common causes and fixes.

### Trace shows "PARTIAL COST" or "COST UNAVAILABLE"{% #trace-shows-partial-cost-or-cost-unavailable %}

Each LLM span in a trace is priced independently. The warning appears when one or more LLM spans in the trace are missing a cost estimate:

- PARTIAL COST: Datadog computed costs for some LLM spans in the trace, but not the rest. The Estimated Cost value reflects the sum of only the spans for which cost was computed.
- COST UNAVAILABLE: Datadog could not compute cost for any LLM span in the trace.
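The two warning states can be modeled with a small function. This is a sketch of the rule as described above, not Datadog's code: the function name and the `"OK"` label for a fully priced trace are assumptions, and `None` stands in for an LLM span without a cost estimate.

```python
from typing import List, Optional, Tuple


def trace_cost_status(llm_span_costs: List[Optional[int]]) -> Tuple[str, int]:
    """Return (status, summed cost) for the LLM spans of one trace.

    Each entry is a span's estimated cost in nanodollars, or None when
    no cost could be computed for that span.
    """
    computed = [cost for cost in llm_span_costs if cost is not None]
    total = sum(computed)  # only spans with a computed cost contribute

    if len(computed) == len(llm_span_costs):
        return "OK", total
    if computed:
        return "PARTIAL COST", total
    return "COST UNAVAILABLE", 0


print(trace_cost_status([1000, None, 2500]))
print(trace_cost_status([None, None]))
```

In the first call, the estimated cost covers only the two priced spans, which is why a trace marked PARTIAL COST can understate the real spend.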

Hover over the warning to see the number of affected spans, the reason cost is missing, and the names of the affected spans. The reasons fall into one of the following categories.

#### Unsupported model provider{% #unsupported-model-provider %}

The model provider on the span is not in the list of supported providers, so Datadog has no public pricing rates to apply.

How to fix:

- Verify the span's `model_provider` matches a supported provider identifier. See the list of [supported providers](#supported-providers).
- Manually supply cost values on the span. See the [SDK Reference](https://docs.datadoghq.com/llm_observability/instrumentation/sdk.md?tab=python#monitoring-costs) or [API](https://docs.datadoghq.com/llm_observability/instrumentation/api.md#metrics).

#### Unsupported model name{% #unsupported-model-name %}

The LLM provider is supported, but the `model_name` on the span does not match any entry in the provider's [pricing catalog](https://github.com/pydantic/genai-prices/tree/main?tab=readme-ov-file#providers). This usually happens when the model name is misspelled, abbreviated, or uses an internal alias.

How to fix:

- Update the span's `model_name` to match the official name in the provider's [pricing catalog](https://github.com/pydantic/genai-prices/tree/main?tab=readme-ov-file#providers).
- For privately deployed or customized variants of a supported model, manually supply cost values on the span. See the [SDK Reference](https://docs.datadoghq.com/llm_observability/instrumentation/sdk.md?tab=python#monitoring-costs) or [API](https://docs.datadoghq.com/llm_observability/instrumentation/api.md#metrics).

#### Missing token counts{% #missing-token-counts %}

The span has no `input_tokens` or `output_tokens` annotation, so Datadog has no token values for cost calculation. (For embedding spans, only `input_tokens` is required.)

How to fix:

- For [auto-instrumentation](https://docs.datadoghq.com/llm_observability/instrumentation/auto_instrumentation.md), confirm your provider and SDK version are supported.
- For manual instrumentation, annotate input and output token counts on the span. See the [SDK Reference](https://docs.datadoghq.com/llm_observability/instrumentation/sdk.md?tab=python#monitoring-costs) or [API](https://docs.datadoghq.com/llm_observability/instrumentation/api.md#metrics).

### Inaccurate cost when prompt caching is enabled{% #inaccurate-cost-when-prompt-caching-is-enabled %}

If a span includes only aggregate `input_tokens` and `output_tokens`, but is missing `cache_read_input_tokens`, `cache_write_input_tokens`, or `non_cached_input_tokens`, Datadog applies the standard input rate to the entire `input_tokens` count.

For providers that support prompt caching, cache reads, cache writes, and non-cached inputs are billed at different rates. Without this breakdown, the input cost calculation may be inaccurate.
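The size of the discrepancy can be seen in a small worked example. The rates below are hypothetical and do not reflect any provider's actual pricing; they are expressed in nanodollars per token so the arithmetic stays in integers.

```python
# Hypothetical per-token rates in nanodollars (not real provider pricing).
INPUT_RATE = 3000        # standard (non-cached) input
CACHE_READ_RATE = 300    # cache reads are cheaper in this example
CACHE_WRITE_RATE = 3750  # cache writes are more expensive in this example


def input_cost_without_breakdown(input_tokens: int) -> int:
    """Price every input token at the standard input rate."""
    return input_tokens * INPUT_RATE


def input_cost_with_breakdown(non_cached: int, cache_read: int, cache_write: int) -> int:
    """Price each input token category at its own rate."""
    return (
        non_cached * INPUT_RATE
        + cache_read * CACHE_READ_RATE
        + cache_write * CACHE_WRITE_RATE
    )


# The same request: 10,000 input tokens, most of them served from cache.
aggregate_only = input_cost_without_breakdown(10_000)
with_breakdown = input_cost_with_breakdown(non_cached=1_000, cache_read=8_000, cache_write=1_000)

print(aggregate_only)  # nanodollars when only input_tokens is sent
print(with_breakdown)  # nanodollars when the full cache breakdown is sent
```

Under these assumed rates, sending only the aggregate prices the request at 30,000,000 nanodollars, while the full breakdown prices it at 9,150,000 nanodollars.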

How to fix:

- For [auto-instrumentation](https://docs.datadoghq.com/llm_observability/instrumentation/auto_instrumentation.md), verify that your provider and SDK version emit cache-specific token counts.
- For manual instrumentation, annotate `cache_read_input_tokens`, `cache_write_input_tokens`, and `non_cached_input_tokens` in addition to `input_tokens`. See the [SDK Reference](https://docs.datadoghq.com/llm_observability/instrumentation/sdk.md?tab=python#monitoring-costs) or [API](https://docs.datadoghq.com/llm_observability/instrumentation/api.md#metrics).
