Trace an LLM Application

LLM Observability is not available in the US1-FED site.

Overview

Your application can submit data to LLM Observability in two ways: with LLM Observability’s Python SDK, or with the LLM Observability API.

Each request fulfilled by your application is represented as a trace on the LLM Observability traces page in Datadog:

An LLM Observability trace displaying each span of a request

If you’re new to LLM Observability traces, read the Core Concepts before proceeding to decide which instrumentation options best suit your application.

Instrument your LLM application

This guide uses the LLM Observability SDK for Python. If your application is not written in Python, you can complete the steps below with API requests instead of SDK function calls.
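
For orientation, the sketch below shows roughly what submitting a single root span over HTTP looks like, assuming the agentless spans intake endpoint and authentication via a DD-API-KEY header; all IDs and field values are illustrative placeholders, and the exact payload schema should be verified against the LLM Observability API reference.

# A rough sketch of submitting one root span over HTTP instead of using the SDK.
import time
import requests

start_ns = time.time_ns()
payload = {
    "data": {
        "type": "span",
        "attributes": {
            "ml_app": "my-ml-app",
            "spans": [{
                "name": "process_message",
                "span_id": "1",
                "trace_id": "1",
                "parent_id": "undefined",   # "undefined" marks a root span
                "start_ns": start_ns,
                "duration": 10_000_000,     # span duration in nanoseconds
                "meta": {
                    "kind": "workflow",
                    "input": {"value": "<ARGUMENT>"},
                    "output": {"value": "<OUTPUT>"},
                },
            }],
        },
    }
}
response = requests.post(
    "https://api.datadoghq.com/api/intake/llm-obs/v1/trace/spans",
    headers={"DD-API-KEY": "<YOUR_DATADOG_API_KEY>"},
    json=payload,
)
response.raise_for_status()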

Datadog provides auto-instrumentation to capture LLM calls for specific LLM provider libraries. However, manually instrumenting your LLM application using the Python SDK can unlock even more of Datadog’s LLM Observability features.
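
For example, with auto-instrumentation, a call made through a supported provider library is captured without writing any manual spans. The sketch below assumes the OpenAI integration, agentless mode, and DD_API_KEY and DD_SITE set in the environment; the model and prompt are placeholders.

from ddtrace.llmobs import LLMObs

# Enable LLM Observability first, so supported provider integrations are patched.
LLMObs.enable(ml_app="my-ml-app", agentless_enabled=True)

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# This call is captured automatically as an LLM span; no manual tracing needed.
client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
)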

To trace an LLM application:

  1. Install the LLM Observability SDK.
  2. Configure the SDK by providing the required environment variables in your application startup command, or programmatically in code (see the configuration sketch after this list).
    • Remember to configure your Datadog API key, Datadog site, and ML app name.
  3. Create spans in your LLM application code to represent your application’s operations.
  4. Annotate your spans with input data, output data, metadata (such as temperature), metrics (such as input_tokens), and key-value tags (such as version:1.0.0).
  5. Optionally, add advanced tracing features, such as user sessions.
  6. Run your LLM application.
    • If you used the command-line setup method, launch your application with ddtrace-run (for example, ddtrace-run python app.py).
    • If you used the in-code setup method, run your application as you normally would.
  7. Explore the resulting traces on the LLM Observability traces page, and the resulting metrics on the out-of-the-box LLM Observability dashboard.
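
As referenced in step 2, the sketch below shows both setup methods; the ML app name, API key, and site are placeholder values, and agentless mode is assumed (no local Datadog Agent).

# In-code setup: enable the SDK programmatically at application startup.
from ddtrace.llmobs import LLMObs

LLMObs.enable(
    ml_app="my-ml-app",                # ML app name shown in LLM Observability
    api_key="<YOUR_DATADOG_API_KEY>",  # Datadog API key
    site="datadoghq.com",              # your Datadog site
    agentless_enabled=True,            # send data directly, without a local Agent
)

# Command-line setup (equivalent): export the variables and launch with ddtrace-run.
#   DD_LLMOBS_ENABLED=1 DD_LLMOBS_ML_APP=my-ml-app \
#   DD_API_KEY=<YOUR_DATADOG_API_KEY> DD_SITE=datadoghq.com \
#   DD_LLMOBS_AGENTLESS_ENABLED=1 ddtrace-run python app.py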

Creating spans

The LLM Observability SDK provides two options for creating a span:

  1. Decorators: Use ddtrace.llmobs.decorators.<SPAN_KIND>() as a decorator on the function you’d like to trace, replacing <SPAN_KIND> with the desired span kind.
  2. Inline: Use ddtrace.llmobs.LLMObs.<SPAN_KIND>() as a context manager to trace any inline code, replacing <SPAN_KIND> with the desired span kind.
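
The available span kinds are llm, workflow, agent, task, tool, embedding, and retrieval; see Core Concepts for when to use each.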

The examples below create a workflow span.

Decorator:

from ddtrace.llmobs.decorators import workflow

@workflow
def process_message():
    ... # user application logic
    return

Inline:

from ddtrace.llmobs import LLMObs

def process_message():
    with LLMObs.workflow(name="process_message") as span:
        ... # user application logic
    return

Nesting spans

Starting a new span before the current span is finished automatically creates a parent-child relationship between the two spans: the parent span represents the larger operation, while the child span represents a smaller sub-operation nested within it.

The examples below create a trace with two spans.

Decorator:

from ddtrace.llmobs.decorators import task, workflow

@workflow
def process_message():
    perform_preprocessing()
    ... # user application logic
    return

@task
def perform_preprocessing():
    ... # user application logic
    return

Inline:

from ddtrace.llmobs import LLMObs

def process_message():
    with LLMObs.workflow(name="process_message") as workflow_span:
        with LLMObs.task(name="perform_preprocessing") as task_span:
            ... # user application logic
    return

Annotating spans

To add extra information to a span, such as inputs, outputs, metadata, metrics, or tags, use the LLM Observability SDK’s LLMObs.annotate() method.

The examples below annotate the workflow span created in the above example:

Decorator:

from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import workflow

@workflow(name="process_message")
def process_message():
    ... # user application logic
    LLMObs.annotate(
        input_data="<ARGUMENT>",
        output_data="<OUTPUT>",
        metadata={},
        metrics={"input_tokens": 15, "output_tokens": 24},
        tags={},
    )
    return

Inline:

from ddtrace.llmobs import LLMObs

def process_message():
    with LLMObs.workflow(name="process_message") as span:
        ... # user application logic
        LLMObs.annotate(
            span=span,
            input_data="<ARGUMENT>",
            output_data="<OUTPUT>",
            metadata={},
            metrics={"input_tokens": 15, "output_tokens": 24},
            tags={},
        )
    return
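
Note that in the decorator example, LLMObs.annotate() is called without a span argument, so it applies to the current active span; the inline example passes the span explicitly.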

For more information on alternative tracing methods and tracing features, see the SDK documentation.

Advanced tracing

Depending on the complexity of your LLM application, you can also: