Create or update a custom evaluator configuration

Note: This endpoint is in preview and is subject to change. If you have any feedback, contact Datadog support.

PUT https://api.ap1.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.ap2.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.datadoghq.eu/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.ddog-gov.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.us2.ddog-gov.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.us3.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}
PUT https://api.us5.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/{eval_name}

Overview

Create or update a custom LLM Observability evaluator configuration by its name.

Arguments

Path Parameters

Name

Type

Description

eval_name [required]

string

The name of the custom LLM Observability evaluator configuration.
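
As an illustration of how the path parameter fits into the endpoint, the following minimal sketch assembles the request URL for a given Datadog site. The helper name `build_evaluator_url` is hypothetical, and percent-encoding the evaluator name is an assumption made for safety; this page does not state which characters the API accepts in `eval_name`.

```python
from urllib.parse import quote

# Datadog sites that appear in the endpoint list on this page.
KNOWN_SITES = {
    "datadoghq.com",
    "us3.datadoghq.com",
    "us5.datadoghq.com",
    "datadoghq.eu",
    "ap1.datadoghq.com",
    "ap2.datadoghq.com",
    "ddog-gov.com",
    "us2.ddog-gov.com",
}


def build_evaluator_url(eval_name, site="datadoghq.com"):
    """Return the PUT URL for a custom evaluator configuration (hypothetical helper)."""
    if not eval_name:
        raise ValueError("eval_name is required")
    if site not in KNOWN_SITES:
        raise ValueError(f"unknown Datadog site: {site}")
    # Assumption: encode the name so it always stays within a single path segment.
    encoded = quote(eval_name, safe="")
    return f"https://api.{site}/api/unstable/llm-obs/config/evaluators/custom/{encoded}"
```

For example, `build_evaluator_url("my-custom-evaluator", "datadoghq.eu")` produces the EU endpoint for the evaluator used in the examples on this page.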

Request

Body Data (required)

Custom evaluator configuration payload.

Field

Type

Description

data [required]

object

Data object for creating or updating a custom LLM Observability evaluator configuration.

attributes [required]

object

Attributes for creating or updating a custom LLM Observability evaluator configuration.

category

string

Category of the evaluator.

eval_name

string

Name of the custom evaluator. If provided, must match the eval_name path parameter.

llm_judge_config

object

LLM judge configuration for a custom evaluator.

assessment_criteria

object

Criteria used to assess the pass/fail result of a custom evaluator.

max_threshold

double

Maximum numeric threshold for a passing result.

min_threshold

double

Minimum numeric threshold for a passing result.

pass_values

[string]

Specific output values considered as a passing result.

pass_when

boolean

When true, a boolean output of true is treated as passing.

inference_params [required]

object

LLM inference parameters for a custom evaluator.

frequency_penalty

double

Frequency penalty to reduce repetition.

max_tokens

int64

Maximum number of tokens to generate.

presence_penalty

double

Presence penalty to reduce repetition.

temperature

double

Sampling temperature for the LLM.

top_k

int64

Top-k sampling parameter.

top_p

double

Top-p (nucleus) sampling parameter.

last_used_library_prompt_template_name

string

Name of the last library prompt template used.

modified_library_prompt_template

boolean

Whether the library prompt template was modified.

output_schema

object

JSON schema describing the expected output format of the LLM judge.

parsing_type

enum

Output parsing type for a custom LLM judge evaluator. Allowed enum values: structured_output,json

prompt_template

[object]

List of messages forming the LLM judge prompt template.

content

string

Text content of the message.

contents

[object]

Multi-part content blocks for the message.

type [required]

string

Content block type.

value [required]

object

Value of a prompt message content block.

text

string

Text content of the message block.

tool_call

object

A tool call within a prompt message.

arguments

string

JSON-encoded arguments for the tool call.

id

string

Unique identifier of the tool call.

name

string

Name of the tool being called.

type

string

Type of the tool call.

tool_call_result

object

A tool call result within a prompt message.

name

string

Name of the tool that produced this result.

result

string

The result returned by the tool.

tool_id

string

Identifier of the tool call this result corresponds to.

type

string

Type of the tool result.

role [required]

string

Role of the message author.

llm_provider

object

LLM provider configuration for a custom evaluator.

bedrock

object

AWS Bedrock-specific options for LLM provider configuration.

region

string

AWS region for Bedrock.

integration_account_id

string

Integration account identifier.

integration_provider

enum

Name of the LLM integration provider. Allowed enum values: openai,amazon-bedrock,anthropic,azure-openai,vertex-ai,llm-proxy

model_name

string

Name of the LLM model.

vertex_ai

object

Google Vertex AI-specific options for LLM provider configuration.

location

string

Google Cloud region.

project

string

Google Cloud project ID.

target [required]

object

Target application configuration for a custom evaluator.

application_name [required]

string

Name of the ML application this evaluator targets.

enabled [required]

boolean

Whether the evaluator is active for the target application.

eval_scope

enum

Scope at which to evaluate spans. Allowed enum values: span,trace,session

filter

string

Filter expression to select which spans to evaluate.

root_spans_only

boolean

When true, only root spans are evaluated.

sampling_percentage

double

Percentage of traces to evaluate. Must be greater than 0 and at most 100.

id

string

Name of the evaluator. If provided, must match the eval_name path parameter.

type [required]

enum

Type of the custom LLM Observability evaluator configuration resource. Allowed enum values: evaluator_config

{
  "data": {
    "attributes": {
      "category": "Custom",
      "eval_name": "my-custom-evaluator",
      "llm_judge_config": {
        "assessment_criteria": {
          "max_threshold": 1,
          "min_threshold": 0.7,
          "pass_values": [
            "pass",
            "yes"
          ],
          "pass_when": true
        },
        "inference_params": {
          "frequency_penalty": 0,
          "max_tokens": 1024,
          "presence_penalty": 0,
          "temperature": 0.7,
          "top_k": 50,
          "top_p": 1
        },
        "last_used_library_prompt_template_name": "sentiment-analysis-v1",
        "modified_library_prompt_template": false,
        "output_schema": {},
        "parsing_type": "structured_output",
        "prompt_template": [
          {
            "content": "Rate the quality of the following response:",
            "contents": [
              {
                "type": "text",
                "value": {
                  "text": "What is the sentiment of this review?",
                  "tool_call": {
                    "arguments": "{\"location\": \"San Francisco\"}",
                    "id": "call_abc123",
                    "name": "get_weather",
                    "type": "function"
                  },
                  "tool_call_result": {
                    "name": "get_weather",
                    "result": "sunny, 72F",
                    "tool_id": "call_abc123",
                    "type": "function"
                  }
                }
              }
            ],
            "role": "user"
          }
        ]
      },
      "llm_provider": {
        "bedrock": {
          "region": "us-east-1"
        },
        "integration_account_id": "my-account-id",
        "integration_provider": "openai",
        "model_name": "gpt-4o",
        "vertex_ai": {
          "location": "us-central1",
          "project": "my-gcp-project"
        }
      },
      "target": {
        "application_name": "my-llm-app",
        "enabled": true,
        "eval_scope": "span",
        "filter": "@service:my-service",
        "root_spans_only": true,
        "sampling_percentage": 50
      }
    },
    "id": "my-custom-evaluator",
    "type": "evaluator_config"
  }
}
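
The field table above states a handful of constraints that can be checked before sending a request: required fields, allowed enum values, the `sampling_percentage` range, and the rule that `id` and `eval_name` must match the path parameter. The sketch below is a client-side pre-check only. The function name `check_evaluator_payload` and the message wording are hypothetical, and the server may enforce additional rules that are not documented here.

```python
# Allowed enum values copied from the field descriptions on this page.
PARSING_TYPES = {"structured_output", "json"}
INTEGRATION_PROVIDERS = {
    "openai", "amazon-bedrock", "anthropic", "azure-openai", "vertex-ai", "llm-proxy",
}
EVAL_SCOPES = {"span", "trace", "session"}


def check_evaluator_payload(eval_name, payload):
    """Return a list of problems; an empty list means no documented rule was violated."""
    data = payload.get("data")
    if not isinstance(data, dict):
        return ["data is required"]
    problems = []
    if data.get("type") != "evaluator_config":
        problems.append("data.type must be 'evaluator_config'")
    if data.get("id") is not None and data["id"] != eval_name:
        problems.append("data.id must match the eval_name path parameter")

    attrs = data.get("attributes")
    if not isinstance(attrs, dict):
        return problems + ["data.attributes is required"]
    if attrs.get("eval_name") is not None and attrs["eval_name"] != eval_name:
        problems.append("attributes.eval_name must match the eval_name path parameter")

    target = attrs.get("target")
    if not isinstance(target, dict):
        problems.append("attributes.target is required")
    else:
        for key in ("application_name", "enabled"):
            if key not in target:
                problems.append(f"target.{key} is required")
        scope = target.get("eval_scope")
        if scope is not None and scope not in EVAL_SCOPES:
            problems.append("target.eval_scope must be one of span, trace, session")
        pct = target.get("sampling_percentage")
        if pct is not None and not (0 < pct <= 100):
            problems.append("target.sampling_percentage must be greater than 0 and at most 100")

    judge = attrs.get("llm_judge_config")
    if isinstance(judge, dict):
        if "inference_params" not in judge:
            problems.append("llm_judge_config.inference_params is required")
        parsing = judge.get("parsing_type")
        if parsing is not None and parsing not in PARSING_TYPES:
            problems.append("llm_judge_config.parsing_type must be structured_output or json")

    provider = attrs.get("llm_provider")
    if isinstance(provider, dict):
        name = provider.get("integration_provider")
        if name is not None and name not in INTEGRATION_PROVIDERS:
            problems.append("llm_provider.integration_provider is not an allowed value")
    return problems
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem in one pass.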

Response

OK

Bad Request

API error response.

Field

Type

Description

errors [required]

[object]

A list of errors.

detail

string

A human-readable explanation specific to this occurrence of the error.

meta

object

Non-standard meta-information about the error.

source

object

References to the source of the error.

header

string

A string indicating the name of a single request header which caused the error.

parameter

string

A string indicating which URI query parameter caused the error.

pointer

string

A JSON pointer to the value in the request document that caused the error.

status

string

Status code of the response.

title

string

Short human-readable summary of the error.

{
  "errors": [
    {
      "detail": "Missing required attribute in body",
      "meta": {},
      "source": {
        "header": "Authorization",
        "parameter": "limit",
        "pointer": "/data/attributes/title"
      },
      "status": "400",
      "title": "Bad Request"
    }
  ]
}

Unauthorized

API error response.

Field

Type

Description

errors [required]

[object]

A list of errors.

detail

string

A human-readable explanation specific to this occurrence of the error.

meta

object

Non-standard meta-information about the error.

source

object

References to the source of the error.

header

string

A string indicating the name of a single request header which caused the error.

parameter

string

A string indicating which URI query parameter caused the error.

pointer

string

A JSON pointer to the value in the request document that caused the error.

status

string

Status code of the response.

title

string

Short human-readable summary of the error.

{
  "errors": [
    {
      "detail": "Missing required attribute in body",
      "meta": {},
      "source": {
        "header": "Authorization",
        "parameter": "limit",
        "pointer": "/data/attributes/title"
      },
      "status": "400",
      "title": "Bad Request"
    }
  ]
}

Forbidden

API error response.

Field

Type

Description

errors [required]

[object]

A list of errors.

detail

string

A human-readable explanation specific to this occurrence of the error.

meta

object

Non-standard meta-information about the error.

source

object

References to the source of the error.

header

string

A string indicating the name of a single request header which caused the error.

parameter

string

A string indicating which URI query parameter caused the error.

pointer

string

A JSON pointer to the value in the request document that caused the error.

status

string

Status code of the response.

title

string

Short human-readable summary of the error.

{
  "errors": [
    {
      "detail": "Missing required attribute in body",
      "meta": {},
      "source": {
        "header": "Authorization",
        "parameter": "limit",
        "pointer": "/data/attributes/title"
      },
      "status": "400",
      "title": "Bad Request"
    }
  ]
}

Not Found

API error response.

Field

Type

Description

errors [required]

[object]

A list of errors.

detail

string

A human-readable explanation specific to this occurrence of the error.

meta

object

Non-standard meta-information about the error.

source

object

References to the source of the error.

header

string

A string indicating the name of a single request header which caused the error.

parameter

string

A string indicating which URI query parameter caused the error.

pointer

string

A JSON pointer to the value in the request document that caused the error.

status

string

Status code of the response.

title

string

Short human-readable summary of the error.

{
  "errors": [
    {
      "detail": "Missing required attribute in body",
      "meta": {},
      "source": {
        "header": "Authorization",
        "parameter": "limit",
        "pointer": "/data/attributes/title"
      },
      "status": "400",
      "title": "Bad Request"
    }
  ]
}

Unprocessable Entity

API error response.

Field

Type

Description

errors [required]

[object]

A list of errors.

detail

string

A human-readable explanation specific to this occurrence of the error.

meta

object

Non-standard meta-information about the error.

source

object

References to the source of the error.

header

string

A string indicating the name of a single request header which caused the error.

parameter

string

A string indicating which URI query parameter caused the error.

pointer

string

A JSON pointer to the value in the request document that caused the error.

status

string

Status code of the response.

title

string

Short human-readable summary of the error.

{
  "errors": [
    {
      "detail": "Missing required attribute in body",
      "meta": {},
      "source": {
        "header": "Authorization",
        "parameter": "limit",
        "pointer": "/data/attributes/title"
      },
      "status": "400",
      "title": "Bad Request"
    }
  ]
}

Too Many Requests

API error response.

Field

Type

Description

errors [required]

[string]

A list of errors.

{
  "errors": [
    "Bad Request"
  ]
}
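
The error bodies documented above come in two shapes: a list of objects with `title`, `status`, `detail`, and `source`, and, for the rate-limit response, a plain list of strings. The following sketch shows one way a client could flatten both into readable messages. The function name `summarize_errors` and the output format are hypothetical, not part of any Datadog client.

```python
import json


def summarize_errors(body):
    """Turn an error response body (JSON text) into a list of one-line messages."""
    errors = json.loads(body).get("errors", [])
    lines = []
    for err in errors:
        # String entries are passed through unchanged.
        if isinstance(err, str):
            lines.append(err)
            continue
        # Object entries: combine status, title, detail, and the error source.
        line = " ".join(part for part in (err.get("status"), err.get("title")) if part)
        if err.get("detail"):
            line += f": {err['detail']}"
        source = err.get("source") or {}
        where = source.get("pointer") or source.get("parameter") or source.get("header")
        if where:
            line += f" (at {where})"
        lines.append(line)
    return lines
```

The JSON pointer is preferred over the parameter and header fields because it locates the offending value inside the request body, which is usually the most actionable detail for a PUT request.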

Code Example

## default

# Path parameters
export eval_name="my-custom-evaluator"
# Curl command (replace api.datadoghq.com with the API host for your Datadog site)
curl -X PUT "https://api.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/${eval_name}" \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "DD-API-KEY: ${DD_API_KEY}" \
-H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
-d @- << EOF
{
  "data": {
    "attributes": {
      "llm_judge_config": {
        "inference_params": {
          "max_tokens": 1024,
          "temperature": 0.7
        },
        "parsing_type": "structured_output"
      },
      "llm_provider": {
        "integration_provider": "openai",
        "model_name": "gpt-4o"
      },
      "target": {
        "application_name": "my-llm-app",
        "enabled": true,
        "sampling_percentage": 50
      }
    },
    "id": "my-custom-evaluator",
    "type": "evaluator_config"
  }
}
EOF

## Full example with prompt template, output schema, and assessment criteria

# Path parameters
export eval_name="my-custom-evaluator"
# Curl command (replace api.datadoghq.com with the API host for your Datadog site)
curl -X PUT "https://api.datadoghq.com/api/unstable/llm-obs/config/evaluators/custom/${eval_name}" \
-H "Accept: application/json" \
-H "Content-Type: application/json" \
-H "DD-API-KEY: ${DD_API_KEY}" \
-H "DD-APPLICATION-KEY: ${DD_APP_KEY}" \
-d @- << EOF
{
  "data": {
    "attributes": {
      "category": "Custom",
      "eval_name": "my-custom-evaluator",
      "llm_judge_config": {
        "assessment_criteria": {
          "pass_when": false
        },
        "inference_params": {
          "frequency_penalty": 0,
          "max_tokens": 4096,
          "presence_penalty": 0,
          "temperature": 1,
          "top_p": 1
        },
        "output_schema": {
          "name": "boolean_eval",
          "strict": true
        },
        "parsing_type": "structured_output",
        "prompt_template": [
          {
            "content": "You are a judge LLM.",
            "role": "system"
          },
          {
            "content": "{{span_output}}",
            "role": "user"
          }
        ]
      },
      "llm_provider": {
        "integration_account_id": "your-account-uuid",
        "integration_provider": "openai",
        "model_name": "gpt-4o"
      },
      "target": {
        "application_name": "my-llm-app",
        "enabled": true,
        "eval_scope": "span",
        "filter": "@meta.span.kind:llm",
        "root_spans_only": false,
        "sampling_percentage": 100
      }
    },
    "id": "my-custom-evaluator",
    "type": "evaluator_config"
  }
}
EOF

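
For readers who prefer not to shell out to curl, the same PUT request can be assembled with Python's standard library alone. This is a minimal sketch rather than the official client: the helper name `build_put_request` and its `host` default are assumptions, while the headers and path mirror the curl command. The function only builds the request object and sends nothing over the network.

```python
import json
import urllib.request


def build_put_request(eval_name, payload, api_key, app_key, host="api.datadoghq.com"):
    """Build (but do not send) the PUT request for a custom evaluator configuration."""
    url = f"https://{host}/api/unstable/llm-obs/config/evaluators/custom/{eval_name}"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),  # JSON body as bytes
        headers={
            "Accept": "application/json",
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
        method="PUT",
    )
```

To send it, pass the returned object to `urllib.request.urlopen`. For 4xx and 5xx responses, `urlopen` raises `urllib.error.HTTPError`, whose `read()` method returns the error body.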
"""
Create or update a custom evaluator configuration returns "OK" response
"""

from datadog_api_client import ApiClient, Configuration
from datadog_api_client.v2.api.llm_observability_api import LLMObservabilityApi
from datadog_api_client.v2.model.llm_obs_custom_eval_config_assessment_criteria import (
    LLMObsCustomEvalConfigAssessmentCriteria,
)
from datadog_api_client.v2.model.llm_obs_custom_eval_config_bedrock_options import LLMObsCustomEvalConfigBedrockOptions
from datadog_api_client.v2.model.llm_obs_custom_eval_config_eval_scope import LLMObsCustomEvalConfigEvalScope
from datadog_api_client.v2.model.llm_obs_custom_eval_config_inference_params import (
    LLMObsCustomEvalConfigInferenceParams,
)
from datadog_api_client.v2.model.llm_obs_custom_eval_config_integration_provider import (
    LLMObsCustomEvalConfigIntegrationProvider,
)
from datadog_api_client.v2.model.llm_obs_custom_eval_config_llm_judge_config import LLMObsCustomEvalConfigLLMJudgeConfig
from datadog_api_client.v2.model.llm_obs_custom_eval_config_llm_provider import LLMObsCustomEvalConfigLLMProvider
from datadog_api_client.v2.model.llm_obs_custom_eval_config_parsing_type import LLMObsCustomEvalConfigParsingType
from datadog_api_client.v2.model.llm_obs_custom_eval_config_prompt_content import LLMObsCustomEvalConfigPromptContent
from datadog_api_client.v2.model.llm_obs_custom_eval_config_prompt_content_value import (
    LLMObsCustomEvalConfigPromptContentValue,
)
from datadog_api_client.v2.model.llm_obs_custom_eval_config_prompt_message import LLMObsCustomEvalConfigPromptMessage
from datadog_api_client.v2.model.llm_obs_custom_eval_config_prompt_tool_call import LLMObsCustomEvalConfigPromptToolCall
from datadog_api_client.v2.model.llm_obs_custom_eval_config_prompt_tool_result import (
    LLMObsCustomEvalConfigPromptToolResult,
)
from datadog_api_client.v2.model.llm_obs_custom_eval_config_target import LLMObsCustomEvalConfigTarget
from datadog_api_client.v2.model.llm_obs_custom_eval_config_type import LLMObsCustomEvalConfigType
from datadog_api_client.v2.model.llm_obs_custom_eval_config_update_attributes import (
    LLMObsCustomEvalConfigUpdateAttributes,
)
from datadog_api_client.v2.model.llm_obs_custom_eval_config_update_data import LLMObsCustomEvalConfigUpdateData
from datadog_api_client.v2.model.llm_obs_custom_eval_config_update_request import LLMObsCustomEvalConfigUpdateRequest
from datadog_api_client.v2.model.llm_obs_custom_eval_config_vertex_ai_options import (
    LLMObsCustomEvalConfigVertexAIOptions,
)

body = LLMObsCustomEvalConfigUpdateRequest(
    data=LLMObsCustomEvalConfigUpdateData(
        attributes=LLMObsCustomEvalConfigUpdateAttributes(
            category="Custom",
            eval_name="my-custom-evaluator",
            llm_judge_config=LLMObsCustomEvalConfigLLMJudgeConfig(
                assessment_criteria=LLMObsCustomEvalConfigAssessmentCriteria(
                    max_threshold=1.0,
                    min_threshold=0.7,
                    pass_values=[
                        "pass",
                        "yes",
                    ],
                    pass_when=True,
                ),
                inference_params=LLMObsCustomEvalConfigInferenceParams(
                    frequency_penalty=0.0,
                    max_tokens=1024,
                    presence_penalty=0.0,
                    temperature=0.7,
                    top_k=50,
                    top_p=1.0,
                ),
                last_used_library_prompt_template_name="sentiment-analysis-v1",
                modified_library_prompt_template=False,
                output_schema=None,
                parsing_type=LLMObsCustomEvalConfigParsingType.STRUCTURED_OUTPUT,
                prompt_template=[
                    LLMObsCustomEvalConfigPromptMessage(
                        content="Rate the quality of the following response:",
                        contents=[
                            LLMObsCustomEvalConfigPromptContent(
                                type="text",
                                value=LLMObsCustomEvalConfigPromptContentValue(
                                    text="What is the sentiment of this review?",
                                    tool_call=LLMObsCustomEvalConfigPromptToolCall(
                                        arguments='{"location": "San Francisco"}',
                                        id="call_abc123",
                                        name="get_weather",
                                        type="function",
                                    ),
                                    tool_call_result=LLMObsCustomEvalConfigPromptToolResult(
                                        name="get_weather",
                                        result="sunny, 72F",
                                        tool_id="call_abc123",
                                        type="function",
                                    ),
                                ),
                            ),
                        ],
                        role="user",
                    ),
                ],
            ),
            llm_provider=LLMObsCustomEvalConfigLLMProvider(
                bedrock=LLMObsCustomEvalConfigBedrockOptions(
                    region="us-east-1",
                ),
                integration_account_id="my-account-id",
                integration_provider=LLMObsCustomEvalConfigIntegrationProvider.OPENAI,
                model_name="gpt-4o",
                vertex_ai=LLMObsCustomEvalConfigVertexAIOptions(
                    location="us-central1",
                    project="my-gcp-project",
                ),
            ),
            target=LLMObsCustomEvalConfigTarget(
                application_name="my-llm-app",
                enabled=True,
                eval_scope=LLMObsCustomEvalConfigEvalScope.SPAN,
                filter="@service:my-service",
                root_spans_only=True,
                sampling_percentage=50.0,
            ),
        ),
        id="my-custom-evaluator",
        type=LLMObsCustomEvalConfigType.EVALUATOR_CONFIG,
    ),
)

configuration = Configuration()
configuration.unstable_operations["update_llm_obs_custom_eval_config"] = True
with ApiClient(configuration) as api_client:
    api_instance = LLMObservabilityApi(api_client)
    # The path parameter must match the id and eval_name set in the body.
    api_instance.update_llm_obs_custom_eval_config(eval_name="my-custom-evaluator", body=body)

Instructions

First install the library and its dependencies, then save the example to example.py and run the following command, setting DD_SITE to your Datadog site:

DD_SITE="datadoghq.com" DD_API_KEY="<DD_API_KEY>" DD_APP_KEY="<DD_APP_KEY>" python3 "example.py"

# Create or update a custom evaluator configuration returns "OK" response

require "datadog_api_client"
DatadogAPIClient.configure do |config|
  config.unstable_operations["v2.update_llm_obs_custom_eval_config".to_sym] = true
end
api_instance = DatadogAPIClient::V2::LLMObservabilityAPI.new

body = DatadogAPIClient::V2::LLMObsCustomEvalConfigUpdateRequest.new({
  data: DatadogAPIClient::V2::LLMObsCustomEvalConfigUpdateData.new({
    attributes: DatadogAPIClient::V2::LLMObsCustomEvalConfigUpdateAttributes.new({
      category: "Custom",
      eval_name: "my-custom-evaluator",
      llm_judge_config: DatadogAPIClient::V2::LLMObsCustomEvalConfigLLMJudgeConfig.new({
        assessment_criteria: DatadogAPIClient::V2::LLMObsCustomEvalConfigAssessmentCriteria.new({
          max_threshold: 1.0,
          min_threshold: 0.7,
          pass_values: [
            "pass",
            "yes",
          ],
          pass_when: true,
        }),
        inference_params: DatadogAPIClient::V2::LLMObsCustomEvalConfigInferenceParams.new({
          frequency_penalty: 0.0,
          max_tokens: 1024,
          presence_penalty: 0.0,
          temperature: 0.7,
          top_k: 50,
          top_p: 1.0,
        }),
        last_used_library_prompt_template_name: "sentiment-analysis-v1",
        modified_library_prompt_template: false,
        output_schema: nil,
        parsing_type: DatadogAPIClient::V2::LLMObsCustomEvalConfigParsingType::STRUCTURED_OUTPUT,
        prompt_template: [
          DatadogAPIClient::V2::LLMObsCustomEvalConfigPromptMessage.new({
            content: "Rate the quality of the following response:",
            contents: [
              DatadogAPIClient::V2::LLMObsCustomEvalConfigPromptContent.new({
                type: "text",
                value: DatadogAPIClient::V2::LLMObsCustomEvalConfigPromptContentValue.new({
                  text: "What is the sentiment of this review?",
                  tool_call: DatadogAPIClient::V2::LLMObsCustomEvalConfigPromptToolCall.new({
                    arguments: '{"location": "San Francisco"}',
                    id: "call_abc123",
                    name: "get_weather",
                    type: "function",
                  }),
                  tool_call_result: DatadogAPIClient::V2::LLMObsCustomEvalConfigPromptToolResult.new({
                    name: "get_weather",
                    result: "sunny, 72F",
                    tool_id: "call_abc123",
                    type: "function",
                  }),
                }),
              }),
            ],
            role: "user",
          }),
        ],
      }),
      llm_provider: DatadogAPIClient::V2::LLMObsCustomEvalConfigLLMProvider.new({
        bedrock: DatadogAPIClient::V2::LLMObsCustomEvalConfigBedrockOptions.new({
          region: "us-east-1",
        }),
        integration_account_id: "my-account-id",
        integration_provider: DatadogAPIClient::V2::LLMObsCustomEvalConfigIntegrationProvider::OPENAI,
        model_name: "gpt-4o",
        vertex_ai: DatadogAPIClient::V2::LLMObsCustomEvalConfigVertexAIOptions.new({
          location: "us-central1",
          project: "my-gcp-project",
        }),
      }),
      target: DatadogAPIClient::V2::LLMObsCustomEvalConfigTarget.new({
        application_name: "my-llm-app",
        enabled: true,
        eval_scope: DatadogAPIClient::V2::LLMObsCustomEvalConfigEvalScope::SPAN,
        filter: "@service:my-service",
        root_spans_only: true,
        sampling_percentage: 50.0,
      }),
    }),
    id: "my-custom-evaluator",
    type: DatadogAPIClient::V2::LLMObsCustomEvalConfigType::EVALUATOR_CONFIG,
  }),
})
# The path parameter must match the id and eval_name set in the body.
p api_instance.update_llm_obs_custom_eval_config("my-custom-evaluator", body)

Instructions

First install the library and its dependencies, then save the example to example.rb and run the following command, setting DD_SITE to your Datadog site:

DD_SITE="datadoghq.com" DD_API_KEY="<DD_API_KEY>" DD_APP_KEY="<DD_APP_KEY>" ruby "example.rb"

// Create or update a custom evaluator configuration returns "OK" response

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/DataDog/datadog-api-client-go/v2/api/datadog"
	"github.com/DataDog/datadog-api-client-go/v2/api/datadogV2"
)

func main() {
	body := datadogV2.LLMObsCustomEvalConfigUpdateRequest{
		Data: datadogV2.LLMObsCustomEvalConfigUpdateData{
			Attributes: datadogV2.LLMObsCustomEvalConfigUpdateAttributes{
				Category: datadog.PtrString("Custom"),
				EvalName: datadog.PtrString("my-custom-evaluator"),
				LlmJudgeConfig: &datadogV2.LLMObsCustomEvalConfigLLMJudgeConfig{
					AssessmentCriteria: &datadogV2.LLMObsCustomEvalConfigAssessmentCriteria{
						MaxThreshold: *datadog.NewNullableFloat64(datadog.PtrFloat64(1.0)),
						MinThreshold: *datadog.NewNullableFloat64(datadog.PtrFloat64(0.7)),
						PassValues: *datadog.NewNullableList(&[]string{
							"pass",
							"yes",
						}),
						PassWhen: *datadog.NewNullableBool(datadog.PtrBool(true)),
					},
					InferenceParams: datadogV2.LLMObsCustomEvalConfigInferenceParams{
						FrequencyPenalty: datadog.PtrFloat64(0.0),
						MaxTokens:        datadog.PtrInt64(1024),
						PresencePenalty:  datadog.PtrFloat64(0.0),
						Temperature:      datadog.PtrFloat64(0.7),
						TopK:             datadog.PtrInt64(50),
						TopP:             datadog.PtrFloat64(1.0),
					},
					LastUsedLibraryPromptTemplateName: *datadog.NewNullableString(datadog.PtrString("sentiment-analysis-v1")),
					ModifiedLibraryPromptTemplate:     *datadog.NewNullableBool(datadog.PtrBool(false)),
					OutputSchema:                      nil,
					ParsingType:                       datadogV2.LLMOBSCUSTOMEVALCONFIGPARSINGTYPE_STRUCTURED_OUTPUT.Ptr(),
					PromptTemplate: []datadogV2.LLMObsCustomEvalConfigPromptMessage{
						{
							Content: datadog.PtrString("Rate the quality of the following response:"),
							Contents: []datadogV2.LLMObsCustomEvalConfigPromptContent{
								{
									Type: "text",
									Value: datadogV2.LLMObsCustomEvalConfigPromptContentValue{
										Text: datadog.PtrString("What is the sentiment of this review?"),
										ToolCall: &datadogV2.LLMObsCustomEvalConfigPromptToolCall{
											Arguments: datadog.PtrString(`{"location": "San Francisco"}`),
											Id:        datadog.PtrString("call_abc123"),
											Name:      datadog.PtrString("get_weather"),
											Type:      datadog.PtrString("function"),
										},
										ToolCallResult: &datadogV2.LLMObsCustomEvalConfigPromptToolResult{
											Name:   datadog.PtrString("get_weather"),
											Result: datadog.PtrString("sunny, 72F"),
											ToolId: datadog.PtrString("call_abc123"),
											Type:   datadog.PtrString("function"),
										},
									},
								},
							},
							Role: "user",
						},
					},
				},
				LlmProvider: &datadogV2.LLMObsCustomEvalConfigLLMProvider{
					Bedrock: &datadogV2.LLMObsCustomEvalConfigBedrockOptions{
						Region: datadog.PtrString("us-east-1"),
					},
					IntegrationAccountId: datadog.PtrString("my-account-id"),
					IntegrationProvider:  datadogV2.LLMOBSCUSTOMEVALCONFIGINTEGRATIONPROVIDER_OPENAI.Ptr(),
					ModelName:            datadog.PtrString("gpt-4o"),
					VertexAi: &datadogV2.LLMObsCustomEvalConfigVertexAIOptions{
						Location: datadog.PtrString("us-central1"),
						Project:  datadog.PtrString("my-gcp-project"),
					},
				},
				Target: datadogV2.LLMObsCustomEvalConfigTarget{
					ApplicationName:    "my-llm-app",
					Enabled:            true,
					EvalScope:          datadogV2.LLMOBSCUSTOMEVALCONFIGEVALSCOPE_SPAN.Ptr(),
					Filter:             *datadog.NewNullableString(datadog.PtrString("@service:my-service")),
					RootSpansOnly:      *datadog.NewNullableBool(datadog.PtrBool(true)),
					SamplingPercentage: *datadog.NewNullableFloat64(datadog.PtrFloat64(50.0)),
				},
			},
			Id:   datadog.PtrString("my-custom-evaluator"),
			Type: datadogV2.LLMOBSCUSTOMEVALCONFIGTYPE_EVALUATOR_CONFIG,
		},
	}
	ctx := datadog.NewDefaultContext(context.Background())
	configuration := datadog.NewConfiguration()
	configuration.SetUnstableOperationEnabled("v2.UpdateLLMObsCustomEvalConfig", true)
	apiClient := datadog.NewAPIClient(configuration)
	api := datadogV2.NewLLMObservabilityApi(apiClient)
	// The path parameter must match the Id and EvalName set in the body.
	r, err := api.UpdateLLMObsCustomEvalConfig(ctx, "my-custom-evaluator", body)

	if err != nil {
		fmt.Fprintf(os.Stderr, "Error when calling `LLMObservabilityApi.UpdateLLMObsCustomEvalConfig`: %v\n", err)
		fmt.Fprintf(os.Stderr, "Full HTTP response: %v\n", r)
	}
}

Instructions

First install the library and its dependencies, then save the example to main.go and run the following command, setting DD_SITE to your Datadog site:

DD_SITE="datadoghq.com" DD_API_KEY="<DD_API_KEY>" DD_APP_KEY="<DD_APP_KEY>" go run "main.go"

// Create or update a custom evaluator configuration returns "OK" response

import com.datadog.api.client.ApiClient;
import com.datadog.api.client.ApiException;
import com.datadog.api.client.v2.api.LlmObservabilityApi;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigAssessmentCriteria;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigBedrockOptions;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigEvalScope;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigInferenceParams;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigIntegrationProvider;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigLLMJudgeConfig;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigLLMProvider;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigParsingType;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigPromptContent;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigPromptContentValue;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigPromptMessage;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigPromptToolCall;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigPromptToolResult;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigTarget;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigType;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigUpdateAttributes;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigUpdateData;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigUpdateRequest;
import com.datadog.api.client.v2.model.LLMObsCustomEvalConfigVertexAIOptions;
import java.util.Arrays;
import java.util.Collections;

public class Example {
  public static void main(String[] args) {
    ApiClient defaultClient = ApiClient.getDefaultApiClient();
    defaultClient.setUnstableOperationEnabled("v2.updateLLMObsCustomEvalConfig", true);
    LlmObservabilityApi apiInstance = new LlmObservabilityApi(defaultClient);

    LLMObsCustomEvalConfigUpdateRequest body =
        new LLMObsCustomEvalConfigUpdateRequest()
            .data(
                new LLMObsCustomEvalConfigUpdateData()
                    .attributes(
                        new LLMObsCustomEvalConfigUpdateAttributes()
                            .category("Custom")
                            .evalName("my-custom-evaluator")
                            .llmJudgeConfig(
                                new LLMObsCustomEvalConfigLLMJudgeConfig()
                                    .assessmentCriteria(
                                        new LLMObsCustomEvalConfigAssessmentCriteria()
                                            .maxThreshold(1.0)
                                            .minThreshold(0.7)
                                            .passValues(Arrays.asList("pass", "yes"))
                                            .passWhen(true))
                                    .inferenceParams(
                                        new LLMObsCustomEvalConfigInferenceParams()
                                            .maxTokens(1024L)
                                            .temperature(0.7)
                                            .topK(50L)
                                            .topP(1.0))
                                    .lastUsedLibraryPromptTemplateName("sentiment-analysis-v1")
                                    .modifiedLibraryPromptTemplate(false)
                                    .outputSchema(null)
                                    .parsingType(
                                        LLMObsCustomEvalConfigParsingType.STRUCTURED_OUTPUT)
                                    .promptTemplate(
                                        Collections.singletonList(
                                            new LLMObsCustomEvalConfigPromptMessage()
                                                .content(
                                                    "Rate the quality of the following response:")
                                                .contents(
                                                    Collections.singletonList(
                                                        new LLMObsCustomEvalConfigPromptContent()
                                                            .type("text")
                                                            .value(
                                                                new LLMObsCustomEvalConfigPromptContentValue()
                                                                    .text(
                                                                        "What is the sentiment of"
                                                                            + " this review?")
                                                                    .toolCall(
                                                                        new LLMObsCustomEvalConfigPromptToolCall()
                                                                            .arguments(
                                                                                """
{"location": "San Francisco"}
""")
                                                                            .id("call_abc123")
                                                                            .name("get_weather")
                                                                            .type("function"))
                                                                    .toolCallResult(
                                                                        new LLMObsCustomEvalConfigPromptToolResult()
                                                                            .name("get_weather")
                                                                            .result("sunny, 72F")
                                                                            .toolId("call_abc123")
                                                                            .type("function")))))
                                                .role("user"))))
                            .llmProvider(
                                new LLMObsCustomEvalConfigLLMProvider()
                                    .bedrock(
                                        new LLMObsCustomEvalConfigBedrockOptions()
                                            .region("us-east-1"))
                                    .integrationAccountId("my-account-id")
                                    .integrationProvider(
                                        LLMObsCustomEvalConfigIntegrationProvider.OPENAI)
                                    .modelName("gpt-4o")
                                    .vertexAi(
                                        new LLMObsCustomEvalConfigVertexAIOptions()
                                            .location("us-central1")
                                            .project("my-gcp-project")))
                            .target(
                                new LLMObsCustomEvalConfigTarget()
                                    .applicationName("my-llm-app")
                                    .enabled(true)
                                    .evalScope(LLMObsCustomEvalConfigEvalScope.SPAN)
                                    .filter("@service:my-service")
                                    .rootSpansOnly(true)
                                    .samplingPercentage(50.0)))
                    .id("my-custom-evaluator")
                    .type(LLMObsCustomEvalConfigType.EVALUATOR_CONFIG));

    try {
      apiInstance.updateLLMObsCustomEvalConfig("my-custom-evaluator", body);
    } catch (ApiException e) {
      System.err.println("Exception when calling LlmObservabilityApi#updateLLMObsCustomEvalConfig");
      System.err.println("Status code: " + e.getCode());
      System.err.println("Reason: " + e.getResponseBody());
      System.err.println("Response headers: " + e.getResponseHeaders());
      e.printStackTrace();
    }
  }
}

Instructions

First install the library and its dependencies, then save the example to Example.java and run the following command, setting DD_SITE to your Datadog site (datadoghq.com, us3.datadoghq.com, us5.datadoghq.com, datadoghq.eu, ap1.datadoghq.com, ap2.datadoghq.com, ddog-gov.com, or us2.ddog-gov.com):

DD_SITE="datadoghq.com" DD_API_KEY="<DD_API_KEY>" DD_APP_KEY="<DD_APP_KEY>" java "Example.java"
// Create or update a custom evaluator configuration returns "OK" response
use datadog_api_client::datadog;
use datadog_api_client::datadogV2::api_llm_observability::LLMObservabilityAPI;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigAssessmentCriteria;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigBedrockOptions;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigEvalScope;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigInferenceParams;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigIntegrationProvider;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigLLMJudgeConfig;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigLLMProvider;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigParsingType;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigPromptContent;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigPromptContentValue;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigPromptMessage;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigPromptToolCall;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigPromptToolResult;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigTarget;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigType;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigUpdateAttributes;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigUpdateData;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigUpdateRequest;
use datadog_api_client::datadogV2::model::LLMObsCustomEvalConfigVertexAIOptions;

#[tokio::main]
async fn main() {
    let body = LLMObsCustomEvalConfigUpdateRequest::new(
        LLMObsCustomEvalConfigUpdateData::new(
            LLMObsCustomEvalConfigUpdateAttributes::new(
                LLMObsCustomEvalConfigTarget::new("my-llm-app".to_string(), true)
                    .eval_scope(LLMObsCustomEvalConfigEvalScope::SPAN)
                    .filter(Some("@service:my-service".to_string()))
                    .root_spans_only(Some(true))
                    .sampling_percentage(Some(50.0 as f64)),
            )
            .category("Custom".to_string())
            .eval_name("my-custom-evaluator".to_string())
            .llm_judge_config(
                LLMObsCustomEvalConfigLLMJudgeConfig::new(
                    LLMObsCustomEvalConfigInferenceParams::new()
                        .frequency_penalty(0.0 as f64)
                        .max_tokens(1024)
                        .presence_penalty(0.0 as f64)
                        .temperature(0.7 as f64)
                        .top_k(50)
                        .top_p(1.0 as f64),
                )
                .assessment_criteria(
                    LLMObsCustomEvalConfigAssessmentCriteria::new()
                        .max_threshold(Some(1.0 as f64))
                        .min_threshold(Some(0.7 as f64))
                        .pass_values(Some(vec!["pass".to_string(), "yes".to_string()]))
                        .pass_when(Some(true)),
                )
                .last_used_library_prompt_template_name(Some("sentiment-analysis-v1".to_string()))
                .modified_library_prompt_template(Some(false))
                .output_schema(None)
                .parsing_type(LLMObsCustomEvalConfigParsingType::STRUCTURED_OUTPUT)
                .prompt_template(vec![LLMObsCustomEvalConfigPromptMessage::new(
                    "user".to_string(),
                )
                .content("Rate the quality of the following response:".to_string())
                .contents(vec![LLMObsCustomEvalConfigPromptContent::new(
                    "text".to_string(),
                    LLMObsCustomEvalConfigPromptContentValue::new()
                        .text("What is the sentiment of this review?".to_string())
                        .tool_call(
                            LLMObsCustomEvalConfigPromptToolCall::new()
                                .arguments(r#"{"location": "San Francisco"}"#.to_string())
                                .id("call_abc123".to_string())
                                .name("get_weather".to_string())
                                .type_("function".to_string()),
                        )
                        .tool_call_result(
                            LLMObsCustomEvalConfigPromptToolResult::new()
                                .name("get_weather".to_string())
                                .result("sunny, 72F".to_string())
                                .tool_id("call_abc123".to_string())
                                .type_("function".to_string()),
                        ),
                )])]),
            )
            .llm_provider(
                LLMObsCustomEvalConfigLLMProvider::new()
                    .bedrock(
                        LLMObsCustomEvalConfigBedrockOptions::new().region("us-east-1".to_string()),
                    )
                    .integration_account_id("my-account-id".to_string())
                    .integration_provider(LLMObsCustomEvalConfigIntegrationProvider::OPENAI)
                    .model_name("gpt-4o".to_string())
                    .vertex_ai(
                        LLMObsCustomEvalConfigVertexAIOptions::new()
                            .location("us-central1".to_string())
                            .project("my-gcp-project".to_string()),
                    ),
            ),
            LLMObsCustomEvalConfigType::EVALUATOR_CONFIG,
        )
        .id("my-custom-evaluator".to_string()),
    );
    let mut configuration = datadog::Configuration::new();
    configuration.set_unstable_operation_enabled("v2.UpdateLLMObsCustomEvalConfig", true);
    let api = LLMObservabilityAPI::with_config(configuration);
    let resp = api
        .update_llm_obs_custom_eval_config("my-custom-evaluator".to_string(), body)
        .await;
    if let Ok(value) = resp {
        println!("{:#?}", value);
    } else {
        println!("{:#?}", resp.unwrap_err());
    }
}

Instructions

First install the library and its dependencies, then save the example to src/main.rs and run the following command, setting DD_SITE to your Datadog site (datadoghq.com, us3.datadoghq.com, us5.datadoghq.com, datadoghq.eu, ap1.datadoghq.com, ap2.datadoghq.com, ddog-gov.com, or us2.ddog-gov.com):

DD_SITE="datadoghq.com" DD_API_KEY="<DD_API_KEY>" DD_APP_KEY="<DD_APP_KEY>" cargo run
/**
 * Create or update a custom evaluator configuration returns "OK" response
 */

import { client, v2 } from "@datadog/datadog-api-client";

const configuration = client.createConfiguration();
configuration.unstableOperations["v2.updateLLMObsCustomEvalConfig"] = true;
const apiInstance = new v2.LLMObservabilityApi(configuration);

const params: v2.LLMObservabilityApiUpdateLLMObsCustomEvalConfigRequest = {
  body: {
    data: {
      attributes: {
        category: "Custom",
        evalName: "my-custom-evaluator",
        llmJudgeConfig: {
          assessmentCriteria: {
            maxThreshold: 1.0,
            minThreshold: 0.7,
            passValues: ["pass", "yes"],
            passWhen: true,
          },
          inferenceParams: {
            frequencyPenalty: 0.0,
            maxTokens: 1024,
            presencePenalty: 0.0,
            temperature: 0.7,
            topK: 50,
            topP: 1.0,
          },
          lastUsedLibraryPromptTemplateName: "sentiment-analysis-v1",
          modifiedLibraryPromptTemplate: false,
          outputSchema: undefined,
          parsingType: "structured_output",
          promptTemplate: [
            {
              content: "Rate the quality of the following response:",
              contents: [
                {
                  type: "text",
                  value: {
                    text: "What is the sentiment of this review?",
                    toolCall: {
                      arguments: `{"location": "San Francisco"}`,
                      id: "call_abc123",
                      name: "get_weather",
                      type: "function",
                    },
                    toolCallResult: {
                      name: "get_weather",
                      result: "sunny, 72F",
                      toolId: "call_abc123",
                      type: "function",
                    },
                  },
                },
              ],
              role: "user",
            },
          ],
        },
        llmProvider: {
          bedrock: {
            region: "us-east-1",
          },
          integrationAccountId: "my-account-id",
          integrationProvider: "openai",
          modelName: "gpt-4o",
          vertexAi: {
            location: "us-central1",
            project: "my-gcp-project",
          },
        },
        target: {
          applicationName: "my-llm-app",
          enabled: true,
          evalScope: "span",
          filter: "@service:my-service",
          rootSpansOnly: true,
          samplingPercentage: 50.0,
        },
      },
      id: "my-custom-evaluator",
      type: "evaluator_config",
    },
  },
  evalName: "my-custom-evaluator",
};

apiInstance
  .updateLLMObsCustomEvalConfig(params)
  .then((data: any) => {
    console.log(
      "API called successfully. Returned data: " + JSON.stringify(data)
    );
  })
  .catch((error: any) => console.error(error));

Instructions

First install the library and its dependencies, then save the example to example.ts and run the following command, setting DD_SITE to your Datadog site (datadoghq.com, us3.datadoghq.com, us5.datadoghq.com, datadoghq.eu, ap1.datadoghq.com, ap2.datadoghq.com, ddog-gov.com, or us2.ddog-gov.com):

DD_SITE="datadoghq.com" DD_API_KEY="<DD_API_KEY>" DD_APP_KEY="<DD_APP_KEY>" tsc "example.ts"