---
title: Evaluation compatibility
description: Learn about the compatibility requirements for evaluations.
breadcrumbs: Docs > LLM Observability > Evaluations > Evaluation compatibility
---

# Evaluation compatibility

{% callout %}
# Important note for users on the following Datadog sites: app.ddog-gov.com

{% alert level="danger" %}
This product is not supported for your selected [Datadog site](https://docs.datadoghq.com/getting_started/site.md).
{% /alert %}

{% /callout %}

## Evaluation compatibility{% #evaluation-compatibility %}

The supported third-party LLM providers are OpenAI, Azure OpenAI, Anthropic, Amazon Bedrock, Vertex AI, and AI Gateway.

### Managed evaluations{% #managed-evaluations %}

Managed evaluations are supported for the following configurations.

| Evaluation                                                                                                             | DD-trace version | LLM Provider | Applicable span |
| ---------------------------------------------------------------------------------------------------------------------- | ---------------- | ------------ | --------------- |
| [Language Mismatch](https://docs.datadoghq.com/llm_observability/evaluations/managed_evaluations.md#language-mismatch) | Fully supported  | Self-hosted  | All span kinds  |

### Custom LLM-as-a-judge evaluations{% #custom-llm-as-a-judge-evaluations %}

Custom LLM-as-a-judge evaluations are supported for the following configurations.

| Evaluation                                                                                                                                | DD-trace version | LLM Provider                  | Applicable span |
| ----------------------------------------------------------------------------------------------------------------------------------------- | ---------------- | ----------------------------- | --------------- |
| [Boolean](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations.md#define-the-evaluation-output)     | Fully supported  | All third-party LLM providers | All span kinds  |
| [Score](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations.md#define-the-evaluation-output)       | Fully supported  | All third-party LLM providers | All span kinds  |
| [Categorical](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations.md#define-the-evaluation-output) | Fully supported  | All third-party LLM providers | All span kinds  |
| [JSON](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations.md#define-the-evaluation-output)        | Fully supported  | All third-party LLM providers | All span kinds  |

#### Template LLM-as-a-judge evaluations{% #template-llm-as-a-judge-evaluations %}

Existing templates for custom LLM-as-a-judge evaluations are supported for the following configurations.

| Evaluation                                                                                                                                                                | DD-trace version | LLM Provider                  | Applicable span |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------- | ----------------------------- | --------------- |
| [Failure to Answer](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#failure-to-answer)                 | Fully supported  | All third-party LLM providers | All span kinds  |
| [Hallucination](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#hallucination)                         | Fully supported  | All third-party LLM providers | LLM only        |
| [Sentiment](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#sentiment)                                 | Fully supported  | All third-party LLM providers | All span kinds  |
| [Toxicity](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#toxicity)                                   | Fully supported  | All third-party LLM providers | All span kinds  |
| [Prompt Injection](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#prompt-injection)                   | Fully supported  | All third-party LLM providers | All span kinds  |
| [Topic Relevancy](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#topic-relevancy)                     | Fully supported  | All third-party LLM providers | All span kinds  |
| [Tool Selection](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#tool-selection)                       | Fully supported  | All third-party LLM providers | LLM only        |
| [Tool Argument Correctness](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#tool-argument-correctness) | Fully supported  | All third-party LLM providers | LLM only        |
| [Goal Completeness](https://docs.datadoghq.com/llm_observability/evaluations/custom_llm_as_a_judge_evaluations/template_evaluations.md#goal-completeness)                 | Fully supported  | All third-party LLM providers | LLM only        |
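
The span applicability in the template table above can be checked programmatically before you configure an evaluation. The following is a minimal sketch, not part of any Datadog SDK: the dictionary is transcribed from the table, while the helper name `template_applies` and the lowercase span kind strings (such as `"llm"` and `"workflow"`) are assumptions made for illustration.

```python
# Span scope per template evaluation, transcribed from the table above.
LLM_ONLY = "LLM only"
ALL_SPAN_KINDS = "All span kinds"

TEMPLATE_EVALUATION_SPANS = {
    "Failure to Answer": ALL_SPAN_KINDS,
    "Hallucination": LLM_ONLY,
    "Sentiment": ALL_SPAN_KINDS,
    "Toxicity": ALL_SPAN_KINDS,
    "Prompt Injection": ALL_SPAN_KINDS,
    "Topic Relevancy": ALL_SPAN_KINDS,
    "Tool Selection": LLM_ONLY,
    "Tool Argument Correctness": LLM_ONLY,
    "Goal Completeness": LLM_ONLY,
}


def template_applies(evaluation: str, span_kind: str) -> bool:
    """Return True if the template evaluation can run on the given span kind.

    `span_kind` is a hypothetical lowercase label; "llm" is assumed to
    identify LLM spans. Unknown evaluation names raise KeyError.
    """
    scope = TEMPLATE_EVALUATION_SPANS[evaluation]
    return scope == ALL_SPAN_KINDS or span_kind == "llm"
```

For example, `template_applies("Hallucination", "workflow")` is false because Hallucination is limited to LLM spans, while `template_applies("Toxicity", "workflow")` is true.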
