Overview
You can identify drift in your LLM applications by visualizing trace data in clusters on the Clusters page. Select an application configured with LLM Observability to view cluster information.
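If your application is not yet configured, instrumenting it with the LLM Observability SDK is what populates these traces. The minimal sketch below assumes the Python SDK (ddtrace); my-llm-app is a placeholder name, and the exact enable() options for your ddtrace version should be confirmed in the SDK reference.

```python
# Minimal sketch, assuming the ddtrace Python SDK: enable LLM Observability so this
# application's traces show up in Datadog (and therefore on the Clusters page).
# "my-llm-app" is a placeholder; Datadog Agent or API-key setup is handled separately
# through environment variables.
from ddtrace.llmobs import LLMObs

LLMObs.enable(ml_app="my-llm-app")
```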
Cluster Maps display inputs or outputs, grouped by topic. Inputs and outputs are clustered separately. Topics are determined by converting the selected inputs or outputs into high-dimensional text embeddings, clustering those embeddings, and then projecting them into a 2D space.
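The sketch below is a rough, generic illustration of that idea, not Datadog's implementation: TF-IDF vectors stand in for a real embedding model, and KMeans and PCA stand in for whatever clustering and projection methods the product actually uses.

```python
# Illustrative sketch only, not Datadog's implementation: represent each text as a
# high-dimensional vector, group the vectors by topic, then project them into 2D.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA

inputs = [
    "How do I reset my password?",
    "I forgot my password, please help",
    "What is your refund policy?",
    "Can I get a refund for my last order?",
]

# High-dimensional vectors for each input (stand-in for real text embeddings).
embeddings = TfidfVectorizer().fit_transform(inputs).toarray()

# Group the vectors into topic clusters.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embeddings)

# Project the high-dimensional vectors into a 2D space for plotting.
points_2d = PCA(n_components=2).fit_transform(embeddings)

for text, label, (x, y) in zip(inputs, labels, points_2d):
    print(f"cluster={label} ({x:+.2f}, {y:+.2f}) {text!r}")
```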
You can visualize the clusters by using a Box Packing or Scatter Plot layout.
- Box Packing gives you a grouped view of each of the clusters and overlays any metrics or evaluations on every trace.
- Scatter Plot, on the other hand, allows you to view the high-dimensional text embeddings in a 2D space, although the distances between traces may be misleading due to projection distortion.
Cluster Maps provide an overview of each cluster’s performance across operational metrics, such as error types and latency, and out-of-the-box or custom evaluations, enabling you to identify trends such as topic drift and other quality issues.
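If you want your own quality signal available as a cluster overlay, custom evaluations can be attached to spans from the SDK. The sketch below is an assumption-heavy example: answer_relevance is a hypothetical label, the 0.87 score is made up, and the exact submit_evaluation() signature should be verified against the SDK documentation for your ddtrace version.

```python
# Hedged sketch: trace a workflow and submit a custom evaluation for its span.
from ddtrace.llmobs import LLMObs
from ddtrace.llmobs.decorators import workflow

LLMObs.enable(ml_app="my-llm-app")  # placeholder application name

@workflow
def answer_question(question: str) -> str:
    answer = "..."  # call your LLM provider here
    # Attach the input and output to the active LLM Observability span.
    LLMObs.annotate(input_data=question, output_data=answer)
    # Submit a custom evaluation tied to this span; "answer_relevance" is a
    # hypothetical label and 0.87 is a placeholder score.
    LLMObs.submit_evaluation(
        span_context=LLMObs.export_span(),
        label="answer_relevance",
        metric_type="score",
        value=0.87,
    )
    return answer

answer_question("What is the refund policy?")
```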
Search and manage clusters
Customize your search query and sorting options to narrow down the clusters based on specific criteria, such as evaluation metrics or time periods, for more targeted analysis.
- Select inputs or outputs from the dropdown menu to see clusters for inputs or outputs grouped by topic.
- Select an evaluation type or an evaluation score to color-code the clusters. For example, Output Sentiment for “What is the sentiment of the output?” or duration for “How long does it take for an LLM to generate an output (in nanoseconds)?”
- Select a field to sort the clusters by: time, duration, or color. Then, select desc or asc to set the order.
Select a topic cluster from the list to examine how inputs or outputs about specific topics perform against other topics for each metric or evaluation. You can also see individual prompts and responses for each cluster. For example, you can get an overview of your slowest topics when you overlay by duration.