LiteLLM

Supported OS: Linux, Windows, macOS

Integration version 1.0.0

Overview

This check monitors LiteLLM through the Datadog Agent.

LiteLLM is an open source LLM gateway (proxy server) that exposes a unified, OpenAI-compatible API for calling many LLM providers, with built-in routing, load balancing, fallbacks, spend tracking, and budget and rate limits.

This integration collects metrics from the LiteLLM proxy so you can monitor:

  • Request volume, latency, and failures for LLM API calls made through the proxy.
  • Input and output token usage and spend.
  • Remaining budgets and rate limits for API keys, teams, and providers.
  • Deployment health, cooldowns, and fallback behavior, plus the health of backing services such as Redis and Postgres.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
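
For example, on Kubernetes you can supply the same check configuration through Autodiscovery pod annotations. The snippet below is a minimal sketch, assuming the LiteLLM proxy container is named litellm, listens on port 4000, and exposes Prometheus metrics at /metrics; adjust these to match your deployment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: litellm
  annotations:
    ## Autodiscovery check configuration. The "litellm" identifier must match
    ## the container name in the pod spec below; the port and metrics path
    ## are assumptions about how you run the LiteLLM proxy.
    ad.datadoghq.com/litellm.checks: |
      {
        "litellm": {
          "init_config": {},
          "instances": [
            {"openmetrics_endpoint": "http://%%host%%:4000/metrics"}
          ]
        }
      }
spec:
  containers:
    - name: litellm
      image: ghcr.io/berriai/litellm:main-latest  ## image shown for illustration
```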

Installation

The LiteLLM check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

  1. Edit the litellm.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your LiteLLM performance data. See the sample litellm.d/conf.yaml for all available configuration options, and the minimal sketch after this list.

  2. Restart the Agent.
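
A minimal litellm.d/conf.yaml for step 1 might look like the following sketch. It assumes the LiteLLM proxy runs on the same host, listens on its default port 4000, and serves Prometheus metrics at /metrics; check your deployment for the actual endpoint.

```yaml
init_config:

instances:
    ## Assumed endpoint: LiteLLM proxy on localhost:4000 exposing
    ## Prometheus metrics at /metrics. Adjust to your deployment.
  - openmetrics_endpoint: http://localhost:4000/metrics
```

Note that the LiteLLM proxy only publishes these metrics when its Prometheus integration is enabled (typically by adding the prometheus callback in the LiteLLM proxy configuration; see the LiteLLM documentation for your version). After editing the file, restart the Agent as in step 2; on systemd-based Linux hosts, for example, run `sudo systemctl restart datadog-agent`.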

Validation

Run the Agent’s status subcommand and look for litellm under the Checks section.
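
For example, on a Linux host:

```shell
## Look for "litellm" under the Checks section of the output.
sudo datadog-agent status
```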

Data Collected

Metrics

litellm.api.key.budget.remaining_hours.metric
(gauge)
Remaining hours until the API key budget is reset
litellm.api.key.max_budget.metric
(gauge)
Maximum budget set for the API key
litellm.auth.failed_requests.count
(count)
Total failed requests for the auth service
litellm.auth.latency.bucket
(count)
Latency for auth service
litellm.auth.latency.count
(count)
Latency for auth service
litellm.auth.latency.sum
(count)
Latency for auth service
litellm.auth.total_requests.count
(count)
Total requests for the auth service
litellm.batch_write_to_db.failed_requests.count
(count)
Total failed requests for the batch_write_to_db service
litellm.batch_write_to_db.latency.bucket
(count)
Latency for batch_write_to_db service
litellm.batch_write_to_db.latency.count
(count)
Latency for batch_write_to_db service
litellm.batch_write_to_db.latency.sum
(count)
Latency for batch_write_to_db service
litellm.batch_write_to_db.total_requests.count
(count)
Total requests for the batch_write_to_db service
litellm.deployment.cooled_down.count
(count)
Number of times a deployment has been cooled down by LiteLLM load balancing logic. exception_status is the status of the exception that caused the deployment to be cooled down.
litellm.deployment.failed_fallbacks.count
(count)
Number of failed fallback requests from primary model -> fallback model
litellm.deployment.failure_by_tag_responses.count
(count)
Total number of failed LLM API calls for a specific LLM deployment, by custom metadata tags
litellm.deployment.failure_responses.count
(count)
Total number of failed LLM API calls for a specific LLM deployment. exception_status is the status of the exception from the LLM API
litellm.deployment.latency_per_output_token.bucket
(count)
Latency per output token
litellm.deployment.latency_per_output_token.count
(count)
Latency per output token
litellm.deployment.latency_per_output_token.sum
(count)
Latency per output token
litellm.deployment.state
(gauge)
The state of the deployment: 0 = healthy, 1 = partial outage, 2 = complete outage
litellm.deployment.success_responses.count
(count)
Total number of successful LLM API calls via LiteLLM
litellm.deployment.successful_fallbacks.count
(count)
Number of successful fallback requests from primary model -> fallback model
litellm.deployment.total_requests.count
(count)
Total number of LLM API calls via LiteLLM (success + failure)
litellm.endpoint.healthy_count
(count)
Number of healthy endpoints
litellm.endpoint.info
(gauge)
LiteLLM Health Endpoint info metric that is tagged by endpoint_health, llm_model, custom_llm_provider, error_type
litellm.endpoint.unhealthy_count
(count)
Number of unhealthy endpoints
litellm.in_memory.daily_spend_update_queue.size
(gauge)
Size of the in-memory daily spend update queue
litellm.in_memory.spend_update_queue.size
(gauge)
Size of the in-memory spend update queue
litellm.input.tokens.count
(count)
Total number of input tokens from LLM requests
litellm.llm.api.failed_requests.metric.count
(count)
Deprecated: use litellm.proxy.failed_requests.metric.count. Total number of failed responses from the proxy (the client did not get a success response from the LiteLLM proxy)
litellm.llm.api.latency.metric.bucket
(count)
Total latency (seconds) for a model's LLM API call
litellm.llm.api.latency.metric.count
(count)
Total latency (seconds) for a model's LLM API call
litellm.llm.api.latency.metric.sum
(count)
Total latency (seconds) for a model's LLM API call
litellm.llm.api.time_to_first_token.metric.bucket
(count)
Time to first token for a model's LLM API call
litellm.llm.api.time_to_first_token.metric.count
(count)
Time to first token for a model's LLM API call
litellm.llm.api.time_to_first_token.metric.sum
(count)
Time to first token for a model's LLM API call
litellm.output.tokens.count
(count)
Total number of output tokens from LLM requests
litellm.overhead_latency.metric.bucket
(count)
Latency overhead (milliseconds) added by LiteLLM processing
litellm.overhead_latency.metric.count
(count)
Latency overhead (milliseconds) added by LiteLLM processing
litellm.overhead_latency.metric.sum
(count)
Latency overhead (milliseconds) added by LiteLLM processing
litellm.pod_lock_manager.size
(gauge)
Current size of the pod_lock_manager queue
litellm.postgres.failed_requests.count
(count)
Total failed requests for the postgres service
litellm.postgres.latency.bucket
(count)
Latency for postgres service
litellm.postgres.latency.count
(count)
Latency for postgres service
litellm.postgres.latency.sum
(count)
Latency for postgres service
litellm.postgres.total_requests.count
(count)
Total requests for the postgres service
litellm.process.uptime.seconds
(gauge)
Start time of the process, in seconds since the Unix epoch.
litellm.provider.remaining_budget.metric
(gauge)
Remaining budget for the provider; used when you set provider budget limits
litellm.proxy.failed_requests.metric.count
(count)
Total number of failed responses from the proxy (the client did not get a success response from the LiteLLM proxy)
litellm.proxy.pre_call.failed_requests.count
(count)
Total failed requests for the proxy_pre_call service
litellm.proxy.pre_call.latency.bucket
(count)
Latency for proxy_pre_call service
litellm.proxy.pre_call.latency.count
(count)
Latency for proxy_pre_call service
litellm.proxy.pre_call.latency.sum
(count)
Latency for proxy_pre_call service
litellm.proxy.pre_call.total_requests.count
(count)
Total requests for the proxy_pre_call service
litellm.proxy.total_requests.metric.count
(count)
Total number of requests made to the proxy server; tracks the number of client-side requests
litellm.redis.daily_spend_update_queue.size
(gauge)
Size of the Redis daily spend update queue
litellm.redis.daily_tag_spend_update_queue.failed_requests.count
(count)
Total failed requests for the redis_daily_tag_spend_update_queue service
litellm.redis.daily_tag_spend_update_queue.latency.bucket
(count)
Latency for redis_daily_tag_spend_update_queue service
litellm.redis.daily_tag_spend_update_queue.latency.count
(count)
Latency for redis_daily_tag_spend_update_queue service
litellm.redis.daily_tag_spend_update_queue.latency.sum
(count)
Latency for redis_daily_tag_spend_update_queue service
litellm.redis.daily_tag_spend_update_queue.total_requests.count
(count)
Total requests for the redis_daily_tag_spend_update_queue service
litellm.redis.daily_team_spend_update_queue.failed_requests.count
(count)
Total failed requests for the redis_daily_team_spend_update_queue service
litellm.redis.daily_team_spend_update_queue.latency.bucket
(count)
Latency for redis_daily_team_spend_update_queue service
litellm.redis.daily_team_spend_update_queue.latency.count
(count)
Latency for redis_daily_team_spend_update_queue service
litellm.redis.daily_team_spend_update_queue.latency.sum
(count)
Latency for redis_daily_team_spend_update_queue service
litellm.redis.daily_team_spend_update_queue.total_requests.count
(count)
Total requests for the redis_daily_team_spend_update_queue service
litellm.redis.failed_requests.count
(count)
Total failed requests for the redis service
litellm.redis.latency.bucket
(count)
Latency for redis service
litellm.redis.spend_update_queue.size
(gauge)
Size of the Redis spend update queue
litellm.redis.total_requests.count
(count)
Total requests for the redis service
litellm.remaining.api_key.budget.metric
(gauge)
Remaining budget for the API key
litellm.remaining.api_key.requests_for_model
(gauge)
Remaining requests the API key can make for the model (model-based RPM limit on the key)
litellm.remaining.api_key.tokens_for_model
(gauge)
Remaining tokens the API key can use for the model (model-based TPM limit on the key)
litellm.remaining.requests
(gauge)
Remaining requests for the model, as returned by the LLM API provider
litellm.remaining.team_budget.metric
(gauge)
Remaining budget for team
litellm.remaining_tokens
(gauge)
Remaining tokens for the model, as returned by the LLM API provider
litellm.request.total_latency.metric.bucket
(count)
Total latency (seconds) for a request to LiteLLM
litellm.request.total_latency.metric.count
(count)
Total latency (seconds) for a request to LiteLLM
litellm.request.total_latency.metric.sum
(count)
Total latency (seconds) for a request to LiteLLM
litellm.requests.metric.count
(count)
Deprecated: use litellm.proxy.total_requests.metric.count. Total number of LLM calls to LiteLLM; tracks totals per API key, team, and user
litellm.reset_budget_job.failed_requests.count
(count)
Total failed requests for the reset_budget_job service
litellm.reset_budget_job.latency.bucket
(count)
Latency for reset_budget_job service
litellm.reset_budget_job.total_requests.count
(count)
Total requests for the reset_budget_job service
litellm.router.failed_requests.count
(count)
Total failed requests for the router service
litellm.router.latency.bucket
(count)
Latency for router service
litellm.router.latency.count
(count)
Latency for router service
litellm.router.latency.sum
(count)
Latency for router service
litellm.router.total_requests.count
(count)
Total requests for the router service
litellm.self.failed_requests.count
(count)
Total failed requests for the self service
litellm.self.latency.bucket
(count)
Latency for self service
litellm.self.latency.count
(count)
Latency for self service
litellm.self.latency.sum
(count)
Latency for self service
litellm.self.total_requests.count
(count)
Total requests for the self service
litellm.spend.metric.count
(count)
Total spend on LLM requests
litellm.team.budget.remaining_hours.metric
(gauge)
Remaining hours until the team budget is reset
litellm.team.max_budget.metric
(gauge)
Maximum budget set for the team
litellm.total.tokens.count
(count)
Total number of input + output tokens from LLM requests

Events

The LiteLLM integration does not include any events.

Service Checks

litellm.openmetrics.health

Returns CRITICAL if the Agent is unable to connect to the LiteLLM OpenMetrics endpoint, otherwise returns OK.

Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.