This check monitors LiteLLM through the Datadog Agent. LiteLLM is an LLM gateway (proxy server) that exposes a unified API across many LLM providers; this integration scrapes the Prometheus-compatible OpenMetrics endpoint of the LiteLLM proxy to track request volume, latency, token usage, spend, and budget consumption.
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
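For example, on Kubernetes the check can be configured through Autodiscovery pod annotations. This is a minimal sketch, not a definitive template: the container name, port, and metrics path are assumptions to adapt to your deployment.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: litellm
  annotations:
    # Hypothetical values: the container name ("litellm") and the
    # endpoint port/path must match your LiteLLM proxy deployment.
    ad.datadoghq.com/litellm.checks: |
      {
        "litellm": {
          "init_config": {},
          "instances": [
            {"openmetrics_endpoint": "http://%%host%%:4000/metrics"}
          ]
        }
      }
spec:
  containers:
    - name: litellm
      image: <LITELLM_IMAGE>  # placeholder; use your LiteLLM proxy image
```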
The LiteLLM check is included in the Datadog Agent package. No additional installation is needed on your server.
Edit the litellm.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your LiteLLM performance data. See the sample litellm.d/conf.yaml for all available configuration options.
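A minimal host configuration might look like the following sketch. The endpoint is an assumption for illustration (the LiteLLM proxy commonly serves Prometheus metrics on port 4000 at /metrics); point it at your own deployment.

```yaml
init_config:

instances:
    ## Hypothetical endpoint: replace host and port with the address
    ## where your LiteLLM proxy exposes its OpenMetrics endpoint.
  - openmetrics_endpoint: http://localhost:4000/metrics
```

After editing the configuration, restart the Agent for the change to take effect.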
Run the Agent’s status subcommand and look for litellm under the Checks section.
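For example, on Linux:

```shell
sudo datadog-agent status
```

The exact invocation varies by platform. Once the check is running, it reports the metrics below from the LiteLLM OpenMetrics endpoint.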
litellm.api.key.budget.remaining_hours.metric (gauge) | Remaining hours until the API key budget is reset |
litellm.api.key.max_budget.metric (gauge) | Maximum budget set for the API key |
litellm.auth.failed_requests.count (count) | Total failed requests for the auth service |
litellm.auth.latency.bucket (count) | Latency for the auth service |
litellm.auth.latency.count (count) | Latency for the auth service |
litellm.auth.latency.sum (count) | Latency for the auth service |
litellm.auth.total_requests.count (count) | Total requests for the auth service |
litellm.batch_write_to_db.failed_requests.count (count) | Total failed requests for the batch_write_to_db service |
litellm.batch_write_to_db.latency.bucket (count) | Latency for the batch_write_to_db service |
litellm.batch_write_to_db.latency.count (count) | Latency for the batch_write_to_db service |
litellm.batch_write_to_db.latency.sum (count) | Latency for the batch_write_to_db service |
litellm.batch_write_to_db.total_requests.count (count) | Total requests for the batch_write_to_db service |
litellm.deployment.cooled_down.count (count) | Number of times a deployment has been cooled down by LiteLLM load balancing logic. exception_status is the status of the exception that caused the deployment to be cooled down |
litellm.deployment.failed_fallbacks.count (count) | Number of failed fallback requests from primary model -> fallback model |
litellm.deployment.failure_by_tag_responses.count (count) | Total number of failed LLM API calls for a specific LLM deployment by custom metadata tags |
litellm.deployment.failure_responses.count (count) | Total number of failed LLM API calls for a specific LLM deployment. exception_status is the status of the exception from the LLM API |
litellm.deployment.latency_per_output_token.bucket (count) | Latency per output token |
litellm.deployment.latency_per_output_token.count (count) | Latency per output token |
litellm.deployment.latency_per_output_token.sum (count) | Latency per output token |
litellm.deployment.state (gauge) | The state of the deployment: 0 = healthy, 1 = partial outage, 2 = complete outage |
litellm.deployment.success_responses.count (count) | Total number of successful LLM API calls via litellm |
litellm.deployment.successful_fallbacks.count (count) | Number of successful fallback requests from primary model -> fallback model |
litellm.deployment.total_requests.count (count) | Total number of LLM API calls via litellm - success + failure |
litellm.endpoint.healthy_count (count) | Number of healthy endpoints |
litellm.endpoint.info (gauge) | LiteLLM Health Endpoint info metric that is tagged by endpoint_health, llm_model, custom_llm_provider, error_type |
litellm.endpoint.unhealthy_count (count) | Number of unhealthy endpoints |
litellm.in_memory.daily_spend_update_queue.size (gauge) | Current size of the in_memory_daily_spend_update_queue |
litellm.in_memory.spend_update_queue.size (gauge) | Current size of the in_memory_spend_update_queue |
litellm.input.tokens.count (count) | Total number of input tokens from LLM requests |
litellm.llm.api.failed_requests.metric.count (count) | Deprecated - use litellm.proxy.failed_requests.metric. Total number of failed responses from proxy - the client did not get a success response from litellm proxy |
litellm.llm.api.latency.metric.bucket (count) | Total latency (seconds) for a model's LLM API call |
litellm.llm.api.latency.metric.count (count) | Total latency (seconds) for a model's LLM API call |
litellm.llm.api.latency.metric.sum (count) | Total latency (seconds) for a model's LLM API call |
litellm.llm.api.time_to_first_token.metric.bucket (count) | Time to first token for a model's LLM API call |
litellm.llm.api.time_to_first_token.metric.count (count) | Time to first token for a model's LLM API call |
litellm.llm.api.time_to_first_token.metric.sum (count) | Time to first token for a model's LLM API call |
litellm.output.tokens.count (count) | Total number of output tokens from LLM requests |
litellm.overhead_latency.metric.bucket (count) | Latency overhead (milliseconds) added by LiteLLM processing |
litellm.overhead_latency.metric.count (count) | Latency overhead (milliseconds) added by LiteLLM processing |
litellm.overhead_latency.metric.sum (count) | Latency overhead (milliseconds) added by LiteLLM processing |
litellm.pod_lock_manager.size (gauge) | Current size of the pod_lock_manager |
litellm.postgres.failed_requests.count (count) | Total failed requests for the postgres service |
litellm.postgres.latency.bucket (count) | Latency for the postgres service |
litellm.postgres.latency.count (count) | Latency for the postgres service |
litellm.postgres.latency.sum (count) | Latency for the postgres service |
litellm.postgres.total_requests.count (count) | Total requests for the postgres service |
litellm.process.uptime.seconds (gauge) | Start time of the process, in seconds since the Unix epoch |
litellm.provider.remaining_budget.metric (gauge) | Remaining budget for provider - used when you set provider budget limits |
litellm.proxy.failed_requests.metric.count (count) | Total number of failed responses from proxy - the client did not get a success response from litellm proxy |
litellm.proxy.pre_call.failed_requests.count (count) | Total failed requests for the proxy_pre_call service |
litellm.proxy.pre_call.latency.bucket (count) | Latency for the proxy_pre_call service |
litellm.proxy.pre_call.latency.count (count) | Latency for the proxy_pre_call service |
litellm.proxy.pre_call.latency.sum (count) | Latency for the proxy_pre_call service |
litellm.proxy.pre_call.total_requests.count (count) | Total requests for the proxy_pre_call service |
litellm.proxy.total_requests.metric.count (count) | Total number of requests made to the proxy server - tracks the number of client-side requests |
litellm.redis.daily_spend_update_queue.size (gauge) | Current size of the redis_daily_spend_update_queue |
litellm.redis.daily_tag_spend_update_queue.failed_requests.count (count) | Total failed requests for the redis_daily_tag_spend_update_queue service |
litellm.redis.daily_tag_spend_update_queue.latency.bucket (count) | Latency for the redis_daily_tag_spend_update_queue service |
litellm.redis.daily_tag_spend_update_queue.latency.count (count) | Latency for the redis_daily_tag_spend_update_queue service |
litellm.redis.daily_tag_spend_update_queue.latency.sum (count) | Latency for the redis_daily_tag_spend_update_queue service |
litellm.redis.daily_tag_spend_update_queue.total_requests.count (count) | Total requests for the redis_daily_tag_spend_update_queue service |
litellm.redis.daily_team_spend_update_queue.failed_requests.count (count) | Total failed requests for the redis_daily_team_spend_update_queue service |
litellm.redis.daily_team_spend_update_queue.latency.bucket (count) | Latency for the redis_daily_team_spend_update_queue service |
litellm.redis.daily_team_spend_update_queue.latency.count (count) | Latency for the redis_daily_team_spend_update_queue service |
litellm.redis.daily_team_spend_update_queue.latency.sum (count) | Latency for the redis_daily_team_spend_update_queue service |
litellm.redis.daily_team_spend_update_queue.total_requests.count (count) | Total requests for the redis_daily_team_spend_update_queue service |
litellm.redis.failed_requests.count (count) | Total failed requests for the redis service |
litellm.redis.latency.bucket (count) | Latency for the redis service |
litellm.redis.spend_update_queue.size (gauge) | Current size of the redis_spend_update_queue |
litellm.redis.total_requests.count (count) | Total requests for the redis service |
litellm.remaining.api_key.budget.metric (gauge) | Remaining budget for the API key |
litellm.remaining.api_key.requests_for_model (gauge) | Remaining requests the API key can make for the model (model-based RPM limit on the key) |
litellm.remaining.api_key.tokens_for_model (gauge) | Remaining tokens the API key can use for the model (model-based TPM limit on the key) |
litellm.remaining.requests (gauge) | Remaining requests for the model, as returned by the LLM API provider |
litellm.remaining.team_budget.metric (gauge) | Remaining budget for the team |
litellm.remaining_tokens (gauge) | Remaining tokens for the model, as returned by the LLM API provider |
litellm.request.total_latency.metric.bucket (count) | Total latency (seconds) for a request to LiteLLM |
litellm.request.total_latency.metric.count (count) | Total latency (seconds) for a request to LiteLLM |
litellm.request.total_latency.metric.sum (count) | Total latency (seconds) for a request to LiteLLM |
litellm.requests.metric.count (count) | Deprecated - use litellm.proxy.total_requests.metric.count. Total number of LLM calls to litellm, tracked per API key, team, and user |
litellm.reset_budget_job.failed_requests.count (count) | Total failed requests for the reset_budget_job service |
litellm.reset_budget_job.latency.bucket (count) | Latency for the reset_budget_job service |
litellm.reset_budget_job.total_requests.count (count) | Total requests for the reset_budget_job service |
litellm.router.failed_requests.count (count) | Total failed requests for the router service |
litellm.router.latency.bucket (count) | Latency for the router service |
litellm.router.latency.count (count) | Latency for the router service |
litellm.router.latency.sum (count) | Latency for the router service |
litellm.router.total_requests.count (count) | Total requests for the router service |
litellm.self.failed_requests.count (count) | Total failed requests for the self service |
litellm.self.latency.bucket (count) | Latency for the self service |
litellm.self.latency.count (count) | Latency for the self service |
litellm.self.latency.sum (count) | Latency for the self service |
litellm.self.total_requests.count (count) | Total requests for the self service |
litellm.spend.metric.count (count) | Total spend on LLM requests |
litellm.team.budget.remaining_hours.metric (gauge) | Remaining hours until the team budget is reset |
litellm.team.max_budget.metric (gauge) | Maximum budget set for the team |
litellm.total.tokens.count (count) | Total number of input + output tokens from LLM requests |
The LiteLLM integration does not include any events.
litellm.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the LiteLLM OpenMetrics endpoint, otherwise returns OK.
Statuses: ok, critical
Need help? Contact Datadog support.