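This check monitors the Kubernetes Scheduler, which is part of the Kubernetes control plane.
Note: This check does not collect data for Amazon EKS clusters, because those services are not exposed.
The Kubernetes Scheduler check is included in the Datadog Agent package, so no additional installation is needed on your server.
See the Autodiscovery Integration Templates for guidance on applying the parameters below.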
Edit the kube_scheduler.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your kube_scheduler performance data. See the sample kube_scheduler.d/conf.yaml for all available configuration options.
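As a rough illustration of what that file might contain, the sketch below defines a single check instance. The `prometheus_url` key and the endpoint shown are assumptions based on a scheduler that exposes its metrics over plain HTTP on port 10251. Newer Kubernetes versions generally serve metrics over HTTPS on a different port and may require authentication, so treat the sample file as the authority for the options that apply to your cluster.

```yaml
# kube_scheduler.d/conf.yaml (minimal sketch, not a complete configuration)
init_config:

instances:
    # Assumed endpoint: adjust the scheme, host, and port to wherever
    # your scheduler actually exposes its Prometheus metrics.
  - prometheus_url: http://localhost:10251/metrics
```

After editing the file, restart the Agent so that the new configuration is loaded.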
Log collection is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "kube_scheduler", "service": "<SERVICE_NAME>"} |
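One common way to supply the `<LOG_CONFIG>` value is through an Autodiscovery log annotation on the scheduler pod. The fragment below is a sketch under stated assumptions: the container name `kube-scheduler` is hypothetical and must match the container name in your own pod spec, and `<SERVICE_NAME>` is left as a placeholder for you to fill in.

```yaml
# Pod metadata fragment (sketch). "kube-scheduler" is an assumed
# container name; replace it with the name used in your pod spec.
metadata:
  annotations:
    ad.datadoghq.com/kube-scheduler.logs: '[{"source": "kube_scheduler", "service": "<SERVICE_NAME>"}]'
```

The annotation value is a JSON list, so the log configuration object from the table is wrapped in square brackets.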
Run the Agent's status subcommand and look for kube_scheduler under the Checks section.
kube_scheduler.binding_duration.count (gauge) | Number of binding latency observations |
kube_scheduler.binding_duration.sum (gauge) | Total binding latency in seconds |
kube_scheduler.cache.lookups (count) | Number of equivalence cache lookups, by whether or not a cache entry was found |
kube_scheduler.client.http.requests (count) | Number of HTTP requests, partitioned by status code, method, and host |
kube_scheduler.client.http.requests_duration.count (gauge) | Number of client requests. Broken down by verb and URL |
kube_scheduler.client.http.requests_duration.sum (gauge) | Total latency. Broken down by verb and URL |
kube_scheduler.gc_duration_seconds.count (gauge) | Number of GC invocations |
kube_scheduler.gc_duration_seconds.quantile (gauge) | Quantiles of GC invocation durations |
kube_scheduler.gc_duration_seconds.sum (gauge) | Sum of GC invocation durations |
kube_scheduler.goroutine_by_scheduling_operation (gauge) | Number of running goroutines split by the work they do such as binding (alpha; requires k8s v1.26+) |
kube_scheduler.goroutines (gauge) | Number of goroutines that currently exist |
kube_scheduler.max_fds (gauge) | Maximum allowed open file descriptors |
kube_scheduler.open_fds (gauge) | Number of open file descriptors |
kube_scheduler.pending_pods (gauge) | Number of pending pods, by the queue type (requires k8s v1.15+) |
kube_scheduler.pod_preemption.attempts (count) | Number of preemption attempts in the cluster so far |
kube_scheduler.pod_preemption.victims.count (gauge) | Number of selected pods during the latest preemption round |
kube_scheduler.pod_preemption.victims.sum (gauge) | Total selected pods during the latest preemption round |
kube_scheduler.queue.incoming_pods (count) | Number of pods added to scheduling queues by event and queue type (requires k8s v1.17+) |
kube_scheduler.schedule_attempts (gauge) | Number of attempts to schedule pods, by the result. 'unschedulable' means a pod could not be scheduled, while 'error' means an internal scheduler problem. |
kube_scheduler.scheduling.algorithm.predicate_duration.count (gauge) | Number of scheduling algorithm predicate evaluations |
kube_scheduler.scheduling.algorithm.predicate_duration.sum (gauge) | Total scheduling algorithm predicate evaluation duration |
kube_scheduler.scheduling.algorithm.preemption_duration.count (gauge) | Number of scheduling algorithm preemption evaluations |
kube_scheduler.scheduling.algorithm.preemption_duration.sum (gauge) | Total scheduling algorithm preemption evaluation duration |
kube_scheduler.scheduling.algorithm.priority_duration.count (gauge) | Number of scheduling algorithm priority evaluations |
kube_scheduler.scheduling.algorithm.priority_duration.sum (gauge) | Total scheduling algorithm priority evaluation duration |
kube_scheduler.scheduling.algorithm_duration.count (gauge) | Number of scheduling algorithm latency observations |
kube_scheduler.scheduling.algorithm_duration.sum (gauge) | Total scheduling algorithm latency |
kube_scheduler.scheduling.attempt_duration.count (gauge) | Scheduling attempt latency in seconds (scheduling algorithm + binding) (requires k8s v1.23+) |
kube_scheduler.scheduling.attempt_duration.sum (gauge) | Total scheduling attempt latency in seconds (scheduling algorithm + binding) (requires k8s v1.23+) |
kube_scheduler.scheduling.e2e_scheduling_duration.count (gauge) | Number of end-to-end scheduling latency observations (scheduling algorithm + binding) |
kube_scheduler.scheduling.e2e_scheduling_duration.sum (gauge) | Total end-to-end scheduling latency (scheduling algorithm + binding) |
kube_scheduler.scheduling.pod.scheduling_attempts.count (gauge) | Number of attempts to successfully schedule a pod (requires k8s v1.23+) |
kube_scheduler.scheduling.pod.scheduling_attempts.sum (gauge) | Total number of attempts to successfully schedule a pod (requires k8s v1.23+) |
kube_scheduler.scheduling.pod.scheduling_duration.count (gauge) | E2e latency for a pod being scheduled which may include multiple scheduling attempts (requires k8s v1.23+) |
kube_scheduler.scheduling.pod.scheduling_duration.sum (gauge) | Total e2e latency for a pod being scheduled which may include multiple scheduling attempts (requires k8s v1.23+) |
kube_scheduler.scheduling.scheduling_duration.count (gauge) | Number of scheduling split by sub-parts of the scheduling operation |
kube_scheduler.scheduling.scheduling_duration.quantile (gauge) | Scheduling latency quantiles split by sub-parts of the scheduling operation |
kube_scheduler.scheduling.scheduling_duration.sum (gauge) | Total scheduling latency split by sub-parts of the scheduling operation |
kube_scheduler.slis.kubernetes_healthcheck (gauge) | Result of a single scheduler healthcheck (alpha; requires k8s v1.26+) |
kube_scheduler.slis.kubernetes_healthcheck_total (count) | Cumulative results of all scheduler healthchecks (alpha; requires k8s v1.26+) |
kube_scheduler.threads (gauge) | Number of OS threads created |
kube_scheduler.volume_scheduling_duration.count (gauge) | Number of volume scheduling stage latency observations |
kube_scheduler.volume_scheduling_duration.sum (gauge) | Total volume scheduling stage latency |
Kube Scheduler does not include any events.
kube_scheduler.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint.
Statuses: ok, critical
kube_scheduler.leader_election.status
Returns CRITICAL if no replica is currently set as leader.
Statuses: ok, critical
kube_scheduler.up
Returns CRITICAL if Kube Scheduler is not healthy.
Statuses: ok, critical
Need help? Contact Datadog support.