This check monitors Kube_apiserver_metrics.
The Kube_apiserver_metrics check is included in the Datadog Agent package, so you do not need to install anything else on your server.
If your Kubernetes cluster has master nodes and runs a pod and container for the kube-apiserver image, the Datadog Agent automatically discovers that pod and configures the integration according to the kube_apiserver_metrics.d/auto_conf.yaml file.
However, if you use a managed Kubernetes distribution such as GKE, EKS, or AKS, there may be no running kube-apiserver pod for the Agent to discover.
In that case, you can set up the integration against the kubernetes service in the default namespace.
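The main use case for running the kube_apiserver_metrics check is as a Cluster Level Check. The corresponding Autodiscovery template parameters are:
Parameter | Value |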
---|---|
<INTEGRATION_NAME> | ["kube_apiserver_metrics"] |
<INIT_CONFIG> | [{}] |
<INSTANCE_CONFIG> | [{"prometheus_url": "https://%%host%%:%%port%%/metrics"}] |
You can review all available configuration options in kube_apiserver_metrics.yaml.
You can annotate the kubernetes service in your default namespace with either of the following.
With a single checks annotation:
ad.datadoghq.com/endpoints.checks: |
  {
    "kube_apiserver_metrics": {
      "instances": [
        {
          "prometheus_url": "https://%%host%%:%%port%%/metrics"
        }
      ]
    }
  }
Or with separate check_names, init_configs, and instances annotations:
annotations:
  ad.datadoghq.com/endpoints.check_names: '["kube_apiserver_metrics"]'
  ad.datadoghq.com/endpoints.init_configs: '[{}]'
  ad.datadoghq.com/endpoints.instances:
    '[{ "prometheus_url": "https://%%host%%:%%port%%/metrics"}]'
The Datadog Cluster Agent then schedules the check for each endpoint onto the Datadog Agents.
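One way to attach the annotation to the existing service is a patch file. The following is a minimal sketch, not a prescribed method: the file name is hypothetical, and it assumes a kubectl version that supports `kubectl patch --patch-file`.

```yaml
# annotation-patch.yaml (hypothetical file name)
# Assumed usage, adjust to your environment:
#   kubectl patch service kubernetes -n default --patch-file annotation-patch.yaml
# Only metadata.annotations is listed, so the rest of the Service is left unchanged.
metadata:
  annotations:
    ad.datadoghq.com/endpoints.checks: |
      {
        "kube_apiserver_metrics": {
          "instances": [
            {
              "prometheus_url": "https://%%host%%:%%port%%/metrics"
            }
          ]
        }
      }
```

The `%%host%%` and `%%port%%` template variables are left as-is; they are resolved per endpoint when the check is scheduled.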
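You can also run the check by configuring the endpoints directly in the kube_apiserver_metrics.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, so that it is dispatched as a Cluster Check.
Note: If you use a local file or a ConfigMap to configure Cluster Checks, you must add cluster_check: true to the configuration file.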
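As an illustration of the ConfigMap variant, the sketch below wraps the check configuration in a ConfigMap. The ConfigMap name and namespace are hypothetical, and the volume mount that places the file under the Cluster Agent's conf.d/ directory is assumed and not shown.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-apiserver-metrics-check  # hypothetical name
  namespace: datadog                  # hypothetical; use the namespace of your Cluster Agent
data:
  # The key becomes the file name once the ConfigMap is mounted into conf.d/.
  kube_apiserver_metrics.yaml: |-
    # Required when a local file or ConfigMap defines a Cluster Check.
    cluster_check: true
    advanced_ad_identifiers:
      - kube_endpoints:
          name: "kubernetes"
          namespace: "default"
    init_config:
    instances:
      - prometheus_url: "https://%%host%%:%%port%%/metrics"
```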
Provide a configuration to your Cluster Agent to set up the Cluster Check.
With the Helm chart (values file):
clusterAgent:
  confd:
    kube_apiserver_metrics.yaml: |-
      advanced_ad_identifiers:
        - kube_endpoints:
            name: "kubernetes"
            namespace: "default"
      cluster_check: true
      init_config:
      instances:
        - prometheus_url: "https://%%host%%:%%port%%/metrics"
With the Datadog Operator (DatadogAgent resource):
spec:
  #(...)
  override:
    clusterAgent:
      extraConfd:
        configDataMap:
          kube_apiserver_metrics.yaml: |-
            advanced_ad_identifiers:
              - kube_endpoints:
                  name: "kubernetes"
                  namespace: "default"
            cluster_check: true
            init_config:
            instances:
              - prometheus_url: "https://%%host%%:%%port%%/metrics"
These configurations trigger the Agent to send requests to the kubernetes service in the default namespace, at its defined endpoint IP addresses and defined port.
Run the Agent's status subcommand and look for kube_apiserver_metrics under the Checks section.
Metric (type) | Description |
---|---|
kube_apiserver.APIServiceRegistrationController_depth (gauge) | The current depth of workqueue: APIServiceRegistrationController |
kube_apiserver.admission_controller_admission_duration_seconds.count (count) | The admission controller latency histogram in seconds identified by name and broken out for each operation and API resource and type (validate or admit) count |
kube_apiserver.admission_controller_admission_duration_seconds.sum (gauge) | The admission controller latency histogram in seconds identified by name and broken out for each operation and API resource and type (validate or admit) Shown as second |
kube_apiserver.admission_step_admission_latencies_seconds.count (count) | The admission sub-step latency histogram broken out for each operation and API resource and step type (validate or admit) count |
kube_apiserver.admission_step_admission_latencies_seconds.sum (gauge) | The admission sub-step latency broken out for each operation and API resource and step type (validate or admit) Shown as second |
kube_apiserver.admission_step_admission_latencies_seconds_summary.count (count) | The admission sub-step latency summary broken out for each operation and API resource and step type (validate or admit) count |
kube_apiserver.admission_step_admission_latencies_seconds_summary.quantile (gauge) | The admission sub-step latency summary broken out for each operation and API resource and step type (validate or admit) quantile Shown as second |
kube_apiserver.admission_step_admission_latencies_seconds_summary.sum (gauge) | The admission sub-step latency summary broken out for each operation and API resource and step type (validate or admit) Shown as second |
kube_apiserver.admission_webhook_admission_latencies_seconds.count (count) | The admission webhook latency identified by name and broken out for each operation and API resource and type (validate or admit) count |
kube_apiserver.admission_webhook_admission_latencies_seconds.sum (gauge) | The admission webhook latency identified by name and broken out for each operation and API resource and type (validate or admit) Shown as second |
kube_apiserver.aggregator_unavailable_apiservice (gauge) | Gauge of APIServices which are marked as unavailable broken down by APIService name (alpha; Kubernetes 1.14+) |
kube_apiserver.apiserver_admission_webhook_fail_open_count (gauge) | Admission webhook fail open count, identified by name and broken out for each admission type (validating or mutating). |
kube_apiserver.apiserver_admission_webhook_fail_open_count.count (count) | Admission webhook fail open count, identified by name and broken out for each admission type (validating or mutating). |
kube_apiserver.apiserver_admission_webhook_request_total (gauge) | Admission webhook request total, identified by name and broken out for each admission type (alpha; Kubernetes 1.23+) |
kube_apiserver.apiserver_admission_webhook_request_total.count (count) | Admission webhook request total, identified by name and broken out for each admission type (alpha; Kubernetes 1.23+) |
kube_apiserver.apiserver_dropped_requests_total (gauge) | The accumulated number of requests dropped with 'Try again later' response Shown as request |
kube_apiserver.apiserver_dropped_requests_total.count (count) | The monotonic count of requests dropped with 'Try again later' response Shown as request |
kube_apiserver.apiserver_request_count (gauge) | The accumulated number of apiserver requests broken out for each verb API resource client and HTTP response contentType and code (deprecated in Kubernetes 1.15) Shown as request |
kube_apiserver.apiserver_request_count.count (count) | The monotonic count of apiserver requests broken out for each verb API resource client and HTTP response contentType and code (deprecated in Kubernetes 1.15) Shown as request |
kube_apiserver.apiserver_request_terminations_total.count (count) | The number of requests the apiserver terminated in self-defense (Kubernetes 1.17+) Shown as request |
kube_apiserver.apiserver_request_total (gauge) | The accumulated number of apiserver requests broken out for each verb API resource client and HTTP response contentType and code (Kubernetes 1.15+; replaces apiserver_request_count) Shown as request |
kube_apiserver.apiserver_request_total.count (count) | The monotonic count of apiserver requests broken out for each verb API resource client and HTTP response contentType and code (Kubernetes 1.15+; replaces apiserver_request_count.count) Shown as request |
kube_apiserver.audit_event (gauge) | The accumulated number of audit events generated and sent to the audit backend Shown as event |
kube_apiserver.audit_event.count (count) | The monotonic count of audit events generated and sent to the audit backend Shown as event |
kube_apiserver.authenticated_user_requests (gauge) | The accumulated number of authenticated requests broken out by username Shown as request |
kube_apiserver.authenticated_user_requests.count (count) | The monotonic count of authenticated requests broken out by username Shown as request |
kube_apiserver.authentication_attempts.count (count) | The counter of authenticated attempts (Kubernetes 1.16+) Shown as request |
kube_apiserver.authentication_duration_seconds.count (count) | The authentication duration histogram broken out by result (Kubernetes 1.17+) |
kube_apiserver.authentication_duration_seconds.sum (gauge) | The authentication duration histogram broken out by result (Kubernetes 1.17+) Shown as second |
kube_apiserver.current_inflight_requests (gauge) | The maximal number of currently used inflight request limit of this apiserver per request kind in last second. |
kube_apiserver.envelope_encryption_dek_cache_fill_percent (gauge) | Percent of the cache slots currently occupied by cached DEKs. |
kube_apiserver.etcd.db.total_size (gauge) | The total size of the etcd database file physically allocated in bytes (alpha; Kubernetes 1.19+) Shown as byte |
kube_apiserver.etcd_object_counts (gauge) | The number of stored objects at the time of last check split by kind (alpha; deprecated in Kubernetes 1.22) Shown as object |
kube_apiserver.etcd_request_duration_seconds.count (count) | Etcd request latencies count for each operation and object type (alpha) |
kube_apiserver.etcd_request_duration_seconds.sum (gauge) | Etcd request latencies for each operation and object type (alpha) Shown as second |
kube_apiserver.etcd_request_errors_total (count) | Etcd failed request counts for each operation and object type Shown as request |
kube_apiserver.etcd_requests_total (count) | Etcd request counts for each operation and object type Shown as request |
kube_apiserver.flowcontrol_current_executing_requests (gauge) | Number of requests in initial (for a WATCH) or any (for a non-WATCH) execution stage in the API Priority and Fairness subsystem |
kube_apiserver.flowcontrol_current_inqueue_requests (count) | Number of requests currently pending in queues of the API Priority and Fairness subsystem |
kube_apiserver.flowcontrol_dispatched_requests_total (count) | Number of requests executed by API Priority and Fairness subsystem |
kube_apiserver.flowcontrol_rejected_requests_total.count (count) | Number of requests rejected by API Priority and Fairness subsystem |
kube_apiserver.flowcontrol_request_concurrency_limit (gauge) | Shared concurrency limit in the API Priority and Fairness subsystem |
kube_apiserver.go_goroutines (gauge) | The number of goroutines that currently exist |
kube_apiserver.go_threads (gauge) | The number of OS threads created Shown as thread |
kube_apiserver.grpc_client_handled_total (count) | The total number of RPCs completed by the client regardless of success or failure Shown as request |
kube_apiserver.grpc_client_msg_received_total (count) | The total number of gRPC stream messages received by the client Shown as message |
kube_apiserver.grpc_client_msg_sent_total (count) | The total number of gRPC stream messages sent by the client Shown as message |
kube_apiserver.grpc_client_started_total (count) | The total number of RPCs started on the client Shown as request |
kube_apiserver.http_requests_total (gauge) | The accumulated number of HTTP requests made Shown as request |
kube_apiserver.http_requests_total.count (count) | The monotonic count of the number of HTTP requests made Shown as request |
kube_apiserver.kubernetes_feature_enabled (gauge) | Whether a Kubernetes feature gate is enabled or not, identified by name and stage (alpha; Kubernetes 1.26+) |
kube_apiserver.longrunning_gauge (gauge) | The gauge of all active long-running apiserver requests broken out by verb, group, version, resource, scope, and component. Not all requests are tracked this way. Shown as request |
kube_apiserver.process_cpu_total (count) | Total user and system CPU time spent in seconds. Shown as second |
kube_apiserver.process_resident_memory_bytes (gauge) | The resident memory size in bytes Shown as byte |
kube_apiserver.process_virtual_memory_bytes (gauge) | The virtual memory size in bytes Shown as byte |
kube_apiserver.registered_watchers (gauge) | The number of currently registered watchers for a given resource Shown as object |
kube_apiserver.request_duration_seconds.count (count) | The response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope, and component count |
kube_apiserver.request_duration_seconds.sum (gauge) | The response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope, and component Shown as second |
kube_apiserver.request_latencies.count (count) | The response latency distribution in microseconds for each verb, resource, and subresource count |
kube_apiserver.request_latencies.sum (gauge) | The response latency distribution in microseconds for each verb, resource and subresource Shown as microsecond |
kube_apiserver.requested_deprecated_apis (gauge) | Gauge of deprecated APIs that have been requested, broken out by API group, version, resource, subresource, and removed_release Shown as request |
kube_apiserver.rest_client_request_latency_seconds.count (count) | The request latency in seconds broken down by verb and URL count |
kube_apiserver.rest_client_request_latency_seconds.sum (gauge) | The request latency in seconds broken down by verb and URL Shown as second |
kube_apiserver.rest_client_requests_total (gauge) | The accumulated number of HTTP requests partitioned by status code method and host Shown as request |
kube_apiserver.rest_client_requests_total.count (count) | The monotonic count of HTTP requests partitioned by status code method and host Shown as request |
kube_apiserver.slis.kubernetes_healthcheck (gauge) | Result of a single kubernetes apiserver healthcheck (alpha; requires k8s v1.26+) |
kube_apiserver.slis.kubernetes_healthcheck_total (count) | The monotonic count of all kubernetes apiserver healthchecks (alpha; requires k8s v1.26+) |
kube_apiserver.storage_list_evaluated_objects_total (gauge) | The number of objects tested in the course of serving a LIST request from storage (alpha; Kubernetes 1.23+) Shown as object |
kube_apiserver.storage_list_fetched_objects_total (gauge) | The number of objects read from storage in the course of serving a LIST request (alpha; Kubernetes 1.23+) Shown as object |
kube_apiserver.storage_list_returned_objects_total (gauge) | The number of objects returned for a LIST request from storage (alpha; Kubernetes 1.23+) Shown as object |
kube_apiserver.storage_list_total (gauge) | The number of LIST requests served from storage (alpha; Kubernetes 1.23+) Shown as object |
kube_apiserver.storage_objects (gauge) | The number of stored objects at the time of last check split by kind (Kubernetes 1.21+; replaces etcd_object_counts) Shown as object |
kube_apiserver.watch_events_sizes.count (count) | The watch event size distribution (Kubernetes 1.16+) |
kube_apiserver.watch_events_sizes.sum (gauge) | The watch event size distribution (Kubernetes 1.16+) Shown as byte |
Kube_apiserver_metrics does not include any service checks.
Kube_apiserver_metrics does not include any events.
Need help? Contact Datadog support.