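This check monitors the Datadog Cluster Agent through the Datadog Agent.
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
The Datadog Cluster Agent check is included in the Datadog Agent package. No additional installation is needed on your server.
In most scenarios, the Datadog Cluster Agent check configures itself automatically through Autodiscovery. The check runs in the Datadog Agent pod on the same node as the Cluster Agent pod. It does not run in the Cluster Agent itself.
If you need to configure the check further: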
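Edit the datadog_cluster_agent.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your datadog_cluster_agent performance data. See the sample datadog_cluster_agent.d/conf.yaml for all available configuration options.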
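As a small illustration of where that file lives, the sketch below joins the path pieces named above onto a configuration root. The default root `/etc/datadog-agent` is an assumption (the usual location on Linux hosts), and `cluster_agent_conf_path` is a hypothetical helper name, not part of the Agent.

```python
from pathlib import PurePosixPath

# Assumption: typical Agent configuration root on a Linux host.
DEFAULT_CONF_ROOT = "/etc/datadog-agent"


def cluster_agent_conf_path(conf_root: str = DEFAULT_CONF_ROOT) -> PurePosixPath:
    """Return <conf_root>/conf.d/datadog_cluster_agent.d/conf.yaml."""
    return PurePosixPath(conf_root) / "conf.d" / "datadog_cluster_agent.d" / "conf.yaml"


if __name__ == "__main__":
    print(cluster_agent_conf_path())
```

Pass a different root if your Agent keeps its configuration elsewhere.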
Run the Agent's status subcommand and look for datadog_cluster_agent under the Checks section.
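The validation step can be scripted. The following is a minimal sketch, assuming the Agent binary is available on the PATH as `datadog-agent`; in a container the command name may differ. The helper names are hypothetical, and the search is a plain substring match rather than a parser for the status output.

```python
import subprocess

CHECK_NAME = "datadog_cluster_agent"


def check_is_listed(status_output: str, check_name: str = CHECK_NAME) -> bool:
    """Return True if any line of the status output mentions the check."""
    return any(check_name in line for line in status_output.splitlines())


def agent_status_output(command=("datadog-agent", "status")) -> str:
    """Run the Agent status subcommand and return whatever it wrote to stdout."""
    result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    try:
        output = agent_status_output()
    except OSError:
        # The Agent binary is missing or not executable on this machine.
        output = ""
    print("listed" if check_is_listed(output) else "not listed")
```

Keeping the text search separate from the subprocess call makes it possible to test the matching logic against saved status output.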
datadog.cluster_agent.admission_webhooks.certificate_expiry (gauge) | Time left before the certificate expires. Shown as hour. |
datadog.cluster_agent.admission_webhooks.cws_exec_instrumentation_attempts.count (count) | CWS exec Instrumentation attempts count |
datadog.cluster_agent.admission_webhooks.cws_exec_instrumentation_attempts.sum (count) | CWS exec Instrumentation attempts sum |
datadog.cluster_agent.admission_webhooks.cws_pod_instrumentation_attempts.count (count) | CWS pod Instrumentation attempts count |
datadog.cluster_agent.admission_webhooks.cws_pod_instrumentation_attempts.sum (count) | CWS pod Instrumentation attempts sum |
datadog.cluster_agent.admission_webhooks.library_injection_attempts (count) | Number of library injection attempts by language |
datadog.cluster_agent.admission_webhooks.library_injection_errors (count) | Number of library injection failures by language |
datadog.cluster_agent.admission_webhooks.mutation_attempts (gauge) | Number of pod mutation attempts by mutation type |
datadog.cluster_agent.admission_webhooks.mutation_errors (gauge) | Number of mutation failures by mutation type |
datadog.cluster_agent.admission_webhooks.patcher.attempts (count) | Number of patch attempts |
datadog.cluster_agent.admission_webhooks.patcher.completed (count) | Number of completed patch attempts |
datadog.cluster_agent.admission_webhooks.patcher.errors (count) | Number of patch errors |
datadog.cluster_agent.admission_webhooks.rc_provider.configs (gauge) | Number of valid remote configurations |
datadog.cluster_agent.admission_webhooks.rc_provider.invalid_configs (gauge) | Number of invalid remote configurations |
datadog.cluster_agent.admission_webhooks.reconcile_errors (gauge) | Number of reconcile errors per controller |
datadog.cluster_agent.admission_webhooks.reconcile_success (gauge) | Number of reconcile successes per controller. Shown as success. |
datadog.cluster_agent.admission_webhooks.response_duration.count (count) | Webhook response duration count |
datadog.cluster_agent.admission_webhooks.response_duration.sum (count) | Webhook response duration sum. Shown as second. |
datadog.cluster_agent.admission_webhooks.validation_attempts (gauge) | Number of pod validation attempts by validation type |
datadog.cluster_agent.admission_webhooks.webhooks_received (gauge) | Number of webhook requests received |
datadog.cluster_agent.aggregator.flush (count) | Number of metrics/service checks/events flushed by (data_type, state) |
datadog.cluster_agent.aggregator.processed (count) | Number of metrics/service checks/events processed by the aggregator by data type |
datadog.cluster_agent.api_requests (count) | Requests made to the cluster agent API by (handler, status). Shown as request. |
datadog.cluster_agent.autodiscovery.errors (gauge) | Number of Autodiscovery errors |
datadog.cluster_agent.autodiscovery.poll_duration.count (count) | Autodiscovery poll duration count |
datadog.cluster_agent.autodiscovery.poll_duration.sum (count) | Autodiscovery poll duration sum. Shown as second. |
datadog.cluster_agent.autodiscovery.watched_resources (gauge) | Number of watched resources (Services and Endpoints) |
datadog.cluster_agent.cluster_checks.busyness (gauge) | Busyness of a node per the number of metrics submitted and average duration of all checks run |
datadog.cluster_agent.cluster_checks.configs_dangling (gauge) | Number of check configurations not dispatched |
datadog.cluster_agent.cluster_checks.configs_dispatched (gauge) | Number of check configurations dispatched by node |
datadog.cluster_agent.cluster_checks.configs_info (gauge) | Information about check configurations dispatched (node and check ID) |
datadog.cluster_agent.cluster_checks.failed_stats_collection (count) | Total number of unsuccessful stats collection attempts |
datadog.cluster_agent.cluster_checks.nodes_reporting (gauge) | Number of node agents reporting |
datadog.cluster_agent.cluster_checks.rebalancing_decisions (count) | Total number of check rebalancing decisions |
datadog.cluster_agent.cluster_checks.rebalancing_duration_seconds (gauge) | Duration of the last execution of the check rebalancing algorithm. Shown as second. |
datadog.cluster_agent.cluster_checks.successful_rebalancing_moves (count) | Total number of successful check rebalancing decisions. Shown as check. |
datadog.cluster_agent.cluster_checks.unscheduled_check (gauge) | Number of check configurations not scheduled |
datadog.cluster_agent.cluster_checks.updating_stats_duration_seconds (gauge) | Duration of collecting stats from check runners and updating cache. Shown as second. |
datadog.cluster_agent.datadog.rate_limit_queries.limit (gauge) | Maximum number of queries to the Datadog API allowed in the period by endpoint. Shown as query. |
datadog.cluster_agent.datadog.rate_limit_queries.period (gauge) | Period of rate limiting for the Datadog API by endpoint. Shown as second. |
datadog.cluster_agent.datadog.rate_limit_queries.remaining (gauge) | Number of queries to the Datadog API remaining before next reset by endpoint. Shown as query. |
datadog.cluster_agent.datadog.rate_limit_queries.remaining_min (gauge) | Minimum number of queries remaining before next reset observed during an expiration interval of 2*refresh period. Shown as query. |
datadog.cluster_agent.datadog.rate_limit_queries.reset (gauge) | Number of seconds before next reset applied to the Datadog API by endpoint. Shown as second. |
datadog.cluster_agent.datadog.requests (count) | Requests made to Datadog by status. Shown as request. |
datadog.cluster_agent.endpoint_checks.configs_dispatched (gauge) | Number of endpoint-check configurations dispatched by node |
datadog.cluster_agent.external_metrics (gauge) | Number of external metrics tagged |
datadog.cluster_agent.external_metrics.api_elapsed.count (count) | Count of API Requests received |
datadog.cluster_agent.external_metrics.api_elapsed.sum (count) | Count of API Requests received |
datadog.cluster_agent.external_metrics.api_requests (gauge) | Count of API Requests received |
datadog.cluster_agent.external_metrics.datadog_metrics (gauge) | The label valid is true if the DatadogMetric CR is valid, false otherwise |
datadog.cluster_agent.external_metrics.delay_seconds (gauge) | Freshness of the metric evaluated from querying Datadog. Shown as second. |
datadog.cluster_agent.external_metrics.processed_value (gauge) | Value processed from querying Datadog by metric |
datadog.cluster_agent.go.goroutines (gauge) | Number of goroutines that currently exist |
datadog.cluster_agent.go.memstats.alloc_bytes (gauge) | Number of bytes allocated and still in use. Shown as byte. |
datadog.cluster_agent.go.threads (gauge) | Number of OS threads created. Shown as thread. |
datadog.cluster_agent.kubernetes_apiserver.emitted_events (count) | Datadog events emitted by the kubernetes_apiserver check |
datadog.cluster_agent.kubernetes_apiserver.kube_events (count) | Kubernetes events processed by the kubernetes_apiserver check |
datadog.cluster_agent.language_detection_dca_handler.processed_requests (count) | The number of process language detection requests processed by the handler |
datadog.cluster_agent.language_detection_patcher.patches (count) | The number of patch requests sent by the patcher to the kube api server |
datadog.cluster_agent.secret_backend.elapsed (gauge) | The elapsed time of secret backend invocation. Shown as millisecond. |
datadog.cluster_agent.tagger.stored_entities (gauge) | Number of entities stored in the tagger |
datadog.cluster_agent.tagger.updated_entities (count) | Number of updates made to entities in the tagger |
datadog.cluster_agent.workloadmeta.events_received (count) | Number of events received by workloadmeta |
datadog.cluster_agent.workloadmeta.notifications_sent (count) | Number of notifications sent by workloadmeta to its subscribers |
datadog.cluster_agent.workloadmeta.stored_entities (gauge) | Number of entities stored in workloadmeta |
datadog.cluster_agent.workloadmeta.subscribers (gauge) | Number of workloadmeta subscribers |
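Several metrics above come in `.sum` and `.count` pairs, for example `admission_webhooks.response_duration.sum` (shown as second) and `admission_webhooks.response_duration.count`. One common way to read such a pair is to divide the sum by the count to get an average per observation. The sketch below assumes both values cover the same time window; the function name and the sample numbers are hypothetical.

```python
from typing import Optional


def average_duration(sum_delta: float, count_delta: float) -> Optional[float]:
    """Average seconds per observation over an interval.

    Returns None when nothing was observed, to avoid dividing by zero.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta


if __name__ == "__main__":
    # Hypothetical interval: 30 webhook responses taking 1.5 seconds in total.
    print(average_duration(1.5, 30))
```

Returning None for an empty interval keeps "no traffic" distinct from "zero latency" when the result is graphed or alerted on.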
The Datadog-Cluster-Agent integration does not include any events.
datadog.cluster_agent.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint. Returns OK otherwise.
Statuses: ok, critical
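The status rule for this service check can be illustrated with a short probe. This is a sketch of the mapping only, not the check's actual implementation. The `opener` parameter is a hypothetical injection point added so the logic can be exercised without a network, and treating any non-200 response as critical is an assumption.

```python
from urllib.request import urlopen

OK = "ok"
CRITICAL = "critical"


def health_status(url: str, opener=urlopen, timeout: float = 5.0) -> str:
    """Return "ok" if the metrics endpoint answers with HTTP 200, else "critical"."""
    try:
        with opener(url, timeout=timeout) as response:
            return OK if response.status == 200 else CRITICAL
    except (OSError, ValueError):
        # URLError and HTTPError are OSError subclasses; ValueError covers malformed URLs.
        return CRITICAL
```

Call `health_status` with the URL of the metrics endpoint your Cluster Agent exposes; the endpoint address depends on your deployment and is not specified here.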
Need help? Contact Datadog support.