Supported OS Linux Mac OS Windows

Integration version 3.2.0

Overview

This check monitors the Datadog Cluster Agent through the Datadog Agent.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates guide for how to apply these instructions.

Installation

The Datadog-Cluster-Agent check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

  1. To start collecting your datadog_cluster_agent performance data, edit the datadog_cluster_agent.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample datadog_cluster_agent.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
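
As a rough illustration of step 1, the following Python sketch renders a minimal conf.yaml body with a single check instance. The option name `prometheus_url`, the example URL, and the helper `render_conf` are assumptions made for this sketch and are not stated on this page; confirm the real option names against the sample datadog_cluster_agent.d/conf.yaml before using it.

```python
# Sketch only: build the text of a minimal datadog_cluster_agent.d/conf.yaml.
# ASSUMPTION: `prometheus_url` and the example URL are illustrative values,
# not taken from this page. Check the sample conf.yaml for the real options.


def render_conf(metrics_url: str) -> str:
    """Return YAML text declaring one check instance that scrapes metrics_url."""
    lines = [
        "init_config:",
        "",
        "instances:",
        f"  - prometheus_url: {metrics_url}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the file body; write it to conf.d/datadog_cluster_agent.d/conf.yaml
    # under the Agent's configuration directory, then restart the Agent.
    print(render_conf("http://localhost:5000/metrics"))
```

Generating the file from a small function like this keeps the URL in one place if the same configuration is rolled out to several hosts.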

Validation

Run the Agent's status subcommand and look for datadog_cluster_agent under the Checks section.
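
The validation step can be scripted. The sketch below captures the status output and searches it for the check name. It assumes the Agent binary is available on PATH as `datadog-agent` and that the current user may run it; the helper names `agent_status` and `check_is_listed` are hypothetical. It matches the name anywhere in the output rather than parsing the Checks section specifically.

```python
import subprocess


def check_is_listed(status_output: str, check_name: str = "datadog_cluster_agent") -> bool:
    """Return True if check_name appears on any line of the status text."""
    return any(check_name in line for line in status_output.splitlines())


def agent_status(command=("datadog-agent", "status")) -> str:
    """Run the Agent status subcommand and return its standard output.

    ASSUMPTION: `datadog-agent` is on PATH; adjust the command for your platform.
    """
    result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    return result.stdout


if __name__ == "__main__":
    found = check_is_listed(agent_status())
    print("datadog_cluster_agent listed:", found)
```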

Data Collected

Metrics

datadog.cluster_agent.admission_webhooks.certificate_expiry
(gauge)
Time left before the certificate expires
Shown as hour
datadog.cluster_agent.admission_webhooks.cws_exec_instrumentation_attempts.count
(count)
CWS exec Instrumentation attempts count
datadog.cluster_agent.admission_webhooks.cws_exec_instrumentation_attempts.sum
(count)
CWS exec Instrumentation attempts sum
datadog.cluster_agent.admission_webhooks.cws_pod_instrumentation_attempts.count
(count)
CWS pod Instrumentation attempts count
datadog.cluster_agent.admission_webhooks.cws_pod_instrumentation_attempts.sum
(count)
CWS pod Instrumentation attempts sum
datadog.cluster_agent.admission_webhooks.library_injection_attempts
(count)
Number of library injection attempts by language
datadog.cluster_agent.admission_webhooks.library_injection_errors
(count)
Number of library injection failures by language
datadog.cluster_agent.admission_webhooks.mutation_attempts
(gauge)
Number of pod mutation attempts by mutation type
datadog.cluster_agent.admission_webhooks.mutation_errors
(gauge)
Number of mutation failures by mutation type
datadog.cluster_agent.admission_webhooks.patcher.attempts
(count)
Number of patch attempts
datadog.cluster_agent.admission_webhooks.patcher.completed
(count)
Number of completed patch attempts
datadog.cluster_agent.admission_webhooks.patcher.errors
(count)
Number of patch errors
datadog.cluster_agent.admission_webhooks.rc_provider.configs
(gauge)
Number of valid remote configurations
datadog.cluster_agent.admission_webhooks.rc_provider.invalid_configs
(gauge)
Number of invalid remote configurations
datadog.cluster_agent.admission_webhooks.reconcile_errors
(gauge)
Number of reconcile errors per controller
datadog.cluster_agent.admission_webhooks.reconcile_success
(gauge)
Number of reconcile successes per controller
Shown as success
datadog.cluster_agent.admission_webhooks.response_duration.count
(count)
Webhook response duration count
datadog.cluster_agent.admission_webhooks.response_duration.sum
(count)
Webhook response duration sum
Shown as second
datadog.cluster_agent.admission_webhooks.webhooks_received
(gauge)
Number of mutation webhook requests received
datadog.cluster_agent.aggregator.flush
(count)
Number of metrics/service checks/events flushed by (data_type, state)
datadog.cluster_agent.aggregator.processed
(count)
Number of metrics/service checks/events processed by the aggregator by data type
datadog.cluster_agent.api_requests
(count)
Requests made to the cluster agent API by (handler, status)
Shown as request
datadog.cluster_agent.autodiscovery.errors
(gauge)
Number of Autodiscovery errors
datadog.cluster_agent.autodiscovery.poll_duration.count
(count)
Autodiscovery poll duration count
datadog.cluster_agent.autodiscovery.poll_duration.sum
(count)
Autodiscovery poll duration sum
Shown as second
datadog.cluster_agent.autodiscovery.watched_resources
(gauge)
Number of watched resources (Services and Endpoints)
datadog.cluster_agent.cluster_checks.busyness
(gauge)
Busyness of a node per the number of metrics submitted and average duration of all checks run
datadog.cluster_agent.cluster_checks.configs_dangling
(gauge)
Number of check configurations not dispatched
datadog.cluster_agent.cluster_checks.configs_dispatched
(gauge)
Number of check configurations dispatched by node
datadog.cluster_agent.cluster_checks.configs_info
(gauge)
Information about check configurations dispatched (node and check ID)
datadog.cluster_agent.cluster_checks.failed_stats_collection
(count)
Total number of unsuccessful stats collection attempts
datadog.cluster_agent.cluster_checks.nodes_reporting
(gauge)
Number of node agents reporting
datadog.cluster_agent.cluster_checks.rebalancing_decisions
(count)
Total number of check rebalancing decisions
datadog.cluster_agent.cluster_checks.rebalancing_duration_seconds
(gauge)
Duration of the last execution of the check rebalancing algorithm
Shown as second
datadog.cluster_agent.cluster_checks.successful_rebalancing_moves
(count)
Total number of successful check rebalancing decisions
Shown as check
datadog.cluster_agent.cluster_checks.updating_stats_duration_seconds
(gauge)
Duration of collecting stats from check runners and updating cache
Shown as second
datadog.cluster_agent.datadog.rate_limit_queries.limit
(gauge)
Maximum number of queries to the Datadog API allowed in the period by endpoint
Shown as query
datadog.cluster_agent.datadog.rate_limit_queries.period
(gauge)
Period of rate limiting for the Datadog API by endpoint
Shown as second
datadog.cluster_agent.datadog.rate_limit_queries.remaining
(gauge)
Number of queries to the Datadog API remaining before next reset by endpoint
Shown as query
datadog.cluster_agent.datadog.rate_limit_queries.remaining_min
(gauge)
Minimum number of queries remaining before next reset observed during an expiration interval of 2*refresh period
Shown as query
datadog.cluster_agent.datadog.rate_limit_queries.reset
(gauge)
Number of seconds before next reset applied to the Datadog API by endpoint
Shown as second
datadog.cluster_agent.datadog.requests
(count)
Requests made to Datadog by status
Shown as request
datadog.cluster_agent.endpoint_checks.configs_dispatched
(gauge)
Number of endpoint-check configurations dispatched by node
datadog.cluster_agent.external_metrics
(gauge)
Number of external metrics tagged
datadog.cluster_agent.external_metrics.api_elapsed.count
(count)
Count of API Requests received
datadog.cluster_agent.external_metrics.api_elapsed.sum
(count)
Count of API Requests received
datadog.cluster_agent.external_metrics.api_requests
(gauge)
Count of API Requests received
datadog.cluster_agent.external_metrics.datadog_metrics
(gauge)
The label valid is true if the DatadogMetric CR is valid, false otherwise
datadog.cluster_agent.external_metrics.delay_seconds
(gauge)
Freshness of the metric evaluated from querying Datadog
Shown as second
datadog.cluster_agent.external_metrics.processed_value
(gauge)
Value processed from querying Datadog by metric
datadog.cluster_agent.go.goroutines
(gauge)
Number of goroutines that currently exist
datadog.cluster_agent.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use
Shown as byte
datadog.cluster_agent.go.threads
(gauge)
Number of OS threads created
Shown as thread
datadog.cluster_agent.kubernetes_apiserver.emitted_events
(count)
Datadog events emitted by the kubernetes_apiserver check
datadog.cluster_agent.kubernetes_apiserver.kube_events
(count)
Kubernetes events processed by the kubernetes_apiserver check
datadog.cluster_agent.language_detection_dca_handler.processed_requests
(count)
The number of process language detection requests processed by the handler
datadog.cluster_agent.language_detection_patcher.patches
(count)
The number of patch requests sent by the patcher to the kube api server
datadog.cluster_agent.secret_backend.elapsed
(gauge)
The elapsed time of secret backend invocation
Shown as millisecond
datadog.cluster_agent.tagger.stored_entities
(gauge)
Number of entities stored in the tagger
datadog.cluster_agent.tagger.updated_entities
(count)
Number of updates made to entities in the tagger
datadog.cluster_agent.workloadmeta.events_received
(count)
Number of events received by workloadmeta
datadog.cluster_agent.workloadmeta.notifications_sent
(count)
Number of notifications sent by workloadmeta to its subscribers
datadog.cluster_agent.workloadmeta.stored_entities
(gauge)
Number of entities stored in workloadmeta
datadog.cluster_agent.workloadmeta.subscribers
(gauge)
Number of workloadmeta subscribers
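
Several metrics above come in `.sum` and `.count` pairs, for example datadog.cluster_agent.admission_webhooks.response_duration.sum and .count. Assuming each pair reports the total and the number of observations over the same interval (an interpretation suggested by the names, not confirmed on this page), dividing one by the other gives an average per observation. A minimal sketch, with the hypothetical helper `average_per_event`:

```python
from typing import Optional


def average_per_event(sum_delta: float, count_delta: float) -> Optional[float]:
    """Average value per observation over an interval.

    sum_delta:   increase of the `.sum` metric over the interval
    count_delta: increase of the `.count` metric over the same interval
    Returns None when nothing was observed, to avoid dividing by zero.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta


if __name__ == "__main__":
    # Example: 1.5 seconds of total webhook response time over 30 responses.
    print(average_per_event(1.5, 30))
```

For a metric shown as second, such as the webhook response duration, the result is the mean duration in seconds.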

Events

The Datadog_Cluster_Agent integration does not include any events.

Service Checks

datadog.cluster_agent.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint. Returns OK otherwise.
Statuses: ok, critical
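
The rule described for this service check can be illustrated with a short sketch: the status is critical when the metrics endpoint cannot be reached and ok otherwise. This is not the check's actual implementation; the function `health_status`, its `fetch` parameter, and the example URL are hypothetical and exist only to show the mapping.

```python
from urllib.error import URLError
from urllib.request import urlopen

OK = "ok"
CRITICAL = "critical"


def health_status(url: str, fetch=None, timeout: float = 5.0) -> str:
    """Return OK if the endpoint at url can be fetched, CRITICAL otherwise.

    fetch is an optional callable taking the URL; it exists so the logic can
    be exercised without a network. By default the URL is read with urlopen.
    """
    def default_fetch(target: str) -> None:
        with urlopen(target, timeout=timeout) as response:
            response.read()

    fetcher = fetch or default_fetch
    try:
        fetcher(url)
    except (URLError, OSError, ValueError):
        # Connection refused, timeout, HTTP error, or malformed URL.
        return CRITICAL
    return OK


if __name__ == "__main__":
    # ASSUMPTION: example URL only; use the endpoint configured for the check.
    print(health_status("http://localhost:5000/metrics"))
```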

Troubleshooting

Need help? Contact Datadog support.