Supported OS Linux Mac OS Windows

インテグレーションバージョン3.0.0

概要

このチェックは、Datadog Agent を通じて Cilium を監視します。このインテグレーションにより、cilium-agent または cilium-operator からメトリクスを収集できます。

セットアップ

ホストで実行されている Agent 用にこのチェックをインストールおよび構成する場合は、以下の手順に従ってください。コンテナ環境の場合は、オートディスカバリーのインテグレーションテンプレートのガイドを参照してこの手順を行ってください。

インストール

Cilium チェックは Datadog Agent パッケージに含まれていますが、Prometheus のメトリクスを公開するための追加のセットアップが必要です。

  1. cilium-agentcilium-operator の両方で Prometheus のメトリクスを有効にするには、Cilium のバージョンに応じて以下の Helm の値を設定した状態で Cilium をデプロイします。
    • Cilium < v1.8.x: global.prometheus.enabled=true
    • Cilium >= v1.8.x および < v1.9.x: global.prometheus.enabled=true および global.operatorPrometheus.enabled=true
    • Cilium >= 1.9.x: prometheus.enabled=true および operator.prometheus.enabled=true

または、別途 Kubernetes のマニフェストで Prometheus のメトリクスを有効にします。

Cilium <= v1.11 の場合は、--prometheus-serve-addr=:9090 を使用してください。
  • cilium-agent で、Cilium DaemonSet 構成の args セクションに --prometheus-serve-addr=:9962 を追加します。

    # [...]
    spec:
      containers:
        - args:
            - --prometheus-serve-addr=:9962
    
  • cilium-operator で、Cilium デプロイ構成の args セクションに --enable-metrics を追加します。

    # [...]
    spec:
      containers:
        - args:
            - --enable-metrics
    

構成

ホスト

ホストで実行中の Agent に対してこのチェックを構成するには

  1. Agent の構成ディレクトリのルートにある conf.d/ フォルダーの cilium.d/conf.yaml ファイルを編集し、Cilium のパフォーマンスデータを収集します。使用可能なすべての構成オプションについては、cilium.d/conf.yaml のサンプルを参照してください。

    • cilium-agent メトリクスを収集するには、agent_endpoint オプションを有効にします。
    • cilium-operator メトリクスを収集するには、operator_endpoint オプションを有効にします。
        instances:
    
            ## @param use_openmetrics - boolean - optional - default: false
            ## Use the latest OpenMetrics V2 implementation for more features and better performance.
            ##
            ## Note: To see the configuration options for the legacy OpenMetrics implementation (Agent 7.33 or older),
            ## https://github.com/DataDog/integrations-core/blob/7.33.x/cilium/datadog_checks/cilium/data/conf.yaml.example
            #
          - use_openmetrics: true # Enables OpenMetrics V2
    
            ## @param agent_endpoint - string - optional
            ## The URL where your application metrics are exposed by Prometheus.
            ## By default, the Cilium integration collects `cilium-agent` metrics.
            ## One of agent_endpoint or operator_endpoint must be provided.
            #
            agent_endpoint: http://localhost:9090/metrics
    
            ## @param operator_endpoint - string - optional
            ## Provide instead of `agent_endpoint` to collect `cilium-operator` metrics.
            ## Cilium operator metrics are exposed on port 6942.
            #
            operator_endpoint: http://localhost:6942/metrics
    

: デフォルトでは、conf.yaml.example の use_openmetrics オプションが有効になっています。OpenMetrics V1 の実装を使用する場合は、use_openmetrics 構成オプションを false に設定します。OpenMetrics V1 の構成パラメーターを見るには、conf.yaml.example ファイルを参照してください。

  1. Agent を再起動します
ログの収集

Cilium には cilium-agentcilium-operator の 2 種類のログがあります。

  1. Datadog Agent で、ログの収集はデフォルトで無効になっています。以下のように、DaemonSet 構成でこれを有効にします。

      # (...)
        env:
        #  (...)
          - name: DD_LOGS_ENABLED
              value: "true"
          - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
              value: "true"
      # (...)
    
  2. Datadog Agent への Docker ソケットをマニフェストでマウントするか、Docker を使用していない場合は、/var/log/pods ディレクトリをマウントします。マニフェストの例については、DaemonSet の Kubernetes インストール手順を参照してください。

  3. Agent を再起動します

コンテナ化

コンテナ環境の場合は、オートディスカバリーのインテグレーションテンプレートのガイドを参照して、次のパラメーターを適用してください。

Datadog Agent で、ログの収集はデフォルトで無効になっています。有効にする方法については、Kubernetes ログ収集を参照してください。

cilium-agent のメトリクスとログを収集するには
  • メトリクスの収集
パラメーター
<INTEGRATION_NAME>"cilium"
<INIT_CONFIG>空白または {}
<INSTANCE_CONFIG>{"agent_endpoint": "http://%%host%%:9090/metrics", "use_openmetrics": "true"}
  • ログの収集
パラメーター
<LOG_CONFIG>{"source": "cilium-agent", "service": "cilium-agent"}
cilium-operator のメトリクスとログを収集するには
  • メトリクスの収集
パラメーター
<INTEGRATION_NAME>"cilium"
<INIT_CONFIG>空白または {}
<INSTANCE_CONFIG>{"operator_endpoint": "http://%%host%%:6942/metrics", "use_openmetrics": "true"}
  • ログの収集
パラメーター
<LOG_CONFIG>{"source": "cilium-operator", "service": "cilium-operator"}

検証

Agent の status サブコマンドを実行し、Checks セクションで cilium を探します。

収集データ

メトリクス

cilium.agent.api_process_time.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processing time for all API calls
Shown as request
cilium.agent.api_process_time.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processing time for all API calls
Shown as second
cilium.agent.api_process_time.seconds.bucket
(count)
[OpenMetrics V2] Amount of processing time for all API calls
Shown as second
cilium.agent.bootstrap.seconds.count
(count)
[OpenMetrics V1 and V2] Count of bootstrap durations
cilium.agent.bootstrap.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of bootstrap durations
Shown as second
cilium.agent.bootstrap.seconds.bucket
(count)
[OpenMetrics V2] Sample of bootstrap durations
Shown as second
cilium.api_limiter.adjustment_factor
(gauge)
[OpenMetrics V1 and V2] Most recent adjustment factor for automatic adjustment
Shown as second
cilium.api_limiter.processed_requests.total
(count)
[OpenMetrics V1] Total number of API requests processed
Shown as request
cilium.api_limiter.processed_requests.count
(count)
[OpenMetrics V2] Total number of API requests processed
Shown as request
cilium.api_limiter.processing_duration.seconds
(gauge)
[OpenMetrics V1 and V2] Mean and estimated processing duration in seconds
Shown as second
cilium.api_limiter.rate_limit
(gauge)
[OpenMetrics V1 and V2] Current rate limiting configuration (limit and burst)
Shown as request
cilium.api_limiter.requests_in_flight
(gauge)
[OpenMetrics V1 and V2] Current and maximum allowed number of requests in flight.
Shown as request
cilium.api_limiter.wait_duration.seconds
(gauge)
[OpenMetrics V1 and V2] Wait duration aggregated per api call
Shown as second
cilium.bpf.map_ops.total
(count)
[OpenMetrics V1] Total BPF map operations performed
Shown as operation
cilium.bpf.map_ops.count
(count)
[OpenMetrics V2] Total BPF map operations performed
Shown as operation
cilium.bpf.map_pressure
(gauge)
[OpenMetrics V1 and V2] Map pressure defined as a ratio of the map usage compared to it
cilium.bpf.maps.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Max memory used by eBPF maps installed in the system
Shown as byte
cilium.bpf.progs.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Max memory used by eBPF programs installed in the system
Shown as byte
cilium.controllers.failing.count
(gauge)
[OpenMetrics V1 and V2] Number of failing controllers
Shown as error
cilium.controllers.runs_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of controller processes duration
Shown as operation
cilium.controllers.runs_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of controller processes duration
Shown as second
cilium.controllers.runs_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of controller processes duration
Shown as second
cilium.controllers.runs.total
(count)
[OpenMetrics V1] Total number of controller runs
Shown as event
cilium.controllers.runs.count
(count)
[OpenMetrics V2] Total number of controller runs
Shown as event
cilium.datapath.conntrack_gc.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of garbage collector process duration
Shown as operation
cilium.datapath.conntrack_gc.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of garbage collector process duration
Shown as second
cilium.datapath.conntrack_gc.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of garbage collector process duration
Shown as second
cilium.datapath.conntrack_gc.entries
(gauge)
[OpenMetrics V1 and V2] The number of alive and deleted conntrack entries
Shown as garbage collection
cilium.datapath.conntrack_gc.key_fallbacks.total
(count)
[OpenMetrics V1] The total number of conntrack entries
Shown as garbage collection
cilium.datapath.conntrack_gc.key_fallbacks.count
(count)
[OpenMetrics V2] The total number of conntrack entries.
Shown as garbage collection
cilium.datapath.conntrack_gc.runs.total
(count)
[OpenMetrics V1] Total number of the conntrack garbage collector process runs
Shown as garbage collection
cilium.datapath.conntrack_gc.runs.count
(count)
[OpenMetrics V2] Total number of the conntrack garbage collector process runs
Shown as garbage collection
cilium.datapath.conntrack_dump.resets.total
(count)
[OpenMetrics V1] Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in progress
cilium.datapath.conntrack_dump.resets.count
(count)
[OpenMetrics V2] Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in progress
cilium.drop_bytes.total
(count)
[OpenMetrics V1] Total dropped bytes
Shown as byte
cilium.drop_bytes.count
(count)
[OpenMetrics V2] Total dropped bytes
Shown as byte
cilium.drop_count.total
(count)
[OpenMetrics V1] Total dropped packets
Shown as packet
cilium.drop_count.count
(count)
[OpenMetrics V2] Total dropped packets
Shown as packet
cilium.endpoint.regeneration_time_stats.seconds.count
(count)
[OpenMetrics V1 and V2] Count of endpoint regeneration time stats
Shown as operation
cilium.endpoint.regeneration_time_stats.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of endpoint regeneration time stats
Shown as second
cilium.endpoint.regeneration_time_stats.seconds.bucket
(count)
[OpenMetrics V2] Sample of endpoint regeneration time stats
Shown as second
cilium.endpoint.regenerations.total
(count)
[OpenMetrics V1] Count of completed endpoint regenerations
Shown as unit
cilium.endpoint.regenerations.count
(count)
[OpenMetrics V2] Count of completed endpoint regenerations
Shown as unit
cilium.endpoint.state
(gauge)
[OpenMetrics V1 and V2] Count of all endpoints
Shown as unit
cilium.errors_warning.total
(count)
[OpenMetrics V1] Total error warnings
Shown as error
cilium.errors_warning.count
(count)
[OpenMetrics V2] Total error warnings
Shown as error
cilium.event_timestamp
(gauge)
[OpenMetrics V1 and V2] Last timestamp of event received
Shown as time
cilium.forward_bytes.total
(count)
[OpenMetrics V1] Total forwarded bytes
Shown as byte
cilium.forward_bytes.count
(count)
[OpenMetrics V2] Total forwarded bytes
Shown as byte
cilium.forward_count.total
(count)
[OpenMetrics V1] Total forwarded packets
Shown as packet
cilium.forward_count.count
(count)
[OpenMetrics V2] Total forwarded packets
Shown as packet
cilium.fqdn.gc_deletions.total
(count)
[OpenMetrics V1] Total number of FQDNs cleaned in FQDN garbage collector job.
Shown as event
cilium.fqdn.gc_deletions.count
(count)
[OpenMetrics V2] Total number of FQDNs cleaned in FQDN garbage collector job.
Shown as event
cilium.fqdn.active_names
(gauge)
Number of domains inside the DNS cache that have not expired (by TTL), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.active_ips
(gauge)
Number of IPs inside the DNS cache associated with a domain that has not expired (by TTL), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.alive_zombie_connections
(gauge)
Number of IPs associated with domains that have expired (by TTL) yet still associated with an active connection (aka zombie), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.ipcache.errors.total
(count)
[OpenMetrics V1] Number of errors interacting with the ipcache
Shown as error
cilium.ipcache.errors.count
(count)
[OpenMetrics V2] Number of errors interacting with the ipcache
Shown as error
cilium.ip_addresses.count
(gauge)
[OpenMetrics V1 and V2] Number of allocated ip_addresses
Shown as unit
cilium.ipam.events.total
(count)
[OpenMetrics V1] Number of IPAM events received by action and datapath family type
Shown as event
cilium.ipam.events.count
(count)
[OpenMetrics V2] Number of IPAM events received by action and datapath family type
Shown as event
cilium.k8s_client.api_latency_time.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processed API call duration
Shown as request
cilium.k8s_client.api_latency_time.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processed API call duration
Shown as second
cilium.k8s_client.api_latency_time.seconds.bucket
(count)
[OpenMetrics V2] Sample of processed API call duration
Shown as second
cilium.k8s_event.lag.seconds
(gauge)
[OpenMetrics V1 and v2] Lag for Kubernetes events - computed value between receiving a CNI ADD event from kubelet and a Pod event received from kube-api-server
Shown as second
cilium.k8s_terminating.endpoints_events.total
(count)
[OpenMetrics V1] Number of terminating endpoint events received from Kubernetes
Shown as event
cilium.k8s_terminating.endpoints_events.count
(count)
[OpenMetrics V2] Number of terminating endpoint events received from Kubernetes
Shown as event
cilium.kubernetes.events_received.total
(count)
[OpenMetrics V1] Number of Kubernetes received events processed
Shown as event
cilium.kubernetes.events_received.count
(count)
[OpenMetrics V2] Number of Kubernetes received events processed
Shown as event
cilium.kubernetes.events.total
(count)
[OpenMetrics V1] Number of Kubernetes events processed
Shown as event
cilium.kubernetes.events.count
(count)
[OpenMetrics V2] Number of Kubernetes events processed
Shown as event
cilium.nodes.all_datapath_validations.total
(count)
[OpenMetrics V1] Number of validation calls to implement the datapath implementation of a node
Shown as unit
cilium.nodes.all_datapath_validations.count
(count)
[OpenMetrics V2] Number of validation calls to implement the datapath implementation of a node
Shown as unit
cilium.nodes.all_events_received.total
(count)
[OpenMetrics V1] Number of node events received.
Shown as event
cilium.nodes.all_events_received.count
(count)
[OpenMetrics V2] Number of node events received
Shown as event
cilium.nodes.managed.total
(gauge)
[OpenMetrics V1 and V2] Number of nodes managed
Shown as node
cilium.endpoint.count
(gauge)
[OpenMetrics V1 and V2] Total ready endpoints managed by agent
Shown as unit
cilium.identity.count
(gauge)
[OpenMetrics V1 and V2] Number of identities allocate.
Shown as unit
cilium.policy.count
(gauge)
[OpenMetrics V1 and V2] Number of policies currently loaded
Shown as unit
cilium.policy.implementation_delay.bucket
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as second
cilium.policy.implementation_delay.count
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as unit
cilium.policy.implementation_delay.sum
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as second
cilium.policy.import_errors.count
(count)
[OpenMetrics V1 and V2] Number of failed policy imports
Shown as error
cilium.policy.endpoint_enforcement_status
(gauge)
[OpenMetrics V1 and V2] Number of endpoints labeled by policy enforcement status
Shown as unit
cilium.policy.max_revision
(gauge)
[OpenMetrics V1 and V2] Highest policy revision number in the agent
Shown as unit
cilium.policy.regeneration_time_stats.seconds.count
(count)
[OpenMetrics V1 and V2] Policy regeneration time stats count
Shown as operation
cilium.policy.regeneration_time_stats.seconds.sum
(count)
[OpenMetrics V1 and V2] Policy regeneration time stats count
Shown as second
cilium.policy.regeneration_time_stats.seconds.bucket
(count)
[OpenMetrics V2] Policy regeneration time stats sample
Shown as second
cilium.policy.regeneration.total
(count)
[OpenMetrics V1] Total number of successful policy regenerations
Shown as unit
cilium.policy.regeneration.count
(count)
[OpenMetrics V2] Total number of successful policy regenerations
Shown as unit
cilium.process.cpu.seconds.total
(gauge)
[OpenMetrics V1] Process CPU time in seconds
Shown as second
cilium.process.cpu.seconds.count
(count)
[OpenMetrics V2] Process CPU time in seconds
Shown as second
cilium.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Process file descriptor maximum
Shown as file
cilium.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors
Shown as file
cilium.process.resident_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Total resident memory bytes
Shown as byte
cilium.process.start_time.seconds
(gauge)
[OpenMetrics V1 and V2] Processes start time
Shown as second
cilium.process.virtual_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory bytes
Shown as byte
cilium.process.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Maximum virtual memory bytes
Shown as byte
cilium.subprocess.start.total
(count)
[OpenMetrics V1] Number of times that Cilium has started a subprocess
Shown as unit
cilium.subprocess.start.count
(count)
[OpenMetrics V2] Number of times that Cilium has started a subprocess
Shown as unit
cilium.triggers_policy.update_call_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of policy update trigger duration
Shown as operation
cilium.triggers_policy.update_call_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of policy update trigger duration
Shown as second
cilium.triggers_policy.update_call_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of policy update trigger duration
Shown as second
cilium.triggers_policy.update_folds
(gauge)
[OpenMetrics V1 and V2] Number of folds
Shown as unit
cilium.triggers_policy.update.total
(count)
[OpenMetrics V1] Total number of policy update trigger invocations
Shown as unit
cilium.triggers_policy.update.count
(count)
[OpenMetrics V2] Total number of policy update trigger invocations
Shown as unit
cilium.unreachable.health_endpoints
(gauge)
[OpenMetrics V1 and V2] Number of health endpoints that cannot be reached
Shown as unit
cilium.unreachable.nodes
(gauge)
[OpenMetrics V1 and V2] Number of nodes that cannot be reached
Shown as node
cilium.kvstore.operations_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Duration of kvstore operation count
Shown as operation
cilium.kvstore.operations_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Duration of kvstore operation sum
Shown as second
cilium.kvstore.operations_duration.seconds.bucket
(count)
[OpenMetrics V2] Duration of kvstore operation sample
Shown as second
cilium.kvstore.events_queue.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration in seconds of received event was blocked before it could be queued
cilium.kvstore.events_queue.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration in seconds received event was blocked before it could be queued
Shown as second
cilium.kvstore.events_queue.seconds.bucket
(count)
[OpenMetrics V2] Sum of duration in seconds received event was blocked before it could be queued
Shown as second
cilium.kvstore.quorum_errors.total
(count)
[OpenMetrics V1] Number of quorum errors
Shown as error
cilium.kvstore.quorum_errors.count
(count)
[OpenMetrics V2] Number of quorum errors
Shown as error
cilium.kvstore.sync_queue_size
(gauge)
Number of elements queued for synchronization in the kvstore
Shown as item
cilium.kvstore.initial_sync_completed
(gauge)
Whether the initial synchronization from/to the kvstore has completed
cilium.proxy.redirects
(gauge)
Number of redirects installed for endpoints by protocol
cilium.proxy.upstream_reply.seconds.count
(count)
[OpenMetrics V1 and V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.upstream_reply.seconds.sum
(count)
[OpenMetrics V1 and V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.upstream_reply.seconds.bucket
(count)
[OpenMetrics V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.datapath.update_timeout.total
(count)
[OpenMetrics V1] Number of total datapath update timeouts due to FQDN IP updates. Available in Cilium 1.10+
Shown as timeout
cilium.proxy.datapath.update_timeout.count
(count)
[OpenMetrics V2] Number of total datapath update timeouts due to FQDN IP updates. Available in Cilium 1.10+
Shown as timeout
cilium.policy.l7.total
(count)
[OpenMetrics V1] Number of total L7 requests/responses by type
Shown as unit
cilium.policy.l7.count
(count)
[OpenMetrics V2] Number of total L7 requests/responses by type
Shown as unit
cilium.policy.l7_denied.total
(count)
[OpenMetrics V1] Number of total L7 denied requests/responses due to policy. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_denied.count
(count)
[OpenMetrics V2] Number of total L7 denied requests/responses due to policy. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_forwarded.total
(count)
[OpenMetrics V1] Number of total L7 forwarded requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_forwarded.count
(count)
[OpenMetrics V2] Number of total L7 forwarded requests/response. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_parse_errors.total
(count)
[OpenMetrics V1] Number of total L7 parse errors. Available in Cilium <= v1.7
Shown as error
cilium.policy.l7_parse_errors.count
(count)
[OpenMetrics V2] Number of total L7 parse errors. Available in Cilium <= v1.7
Shown as error
cilium.policy.l7_received.total
(count)
[OpenMetrics V1] Number of total L7 received requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_received.count
(count)
[OpenMetrics V2] Number of total L7 received requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.datapath.errors.total
(count)
[OpenMetrics V1] Total number of errors in datapath management. Available in Cilium <= v1.9
Shown as error
cilium.datapath.errors.count
(count)
[OpenMetrics V2] Total number of errors in datapath management. Available in Cilium <= v1.9
Shown as error
cilium.k8s_client.api_calls.count
(count)
[OpenMetrics V1 and V2] Number of API calls made to kube-apiserver. Available in Cilium v1.10+
Shown as request
cilium.operator.process.cpu.seconds
(count)
[OpenMetrics V1] Total user and system CPU time spent in seconds
Shown as second
cilium.operator.process.cpu.seconds.count
(count)
[OpenMetrics V2] Total user and system CPU time spent in seconds
Shown as second
cilium.operator.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Maximum number of open file descriptors
Shown as file
cilium.operator.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors
Shown as file
cilium.operator.process.resident_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Resident memory size in bytes
Shown as byte
cilium.operator.process.start_time.seconds
(gauge)
[OpenMetrics V1 and V2] Start time of the process since unix epoch in seconds
Shown as second
cilium.operator.process.virtual_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory size in bytes
Shown as byte
cilium.operator.process.virtual_memory_max.bytes
(gauge)
[OpenMetrics V1 and V2] Maximum amount of virtual memory available in bytes
Shown as byte
cilium.operator.eni.deficit_resolver.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of deficit resolver trigger runs
Shown as operation
cilium.operator.eni.deficit_resolver.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of deficit resolver trigger runs
Shown as second
cilium.operator.eni.deficit_resolver.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of deficit resolver trigger runs
Shown as second
cilium.operator.eni.deficit_resolver.folds
(gauge)
[OpenMetrics V1 and V2] Current level of deficit resolver folding
Shown as unit
cilium.operator.eni.deficit_resolver.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of latency between deficit resolver queue and trigger run
Shown as operation
cilium.operator.eni.deficit_resolver.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of latency between deficit resolver queue and trigger run
Shown as second
cilium.operator.eni.deficit_resolver.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of latency between deficit resolver queue and trigger run
Shown as second
cilium.operator.eni.deficit_resolver.queued.total
(gauge)
[OpenMetrics V1] Number of queued deficit resolver triggers
Shown as event
cilium.operator.eni.deficit_resolver.queued.count
(count)
[OpenMetrics V2] Number of queued deficit resolver triggers
Shown as event
cilium.operator.eni.available
(gauge)
[OpenMetrics V2] Number of available IPs per subnet ID. Available in Cilium <= v1.8
Shown as unit
cilium.operator.eni.available.ips_per_subnet
(gauge)
[OpenMetrics V1 and V2] Number of available IPs per subnet ID. Available in Cilium <= v1.8
Shown as unit
cilium.operator.eni.aws_api_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as request
cilium.operator.eni.aws_api_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as second
cilium.operator.eni.aws_api_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as second
cilium.operator.eni_ec2.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as request
cilium.operator.eni_ec2.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni_ec2.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ec2_resync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of ec2 resync folding. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.ec2_resync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ec2_resync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.queued.total
(gauge)
[OpenMetrics V1] Number of queued ec2 resync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.ec2_resync.queued.count
(count)
[OpenMetrics V2] Number of queued ec2 resync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.interface_creation_ops
(count)
[OpenMetrics V1] Number of ENIs allocated. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.interface_creation_ops.count
(count)
[OpenMetrics V2] Number of ENIs allocated. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ips.total
(gauge)
[OpenMetrics V1 and V2] Number of IPs allocated. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.k8s_sync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.k8s_sync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of k8s sync folding. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.k8s_sync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.queued.total
(gauge)
[OpenMetrics V1] Number of queued k8s sync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.k8s_sync.queued.count
(count)
[OpenMetrics V2] Number of queued k8s sync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.nodes.total
(gauge)
[OpenMetrics V1] Number of nodes by category. Available in Cilium <= v1.9
Shown as node
cilium.operator.eni.resync.total
(count)
[OpenMetrics V1] Number of resync operations to synchronize AWS EC2 metadata. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.resync.count
(count)
[OpenMetrics V2] Number of resync operations to synchronize AWS EC2 metadata. Available in Cilium <= v1.9
Shown as unit
cilium.operator.ipam.ips
(gauge)
[OpenMetrics V1 and V2] Number of IPs allocated. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.allocation_ops
(count)
[OpenMetrics V1] Count of IP allocation operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.allocation_ops.count
(count)
[OpenMetrics V2] Count of IP allocation operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.release_ops
(count)
[OpenMetrics V1] Count of IP release operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.release_ops.count
(count)
[OpenMetrics V2] Count of IP release operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.interface_creation_ops
(count)
[OpenMetrics V1] Count of interfaces allocated. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.interface_creation_ops.count
(count)
[OpenMetrics V2] Count of interfaces allocated. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.available
(gauge)
[OpenMetrics V1 and V2] Number of interfaces with addresses available. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.available.ips_per_subnet
(gauge)
[OpenMetrics V1 and V2] Number of available IPs per subnet ID. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as request
cilium.operator.ipam.deficit_resolver.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.queued.count
(count)
[OpenMetrics V2] Number of queued triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.queued.total
(count)
[OpenMetrics V1] Number of queued triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.folds
(gauge)
[OpenMetrics V1 and V2] Current level of deficit resolver folding. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.deficit_resolver.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as request
cilium.operator.ipam.k8s_sync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.queued.total
(count)
[OpenMetrics V1] Number of queued k8s sync triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.queued.count
(count)
[OpenMetrics V2] Number of queued k8s sync triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of K8s sync folding. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.k8s_sync.latency.seconds.bucket
(count)
[OpenMetrics V2] Count of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.nodes
(gauge)
[OpenMetrics V1 and V2] Number of nodes by category. Available in Cilium v1.8+
Shown as node
cilium.operator.ipam.api.duration.seconds.sum
(count)
[OpenMetricsV1 and V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.duration.seconds.count
(count)
[OpenMetricsV1 and V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.api.duration.seconds.bucket
(count)
[OpenMetricsV2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.rate_limit.duration.seconds.sum
(count)
[OpenMetricsV1 and V2] Sum of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.rate_limit.duration.seconds.count
(count)
[OpenMetricsV1 and V2] Count of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of resync trigger runs. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of resync trigger runs. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.resync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of resync trigger runs. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.total
(count)
[OpenMetrics V1] Number of resync operations to synchronize and resolve IP deficit of nodes. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.resync.count
(count)
[OpenMetrics V2] Number of resync operations to synchronize and resolve IP deficit of nodes. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.resync.queued.total
(count)
[OpenMetrics V1] Number of IPAM queued triggers. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.queued.count
(count)
[OpenMetrics V2] Number of IPAM queued triggers. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of resync folding. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as operation
cilium.operator.ipam.resync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as request
cilium.operator.azure.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as request
cilium.operator.azure.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as request
cilium.operator.ec2.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as request
cilium.operator.ec2.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ces.queueing_delay.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as second
cilium.operator.ces.queueing_delay.seconds.count
(count)
[OpenMetrics V1 and V2] Count of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as unit
cilium.operator.ces.queueing_delay.seconds.bucket
(count)
[OpenMetrics V2] Sample of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as second
cilium.operator.ces.sync_errors.total
(count)
[OpenMetrics V1] Number of CES sync errors. Available in Cilium v1.11+
Shown as error
cilium.operator.ces.sync_errors.count
(count)
[OpenMetrics V2] Number of CES sync errors. Available in Cilium v1.11+
Shown as error
cilium.operator.identity_gc.entries
(gauge)
[OpenMetrics V1 and V2] The number of alive and deleted identities at the end of a garbage collector run. Available in Cilium v1.11+
Shown as garbage collection
cilium.operator.identity_gc.runs
(gauge)
[OpenMetrics V1 and V2] The number of times identity garbage collector has run. Available in Cilium v1.11+
Shown as garbage collection
cilium.operator.num_ceps_per_ces.sum
(count)
[OpenMetrics V1 and V2] Sum of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.num_ceps_per_ces.count
(count)
[OpenMetrics V1 and V2] Count of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.num_ceps_per_ces.bucket
(count)
[OpenMetrics V2] Sample of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.ipam.interface_candidates
(count)
[OpenMetrics V1] Number of attached interfaces with IPs available for allocation. Available in Cilium v1.13+
cilium.operator.ipam.interface_candidates.count
(count)
[OpenMetrics V2] Number of attached interfaces with IPs available for allocation. Available in Cilium v1.13+
cilium.operator.ipam.empty_interface_slots
(count)
[OpenMetrics V1] Number of empty interface slots available for interfaces to be attached. Available in Cilium v1.13+
cilium.operator.ipam.empty_interface_slots.count
(count)
[OpenMetrics V2] Number of empty interface slots available for interfaces to be attached. Available in Cilium v1.13+
cilium.operator.ipam.ip_allocation_ops
(count)
[OpenMetrics V1] Number of IP allocation operations. Available in Cilium v1.13+
cilium.operator.ipam.ip_allocation_ops.count
(count)
[OpenMetrics V2] Number of IP allocation operations. Available in Cilium v1.13+

イベント

Cilium インテグレーションには、イベントは含まれません。

サービスのチェック

cilium.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint. Returns OK otherwise.
Statuses: ok, critical

cilium.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the OpenMetrics endpoint, otherwise returns OK.
Statuses: ok, critical

トラブルシューティング

ご不明な点は、Datadog のサポートチームまでお問い合わせください。