Cilium

Supported OS: Linux, Mac OS, Windows

Integration version: 5.0.0

Overview

This check monitors Cilium through the Datadog Agent. The integration can collect metrics from either the cilium-agent or the cilium-operator.

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

The Cilium check is included in the Datadog Agent package, but it requires additional setup steps to expose Prometheus metrics.

Starting with version 1.10.0, this OpenMetrics-based integration has a latest mode (use_openmetrics: true) and a legacy mode (use_openmetrics: false). To get all the most up-to-date features, Datadog recommends enabling the latest mode. For more information, see Latest and Legacy Versioning For OpenMetrics-based Integrations.

  1. To enable Prometheus metrics in both the cilium-agent and cilium-operator, deploy Cilium with the following Helm values set according to your version of Cilium:
    • Cilium < v1.8.x: global.prometheus.enabled=true
    • Cilium >= v1.8.x and < v1.9.x: global.prometheus.enabled=true and global.operatorPrometheus.enabled=true
    • Cilium >= 1.9.x: prometheus.enabled=true and operator.prometheus.enabled=true
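
As an illustration, the Helm settings listed above for Cilium >= 1.9.x could be kept in a values file rather than passed as individual flags. This is a minimal sketch: the file name values.yaml is an assumption, and only the two keys named above are shown.

```yaml
# values.yaml (hypothetical file name) for Cilium >= 1.9.x
prometheus:
  enabled: true        # same as prometheus.enabled=true
operator:
  prometheus:
    enabled: true      # same as operator.prometheus.enabled=true
```

Older Cilium versions use the global.* keys listed above instead, so this layout would not apply to them as written.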

Or, separately enable Prometheus metrics in the Kubernetes manifests:

  • In the cilium-agent, add --prometheus-serve-addr=:9962 to the args section of the Cilium DaemonSet config (for Cilium <= v1.11, use --prometheus-serve-addr=:9090 instead):

    # [...]
    spec:
      containers:
        - args:
            - --prometheus-serve-addr=:9962
    
  • In the cilium-operator, add --enable-metrics to the args section of the Cilium Deployment config:

    # [...]
    spec:
      containers:
        - args:
            - --enable-metrics
    

Configuration

Host

To configure this check for an Agent running on a host:

  1. Edit the cilium.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your Cilium performance data. See the sample cilium.d/conf.yaml for all available configuration options.

    • To collect cilium-agent metrics, enable the agent_endpoint option.
    • To collect cilium-operator metrics, enable the operator_endpoint option.
        instances:
    
            ## @param use_openmetrics - boolean - optional - default: false
            ## Use the latest OpenMetrics implementation for more features and better performance.
            ##
            ## Note: To see the configuration options for the legacy OpenMetrics implementation (Agent 7.33 or older),
            ## see https://github.com/DataDog/integrations-core/blob/7.33.x/cilium/datadog_checks/cilium/data/conf.yaml.example
            #
          - use_openmetrics: true # Enables OpenMetrics latest mode
    
            ## @param agent_endpoint - string - optional
            ## The URL where your application metrics are exposed by Prometheus.
            ## By default, the Cilium integration collects `cilium-agent` metrics.
            ## One of agent_endpoint or operator_endpoint must be provided.
            #
            agent_endpoint: http://localhost:9090/metrics
    
            ## @param operator_endpoint - string - optional
            ## Provide instead of `agent_endpoint` to collect `cilium-operator` metrics.
            ## Cilium operator metrics are exposed on port 6942.
            #
            operator_endpoint: http://localhost:6942/metrics
    
  2. Restart the Agent.
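
The parameter comments above say that operator_endpoint is provided instead of agent_endpoint, which suggests one endpoint per instance. A minimal sketch of collecting both sets of metrics, assuming two separate instances suit your setup and that the ports match your Cilium deployment:

```yaml
# cilium.d/conf.yaml fragment (sketch): one instance per endpoint
instances:
  # Instance 1: cilium-agent metrics
  - use_openmetrics: true
    agent_endpoint: http://localhost:9090/metrics
  # Instance 2: cilium-operator metrics
  - use_openmetrics: true
    operator_endpoint: http://localhost:6942/metrics
```

The endpoint URLs are the ones from the sample above; adjust them if your cilium-agent exposes metrics on a different port.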

Log collection

Cilium produces two types of logs: cilium-agent logs and cilium-operator logs.

  1. Collecting logs is disabled by default in the Datadog Agent. Enable it in your DaemonSet configuration:

      # (...)
        env:
          # (...)
          - name: DD_LOGS_ENABLED
            value: "true"
          - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
            value: "true"
      # (...)
    
  2. Mount the Docker socket to the Datadog Agent through the manifest, or mount the /var/log/pods directory if you are not using Docker. For example manifests, see the Kubernetes Installation instructions for DaemonSet.

  3. Restart the Agent.
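
For the non-Docker case in step 2, mounting /var/log/pods could look roughly like the fragment below. This is a sketch using standard Kubernetes hostPath volume fields; the container name and volume name are hypothetical and should be adapted to your Datadog Agent DaemonSet.

```yaml
# Fragment of the Datadog Agent DaemonSet pod spec (spec.template.spec)
containers:
  - name: agent              # hypothetical container name
    volumeMounts:
      - name: logpodpath     # hypothetical volume name
        mountPath: /var/log/pods
        readOnly: true
volumes:
  - name: logpodpath
    hostPath:
      path: /var/log/pods
```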

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.

To collect cilium-agent metrics and logs:

  • Metric collection

    Parameter             Value
    <INTEGRATION_NAME>    "cilium"
    <INIT_CONFIG>         blank or {}
    <INSTANCE_CONFIG>     {"agent_endpoint": "http://%%host%%:9090/metrics", "use_openmetrics": "true"}

  • Log collection

    Parameter       Value
    <LOG_CONFIG>    {"source": "cilium-agent", "service": "cilium-agent"}

To collect cilium-operator metrics and logs:

  • Metric collection

    Parameter             Value
    <INTEGRATION_NAME>    "cilium"
    <INIT_CONFIG>         blank or {}
    <INSTANCE_CONFIG>     {"operator_endpoint": "http://%%host%%:6942/metrics", "use_openmetrics": "true"}

  • Log collection

    Parameter       Value
    <LOG_CONFIG>    {"source": "cilium-operator", "service": "cilium-operator"}
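
One common way to apply these parameters in Kubernetes is through Autodiscovery pod annotations. The sketch below assumes the annotation-based template format and a container named cilium-agent; the container name is an assumption and must match the container in your Cilium DaemonSet.

```yaml
# Pod template metadata for the Cilium DaemonSet (sketch)
metadata:
  annotations:
    # "cilium-agent" in each key is the assumed container name
    ad.datadoghq.com/cilium-agent.check_names: '["cilium"]'
    ad.datadoghq.com/cilium-agent.init_configs: '[{}]'
    ad.datadoghq.com/cilium-agent.instances: '[{"agent_endpoint": "http://%%host%%:9090/metrics", "use_openmetrics": "true"}]'
    ad.datadoghq.com/cilium-agent.logs: '[{"source": "cilium-agent", "service": "cilium-agent"}]'
```

For the cilium-operator, the same pattern would apply with the operator container name, the operator_endpoint instance config, and the cilium-operator log config shown above.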

Validation

Run the Agent’s status subcommand and look for cilium under the Checks section.

Data Collected

Metrics

cilium.
(gauge)
[OpenMetrics V1 and V2] How many seconds has the longest running processor for workqueue been running.
Shown as event
cilium.agent.api_process_time.seconds.bucket
(count)
[OpenMetrics V2] Amount of processing time for all API calls
Shown as second
cilium.agent.api_process_time.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processing time for all API calls
Shown as request
cilium.agent.api_process_time.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processing time for all API calls
Shown as second
cilium.agent.bootstrap.seconds.bucket
(count)
[OpenMetrics V2] Sample of bootstrap durations
Shown as second
cilium.agent.bootstrap.seconds.count
(count)
[OpenMetrics V1 and V2] Count of bootstrap durations
cilium.agent.bootstrap.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of bootstrap durations
Shown as second
cilium.api_limiter.adjustment_factor
(gauge)
[OpenMetrics V1 and V2] Most recent adjustment factor for automatic adjustment
Shown as second
cilium.api_limiter.processed_requests.count
(count)
[OpenMetrics V2] Total number of API requests processed
Shown as request
cilium.api_limiter.processed_requests.total
(count)
[OpenMetrics V1] Total number of API requests processed
Shown as request
cilium.api_limiter.processing_duration.seconds
(gauge)
[OpenMetrics V1 and V2] Mean and estimated processing duration in seconds
Shown as second
cilium.api_limiter.rate_limit
(gauge)
[OpenMetrics V1 and V2] Current rate limiting configuration (limit and burst)
Shown as request
cilium.api_limiter.requests_in_flight
(gauge)
[OpenMetrics V1 and V2] Current and maximum allowed number of requests in flight.
Shown as request
cilium.api_limiter.wait_duration.seconds
(gauge)
[OpenMetrics V1 and V2] Wait duration aggregated per api call
Shown as second
cilium.bpf.map.capacity
(gauge)
[OpenMetrics V1 and V2] Capacity of map tagged by map group. All maps with a capacity of 65536 are grouped under 'default'
cilium.bpf.map_ops.count
(count)
[OpenMetrics V2] Total BPF map operations performed
Shown as operation
cilium.bpf.map_ops.total
(count)
[OpenMetrics V1] Total BPF map operations performed
Shown as operation
cilium.bpf.map_pressure
(gauge)
[OpenMetrics V1 and V2] Map pressure defined as a ratio of the map usage compared to its size
cilium.bpf.maps.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Max memory used by eBPF maps installed in the system
Shown as byte
cilium.bpf.progs.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Max memory used by eBPF programs installed in the system
Shown as byte
cilium.cidrgroup.policies
(gauge)
[OpenMetrics V1 and V2] Number of CNPs and CCNPs referencing at least one CiliumCIDRGroup
Shown as unit
cilium.cidrgroup.translation.time.stats.seconds
(gauge)
[OpenMetrics V1 and V2] CIDRGroup translation time stats
Shown as second
cilium.cidrgroups.referenced
(gauge)
[OpenMetrics V1 and V2] Number of CNPs and CCNPs referencing at least one CiliumCIDRGroup. CNPs with empty or non-existing CIDRGroupRefs are not considered
Shown as unit
cilium.controllers.failing.count
(gauge)
[OpenMetrics V1 and V2] Number of failing controllers
Shown as error
cilium.controllers.runs.count
(count)
[OpenMetrics V2] Total number of controller runs
Shown as event
cilium.controllers.runs.total
(count)
[OpenMetrics V1] Total number of controller runs
Shown as event
cilium.controllers.runs_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of controller processes duration
Shown as second
cilium.controllers.runs_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of controller processes duration
Shown as operation
cilium.controllers.runs_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of controller processes duration
Shown as second
cilium.datapath.conntrack_dump.resets.count
(count)
[OpenMetrics V2] Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in progress
cilium.datapath.conntrack_dump.resets.total
(count)
[OpenMetrics V1] Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in progress
cilium.datapath.conntrack_gc.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of garbage collector process duration
Shown as second
cilium.datapath.conntrack_gc.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of garbage collector process duration
Shown as operation
cilium.datapath.conntrack_gc.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of garbage collector process duration
Shown as second
cilium.datapath.conntrack_gc.entries
(gauge)
[OpenMetrics V1 and V2] The number of alive and deleted conntrack entries
Shown as garbage collection
cilium.datapath.conntrack_gc.key_fallbacks.count
(count)
[OpenMetrics V2] The total number of conntrack entries.
Shown as garbage collection
cilium.datapath.conntrack_gc.key_fallbacks.total
(count)
[OpenMetrics V1] The total number of conntrack entries
Shown as garbage collection
cilium.datapath.conntrack_gc.runs.count
(count)
[OpenMetrics V2] Total number of the conntrack garbage collector process runs
Shown as garbage collection
cilium.datapath.conntrack_gc.runs.total
(count)
[OpenMetrics V1] Total number of the conntrack garbage collector process runs
Shown as garbage collection
cilium.datapath.errors.count
(count)
[OpenMetrics V2] Total number of errors in datapath management. Available in Cilium <= v1.9
Shown as error
cilium.datapath.errors.total
(count)
[OpenMetrics V1] Total number of errors in datapath management. Available in Cilium <= v1.9
Shown as error
cilium.drop_bytes.count
(count)
[OpenMetrics V2] Total dropped bytes
Shown as byte
cilium.drop_bytes.total
(count)
[OpenMetrics V1] Total dropped bytes
Shown as byte
cilium.drop_count.count
(count)
[OpenMetrics V2] Total dropped packets
Shown as packet
cilium.drop_count.total
(count)
[OpenMetrics V1] Total dropped packets
Shown as packet
cilium.endpoint.count
(gauge)
[OpenMetrics V1 and V2] Total ready endpoints managed by agent
Shown as unit
cilium.endpoint.max_ifindex
(gauge)
[OpenMetrics V1 and V2] Maximum interface index observed for existing endpoints
Shown as unit
cilium.endpoint.regeneration_time_stats.seconds.bucket
(count)
[OpenMetrics V2] Sample of endpoint regeneration time stats
Shown as second
cilium.endpoint.regeneration_time_stats.seconds.count
(count)
[OpenMetrics V1 and V2] Count of endpoint regeneration time stats
Shown as operation
cilium.endpoint.regeneration_time_stats.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of endpoint regeneration time stats
Shown as second
cilium.endpoint.regenerations.count
(count)
[OpenMetrics V2] Count of completed endpoint regenerations
Shown as unit
cilium.endpoint.regenerations.total
(count)
[OpenMetrics V1] Count of completed endpoint regenerations
Shown as unit
cilium.endpoint.state
(gauge)
[OpenMetrics V1 and V2] Count of all endpoints
Shown as unit
cilium.errors_warning.count
(count)
[OpenMetrics V2] Total error warnings
Shown as error
cilium.errors_warning.total
(count)
[OpenMetrics V1] Total error warnings
Shown as error
cilium.event_timestamp
(gauge)
[OpenMetrics V1 and V2] Last timestamp of event received
Shown as time
cilium.forward_bytes.count
(count)
[OpenMetrics V2] Total forwarded bytes
Shown as byte
cilium.forward_bytes.total
(count)
[OpenMetrics V1] Total forwarded bytes
Shown as byte
cilium.forward_count.count
(count)
[OpenMetrics V2] Total forwarded packets
Shown as packet
cilium.forward_count.total
(count)
[OpenMetrics V1] Total forwarded packets
Shown as packet
cilium.fqdn.active_ips
(gauge)
Number of IPs inside the DNS cache associated with a domain that has not expired (by TTL), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.active_names
(gauge)
Number of domains inside the DNS cache that have not expired (by TTL), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.alive_zombie_connections
(gauge)
Number of IPs associated with domains that have expired (by TTL) yet still associated with an active connection (aka zombie), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.gc_deletions.count
(count)
[OpenMetrics V2] Total number of FQDNs cleaned in FQDN garbage collector job.
Shown as event
cilium.fqdn.gc_deletions.total
(count)
[OpenMetrics V1] Total number of FQDNs cleaned in FQDN garbage collector job.
Shown as event
cilium.hive.status
(gauge)
[OpenMetrics V1 and V2] Counts of health status levels of Hive components
Shown as item
cilium.identity.count
(gauge)
[OpenMetrics V1 and V2] Number of identities allocated.
Shown as unit
cilium.ip_addresses.count
(gauge)
[OpenMetrics V1 and V2] Number of allocated ip_addresses
Shown as unit
cilium.ipam.capacity
(gauge)
[OpenMetrics V1 and V2] Total number of IPs in the IPAM pool labeled by family
Shown as event
cilium.ipam.events.count
(count)
[OpenMetrics V2] Number of IPAM events received by action and datapath family type
Shown as event
cilium.ipam.events.total
(count)
[OpenMetrics V1] Number of IPAM events received by action and datapath family type
Shown as event
cilium.ipcache.errors.count
(count)
[OpenMetrics V2] Number of errors interacting with the ipcache
Shown as error
cilium.ipcache.errors.total
(count)
[OpenMetrics V1] Number of errors interacting with the ipcache
Shown as error
cilium.k8s.workqueue.adds.total
(count)
[OpenMetrics V1 and V2] Total number of adds handled by workqueue
Shown as event
cilium.k8s.workqueue.depth
(gauge)
[OpenMetrics V1 and V2] Current depth of workqueue
Shown as event
cilium.k8s.workqueue.longest.running.processor.seconds
(gauge)
[OpenMetrics V1 and V2] How many seconds has the longest running processor for workqueue been running
Shown as event
cilium.k8s.workqueue.queue.duration.seconds.bucket
(count)
[OpenMetrics V1 and V2] Count of how long in seconds an item stays in workqueue before being requested
Shown as event
cilium.k8s.workqueue.queue.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of how long in seconds an item stays in workqueue before being requested
Shown as event
cilium.k8s.workqueue.queue.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of how long in seconds an item stays in workqueue before being requested
Shown as event
cilium.k8s.workqueue.retries.total
(count)
[OpenMetrics V1 and V2] Total number of retries handled by workqueue
Shown as event
cilium.k8s.workqueue.unfinished.work.seconds
(gauge)
[OpenMetrics V1 and V2] How many seconds of work has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases.
Shown as event
cilium.k8s_client.api_calls.count
(count)
[OpenMetrics V1 and V2] Number of API calls made to kube-apiserver. Available in Cilium v1.10+
Shown as request
cilium.k8s_client.api_latency_time.seconds.bucket
(count)
[OpenMetrics V2] Sample of processed API call duration
Shown as second
cilium.k8s_client.api_latency_time.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processed API call duration
Shown as request
cilium.k8s_client.api_latency_time.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processed API call duration
Shown as second
cilium.k8s_client.rate_limiter_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of processed rate limiter call duration
Shown as second
cilium.k8s_client.rate_limiter_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processed rate limiter call duration
Shown as request
cilium.k8s_client.rate_limiter_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processed rate limiter call duration
Shown as second
cilium.k8s_event.lag.seconds
(gauge)
[OpenMetrics V1 and V2] Lag for Kubernetes events, computed as the time between receiving a CNI ADD event from kubelet and a Pod event received from kube-api-server
Shown as second
cilium.k8s_terminating.endpoints_events.count
(count)
[OpenMetrics V2] Number of terminating endpoint events received from Kubernetes
Shown as event
cilium.k8s_terminating.endpoints_events.total
(count)
[OpenMetrics V1] Number of terminating endpoint events received from Kubernetes
Shown as event
cilium.kubernetes.events.count
(count)
[OpenMetrics V2] Number of Kubernetes events processed
Shown as event
cilium.kubernetes.events.total
(count)
[OpenMetrics V1] Number of Kubernetes events processed
Shown as event
cilium.kubernetes.events_received.count
(count)
[OpenMetrics V2] Number of Kubernetes received events processed
Shown as event
cilium.kubernetes.events_received.total
(count)
[OpenMetrics V1] Number of Kubernetes received events processed
Shown as event
cilium.kvstore.events_queue.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration in seconds that a received event was blocked before it could be queued
Shown as second
cilium.kvstore.events_queue.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration in seconds that a received event was blocked before it could be queued
cilium.kvstore.events_queue.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration in seconds that a received event was blocked before it could be queued
Shown as second
cilium.kvstore.initial_sync_completed
(gauge)
Whether the initial synchronization from/to the kvstore has completed
cilium.kvstore.operations_duration.seconds.bucket
(count)
[OpenMetrics V2] Duration of kvstore operation sample
Shown as second
cilium.kvstore.operations_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Duration of kvstore operation count
Shown as operation
cilium.kvstore.operations_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Duration of kvstore operation sum
Shown as second
cilium.kvstore.quorum_errors.count
(count)
[OpenMetrics V2] Number of quorum errors
Shown as error
cilium.kvstore.quorum_errors.total
(count)
[OpenMetrics V1] Number of quorum errors
Shown as error
cilium.kvstore.sync_queue_size
(gauge)
Number of elements queued for synchronization in the kvstore
Shown as item
cilium.nodes.all_datapath_validations.count
(count)
[OpenMetrics V2] Number of validation calls to implement the datapath implementation of a node
Shown as unit
cilium.nodes.all_datapath_validations.total
(count)
[OpenMetrics V1] Number of validation calls to implement the datapath implementation of a node
Shown as unit
cilium.nodes.all_events_received.count
(count)
[OpenMetrics V2] Number of node events received
Shown as event
cilium.nodes.all_events_received.total
(count)
[OpenMetrics V1] Number of node events received.
Shown as event
cilium.nodes.managed.total
(gauge)
[OpenMetrics V1 and V2] Number of nodes managed
Shown as node
cilium.operator.azure.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as request
cilium.operator.azure.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as request
cilium.operator.azure.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.ces.queueing_delay.seconds.bucket
(count)
[OpenMetrics V2] Sample of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as second
cilium.operator.ces.queueing_delay.seconds.count
(count)
[OpenMetrics V1 and V2] Count of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as unit
cilium.operator.ces.queueing_delay.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as second
cilium.operator.ces.sync.total
(count)
[OpenMetrics V1 and V2] Number of completed CES syncs by outcome
Shown as unit
cilium.operator.ces.sync_errors.count
(count)
[OpenMetrics V2] Number of CES sync errors. Available in Cilium v1.11+
Shown as error
cilium.operator.ces.sync_errors.total
(count)
[OpenMetrics V1] Number of CES sync errors. Available in Cilium v1.11+
Shown as error
cilium.operator.ec2.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as request
cilium.operator.ec2.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as request
cilium.operator.ec2.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.eni.available
(gauge)
[OpenMetrics V2] Number of available IPs per subnet ID. Available in Cilium <= v1.8
Shown as unit
cilium.operator.eni.available.ips_per_subnet
(gauge)
[OpenMetrics V1 and V2] Number of available IPs per subnet ID. Available in Cilium <= v1.8
Shown as unit
cilium.operator.eni.aws_api_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as second
cilium.operator.eni.aws_api_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as request
cilium.operator.eni.aws_api_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as second
cilium.operator.eni.deficit_resolver.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of deficit resolver trigger runs
Shown as second
cilium.operator.eni.deficit_resolver.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of deficit resolver trigger runs
Shown as operation
cilium.operator.eni.deficit_resolver.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of deficit resolver trigger runs
Shown as second
cilium.operator.eni.deficit_resolver.folds
(gauge)
[OpenMetrics V1 and V2] Current level of deficit resolver folding
Shown as unit
cilium.operator.eni.deficit_resolver.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of latency between deficit resolver queue and trigger run
Shown as second
cilium.operator.eni.deficit_resolver.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of latency between deficit resolver queue and trigger run
Shown as operation
cilium.operator.eni.deficit_resolver.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of latency between deficit resolver queue and trigger run
Shown as second
cilium.operator.eni.deficit_resolver.queued.count
(count)
[OpenMetrics V2] Number of queued deficit resolver triggers
Shown as event
cilium.operator.eni.deficit_resolver.queued.total
(gauge)
[OpenMetrics V1] Number of queued deficit resolver triggers
Shown as event
cilium.operator.eni.ec2_resync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ec2_resync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of ec2 resync folding. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.ec2_resync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ec2_resync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.queued.count
(count)
[OpenMetrics V2] Number of queued ec2 resync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.ec2_resync.queued.total
(gauge)
[OpenMetrics V1] Number of queued ec2 resync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.interface_creation_ops
(count)
[OpenMetrics V1] Number of ENIs allocated. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.interface_creation_ops.count
(count)
[OpenMetrics V2] Number of ENIs allocated. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ips.total
(gauge)
[OpenMetrics V1 and V2] Number of IPs allocated. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.k8s_sync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.k8s_sync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of k8s sync folding. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.k8s_sync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.queued.count
(count)
[OpenMetrics V2] Number of queued k8s sync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.k8s_sync.queued.total
(gauge)
[OpenMetrics V1] Number of queued k8s sync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.nodes.total
(gauge)
[OpenMetrics V1] Number of nodes by category. Available in Cilium <= v1.9
Shown as node
cilium.operator.eni.resync.count
(count)
[OpenMetrics V2] Number of resync operations to synchronize AWS EC2 metadata. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.resync.total
(count)
[OpenMetrics V1] Number of resync operations to synchronize AWS EC2 metadata. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni_ec2.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni_ec2.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as request
cilium.operator.eni_ec2.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as second
cilium.operator.errors.warnings.total
(count)
[OpenMetrics V1 and V2] Number of total errors in cilium-operator instances
Shown as item
cilium.operator.hive.status
(gauge)
[OpenMetrics V1 and V2] Counts of health status levels of Hive components
Shown as item
cilium.operator.identity_gc.entries
(gauge)
[OpenMetrics V1 and V2] The number of alive and deleted identities at the end of a garbage collector run. Available in Cilium v1.11+
Shown as garbage collection
cilium.operator.identity_gc.runs
(gauge)
[OpenMetrics V1 and V2] The number of times identity garbage collector has run. Available in Cilium v1.11+
Shown as garbage collection
cilium.operator.ipam.allocation.duration.seconds.bucket
(count)
[OpenMetrics V2] Allocation ip or interface latency in seconds
Shown as second
cilium.operator.ipam.allocation.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Allocation ip or interface latency in seconds
Shown as operation
cilium.operator.ipam.allocation.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Allocation ip or interface latency in seconds
Shown as second
cilium.operator.ipam.allocation_ops
(count)
[OpenMetrics V1] Count of IP allocation operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.allocation_ops.count
(count)
[OpenMetrics V2] Count of IP allocation operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.available
(gauge)
[OpenMetrics V1 and V2] Number of interfaces with addresses available. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.available.ips_per_subnet
(gauge)
[OpenMetrics V1 and V2] Number of available IPs per subnet ID. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.available_interfaces
(gauge)
[OpenMetrics V1 and V2] Number of interfaces with addresses available
Shown as unit
cilium.operator.ipam.available_ips
(gauge)
[OpenMetrics V1 and V2] Total available IPs on Node for IPAM allocation
Shown as unit
cilium.operator.ipam.deficit_resolver.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as request
cilium.operator.ipam.deficit_resolver.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.folds
(gauge)
[OpenMetrics V1 and V2] Current level of deficit resolver folding. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.deficit_resolver.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.queued.count
(count)
[OpenMetrics V2] Number of queued triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.queued.total
(count)
[OpenMetrics V1] Number of queued triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.empty_interface_slots
(count)
[OpenMetrics V1] Number of empty interface slots available for interfaces to be attached. Available in Cilium v1.13+
cilium.operator.ipam.empty_interface_slots.count
(count)
[OpenMetrics V2] Number of empty interface slots available for interfaces to be attached. Available in Cilium v1.13+
cilium.operator.ipam.interface_candidates
(count)
[OpenMetrics V1] Number of attached interfaces with IPs available for allocation. Available in Cilium v1.13+
cilium.operator.ipam.interface_candidates.count
(count)
[OpenMetrics V2] Number of attached interfaces with IPs available for allocation. Available in Cilium v1.13+
cilium.operator.ipam.interface_creation_ops
(count)
[OpenMetrics V1] Count of interfaces allocated. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.interface_creation_ops.count
(count)
[OpenMetrics V2] Count of interfaces allocated. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.ip_allocation_ops
(count)
[OpenMetrics V1] Number of IP allocation operations. Available in Cilium v1.13+
cilium.operator.ipam.ip_allocation_ops.count
(count)
[OpenMetrics V2] Number of IP allocation operations. Available in Cilium v1.13+
cilium.operator.ipam.ip_release_ops
(count)
[OpenMetrics V1] Number of IP release operations
Shown as operation
cilium.operator.ipam.ip_release_ops.count
(count)
[OpenMetrics V2] Number of IP release operations
Shown as operation
cilium.operator.ipam.ips
(gauge)
[OpenMetrics V1 and V2] Number of IPs allocated. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as request
cilium.operator.ipam.k8s_sync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of K8s sync folding. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.k8s_sync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.queued.count
(count)
[OpenMetrics V2] Number of queued K8s sync triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.queued.total
(count)
[OpenMetrics V1] Number of queued K8s sync triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.needed_ips
(gauge)
[OpenMetrics V1 and V2] Number of IPs that are needed on the Node to satisfy IPAM allocation requests
Shown as unit
cilium.operator.ipam.nodes
(gauge)
[OpenMetrics V1 and V2] Number of nodes by category. Available in Cilium v1.8+
Shown as node
cilium.operator.ipam.release.duration.seconds.bucket
(count)
[OpenMetrics V2] IP or interface release latency in seconds
Shown as second
cilium.operator.ipam.release.duration.seconds.count
(count)
[OpenMetrics V1 and V2] IP or interface release latency in seconds
Shown as operation
cilium.operator.ipam.release.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] IP or interface release latency in seconds
Shown as second
cilium.operator.ipam.release_ops
(count)
[OpenMetrics V1] Count of IP release operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.release_ops.count
(count)
[OpenMetrics V2] Count of IP release operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.resync.count
(count)
[OpenMetrics V2] Number of resync operations to synchronize and resolve IP deficit of nodes. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.resync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of resync trigger runs. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of resync trigger runs. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.resync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of resync trigger runs. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of resync folding. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as operation
cilium.operator.ipam.resync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.queued.count
(count)
[OpenMetrics V2] Number of IPAM queued triggers. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.queued.total
(count)
[OpenMetrics V1] Number of IPAM queued triggers. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.total
(count)
[OpenMetrics V1] Number of resync operations to synchronize and resolve IP deficit of nodes. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.used_ips
(gauge)
[OpenMetrics V1 and V2] Total used IPs on Node for IPAM allocation
Shown as unit
cilium.operator.lbipam.conflicting.pools.total
(gauge)
[OpenMetrics V1 and V2] The number of conflicting pools
Shown as unit
cilium.operator.lbipam.ips.available.total
(gauge)
[OpenMetrics V1 and V2] The number of IP addresses available in a given pool
Shown as unit
cilium.operator.lbipam.ips.used.total
(gauge)
[OpenMetrics V1 and V2] The number of IP addresses used in a given pool
Shown as unit
cilium.operator.lbipam.services.matching.total
(gauge)
[OpenMetrics V1 and V2] The number of services matching pools
Shown as unit
cilium.operator.lbipam.services.unsatisfied.total
(gauge)
[OpenMetrics V1 and V2] The number of services which did not receive all requested IPs
Shown as unit
cilium.operator.num_ceps_per_ces.bucket
(count)
[OpenMetrics V2] Sample of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.num_ceps_per_ces.count
(count)
[OpenMetrics V1 and V2] Count of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.num_ceps_per_ces.sum
(count)
[OpenMetrics V1 and V2] Sum of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.process.cpu.seconds
(count)
[OpenMetrics V1] Total user and system CPU time spent in seconds
Shown as second
cilium.operator.process.cpu.seconds.count
(count)
[OpenMetrics V2] Total user and system CPU time spent in seconds
Shown as second
cilium.operator.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Maximum number of open file descriptors
Shown as file
cilium.operator.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors
Shown as file
cilium.operator.process.resident_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Resident memory size in bytes
Shown as byte
cilium.operator.process.start_time.seconds
(gauge)
[OpenMetrics V1 and V2] Start time of the process since unix epoch in seconds
Shown as second
cilium.operator.process.virtual_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory size in bytes
Shown as byte
cilium.operator.process.virtual_memory_max.bytes
(gauge)
[OpenMetrics V1 and V2] Maximum amount of virtual memory available in bytes
Shown as byte
cilium.policy.change.count
(count)
[OpenMetrics V2] Number of policy changes by outcome
Shown as unit
cilium.policy.change.total
(count)
[OpenMetrics V1] Number of policy changes by outcome
Shown as unit
cilium.policy.count
(gauge)
[OpenMetrics V1 and V2] Number of policies currently loaded
Shown as unit
cilium.policy.endpoint_enforcement_status
(gauge)
[OpenMetrics V1 and V2] Number of endpoints labeled by policy enforcement status
Shown as unit
cilium.policy.implementation_delay.bucket
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as second
cilium.policy.implementation_delay.count
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as unit
cilium.policy.implementation_delay.sum
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as second
cilium.policy.import_errors.count
(count)
[OpenMetrics V1 and V2] Number of failed policy imports
Shown as error
cilium.policy.l7.count
(count)
[OpenMetrics V2] Number of total L7 requests/responses by type
Shown as unit
cilium.policy.l7.total
(count)
[OpenMetrics V1] Number of total L7 requests/responses by type
Shown as unit
cilium.policy.l7_denied.count
(count)
[OpenMetrics V2] Number of total L7 denied requests/responses due to policy. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_denied.total
(count)
[OpenMetrics V1] Number of total L7 denied requests/responses due to policy. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_forwarded.count
(count)
[OpenMetrics V2] Number of total L7 forwarded requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_forwarded.total
(count)
[OpenMetrics V1] Number of total L7 forwarded requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_parse_errors.count
(count)
[OpenMetrics V2] Number of total L7 parse errors. Available in Cilium <= v1.7
Shown as error
cilium.policy.l7_parse_errors.total
(count)
[OpenMetrics V1] Number of total L7 parse errors. Available in Cilium <= v1.7
Shown as error
cilium.policy.l7_received.count
(count)
[OpenMetrics V2] Number of total L7 received requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_received.total
(count)
[OpenMetrics V1] Number of total L7 received requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.max_revision
(gauge)
[OpenMetrics V1 and V2] Highest policy revision number in the agent
Shown as unit
cilium.policy.regeneration.count
(count)
[OpenMetrics V2] Total number of successful policy regenerations
Shown as unit
cilium.policy.regeneration.total
(count)
[OpenMetrics V1] Total number of successful policy regenerations
Shown as unit
cilium.policy.regeneration_time_stats.seconds.bucket
(count)
[OpenMetrics V2] Policy regeneration time stats sample
Shown as second
cilium.policy.regeneration_time_stats.seconds.count
(count)
[OpenMetrics V1 and V2] Policy regeneration time stats count
Shown as operation
cilium.policy.regeneration_time_stats.seconds.sum
(count)
[OpenMetrics V1 and V2] Policy regeneration time stats sum
Shown as second
cilium.process.cpu.seconds.count
(count)
[OpenMetrics V2] Process CPU time in seconds
Shown as second
cilium.process.cpu.seconds.total
(gauge)
[OpenMetrics V1] Process CPU time in seconds
Shown as second
cilium.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Process file descriptor maximum
Shown as file
cilium.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors
Shown as file
cilium.process.resident_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Total resident memory bytes
Shown as byte
cilium.process.start_time.seconds
(gauge)
[OpenMetrics V1 and V2] Process start time
Shown as second
cilium.process.virtual_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory bytes
Shown as byte
cilium.process.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Maximum virtual memory bytes
Shown as byte
cilium.proxy.datapath.update_timeout.count
(count)
[OpenMetrics V2] Number of total datapath update timeouts due to FQDN IP updates. Available in Cilium v1.10+
Shown as timeout
cilium.proxy.datapath.update_timeout.total
(count)
[OpenMetrics V1] Number of total datapath update timeouts due to FQDN IP updates. Available in Cilium v1.10+
Shown as timeout
cilium.proxy.redirects
(gauge)
Number of redirects installed for endpoints by protocol
cilium.proxy.upstream_reply.seconds.bucket
(count)
[OpenMetrics V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.upstream_reply.seconds.count
(count)
[OpenMetrics V1 and V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.upstream_reply.seconds.sum
(count)
[OpenMetrics V1 and V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.services.events.count
(count)
[OpenMetrics V2] Number of services events labeled by action type
Shown as event
cilium.services.events.total
(count)
[OpenMetrics V1] Number of services events labeled by action type
Shown as event
cilium.subprocess.start.count
(count)
[OpenMetrics V2] Number of times that Cilium has started a subprocess
Shown as unit
cilium.subprocess.start.total
(count)
[OpenMetrics V1] Number of times that Cilium has started a subprocess
Shown as unit
cilium.triggers_policy.update.count
(count)
[OpenMetrics V2] Total number of policy update trigger invocations
Shown as unit
cilium.triggers_policy.update.total
(count)
[OpenMetrics V1] Total number of policy update trigger invocations
Shown as unit
cilium.triggers_policy.update_call_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of policy update trigger duration
Shown as second
cilium.triggers_policy.update_call_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of policy update trigger duration
Shown as operation
cilium.triggers_policy.update_call_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of policy update trigger duration
Shown as second
cilium.triggers_policy.update_folds
(gauge)
[OpenMetrics V1 and V2] Number of folds
Shown as unit
cilium.unreachable.health_endpoints
(gauge)
[OpenMetrics V1 and V2] Number of health endpoints that cannot be reached
Shown as unit
cilium.unreachable.nodes
(gauge)
[OpenMetrics V1 and V2] Number of nodes that cannot be reached
Shown as node
cilium.version
(gauge)
[OpenMetrics V1 and V2] Cilium version
Shown as node
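
Many of the metrics listed above come in `.bucket`, `.count`, and `.sum` triples, which matches the usual Prometheus histogram layout. As a rough illustration (not part of the integration), an average latency over an interval can be derived by dividing the increase in `.sum` by the increase in `.count`. The function name and the sample values below are hypothetical and chosen only for the example.

```python
from typing import Optional


def mean_latency_seconds(sum_delta: float, count_delta: float) -> Optional[float]:
    """Average seconds per operation over an interval.

    sum_delta   -- increase of a `.seconds.sum` metric over the interval
    count_delta -- increase of the matching `.seconds.count` metric
    Returns None when no operations were observed, to avoid dividing by zero.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta


# Hypothetical interval values for
# cilium.operator.ipam.allocation.duration.seconds.sum and .count
print(mean_latency_seconds(1.5, 6))  # 0.25
print(mean_latency_seconds(0.0, 0))  # None
```

The same division applies to any other `.sum`/`.count` pair in the list, provided both values cover the same interval and the same tag scope.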

Events

The Cilium integration does not include any events.

Service Checks

cilium.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint. Returns OK otherwise.
Statuses: ok, critical

cilium.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the OpenMetrics endpoint. Returns OK otherwise.
Statuses: ok, critical
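
The logic both service checks describe can be sketched as follows: one scrape attempt either succeeds (OK) or raises (CRITICAL). This is a minimal illustration under stated assumptions, not the Agent's implementation. The helper names `fetch_metrics` and `endpoint_health` are hypothetical.

```python
import urllib.request
from typing import Callable

OK = "ok"
CRITICAL = "critical"


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Download the raw metrics payload; raises on connection or HTTP errors."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def endpoint_health(fetch: Callable[[], str]) -> str:
    """Return "ok" if one scrape attempt succeeds, "critical" otherwise."""
    try:
        fetch()
    except Exception:
        return CRITICAL
    return OK


def unreachable() -> str:
    """Stand-in for an endpoint that cannot be reached."""
    raise ConnectionError("metrics endpoint refused the connection")


print(endpoint_health(lambda: "cilium_version 1"))  # ok
print(endpoint_health(unreachable))                 # critical
```

Against a live cilium-agent, a call such as `endpoint_health(lambda: fetch_metrics("http://localhost:9962/metrics"))` would exercise the real endpoint. The port follows the `--prometheus-serve-addr=:9962` setting from the setup steps, while the host and the `/metrics` path are assumptions that may differ in your deployment.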

Troubleshooting

Need help? Contact Datadog support.