Cilium

Supported OS Linux Mac OS Windows

Présentation

Ce check permet de surveiller Cilium avec l’Agent Datadog. L’intégration peut recueillir des métriques à partir de cilium-agent ou de cilium-operator.

Configuration

Suivez les instructions ci-dessous pour installer et configurer ce check lorsque l’Agent est exécuté sur un host. Consultez la documentation relative aux modèles d’intégration Autodiscovery pour découvrir comment appliquer ces instructions à des environnements conteneurisés.

Installation

Le check Cilium est inclus avec le package de l’Agent Datadog , mais des opérations d’installation supplémentaires sont nécessaires pour l’exposition des métriques Prometheus.

  1. Pour activer les métriques Prometheus dans cilium-agent et cilium-operator, déployez Cilium en définissant la valeur Helm global.prometheus.enabled=true.

  2. Vous pouvez également activer les métriques Prometheus séparément :

    • Dans le cilium-agent, ajoutez --prometheus-serve-addr=:9090 à la section args de la configuration DaemonSet pour Cilium :

      # [...]
      spec:
        containers:
          - args:
              - --prometheus-serve-addr=:9090
      
    • Sinon, dans le cilium-operator, ajoutez --enable-metrics à la section args de la configuration de déploiement de Cilium :

      # [...]
      spec:
        containers:
          - args:
              - --enable-metrics
      

Configuration

Host

Pour configurer ce check lorsque l’Agent est exécuté sur un host :

  1. Modifiez le fichier cilium.d/conf.yaml dans le dossier conf.d/ à la racine du répertoire de configuration de votre Agent pour commencer à recueillir vos données de performance Cilium. Consultez le fichier d’exemple cilium.d/conf.yaml pour découvrir toutes les options de configuration disponibles.

    • Pour recueillir les métriques cilium-agent activez l’option agent_endpoint.
    • Pour recueillir les métriques cilium-operator, activez l’option operator_endpoint.
  2. Redémarrez l’Agent .

Collecte de logs

Cilium génère deux types de logs : cilium-agent et cilium-operator.

  1. La collecte de logs est désactivée par défaut dans l’Agent Datadog. Vous devez l’activer dans votre configuration DaemonSet  :

      # (...)
        env:
        #  (...)
          - name: DD_LOGS_ENABLED
              value: "true"
          - name: DD_LOGS_CONFIG_CONTAINER_COLLECT_ALL
              value: "true"
      # (...)
    
  2. Montez le socket Docker sur l’Agent Datadog comme dans ce manifeste . Si vous n’utilisez pas Docker, montez le répertoire /var/log/pods.

  3. Redémarrez l’Agent .

Environnement conteneurisé

Consultez la documentation relative aux modèles d’intégration Autodiscovery pour découvrir comment appliquer les paramètres ci-dessous à un environnement conteneurisé.

Collecte de métriques
ParamètreValeur
<NOM_INTÉGRATION>cilium
<CONFIG_INIT>vide ou {}
<CONFIG_INSTANCE>{"agent_endpoint": "http://%%host%%:9090/metrics"}
Collecte de logs

La collecte des logs est désactivée par défaut dans l’Agent Datadog. Pour l’activer, consultez la section Collecte de logs avec Kubernetes .

ParamètreValeur
<CONFIG_LOG>{"source": "cilium-agent", "service": "cilium-agent"}

Validation

Lancez la sous-commande status de l’Agent et cherchez cilium dans la section Checks.

Données collectées

Métriques

cilium.agent.api_process_time.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processing time for all API calls
Shown as request
cilium.agent.api_process_time.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processing time for all API calls
Shown as second
cilium.agent.api_process_time.seconds.bucket
(count)
[OpenMetrics V2] Amount of processing time for all API calls
Shown as second
cilium.agent.bootstrap.seconds.count
(count)
[OpenMetrics V1 and V2] Count of bootstrap durations
cilium.agent.bootstrap.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of bootstrap durations
Shown as second
cilium.agent.bootstrap.seconds.bucket
(count)
[OpenMetrics V2] Sample of bootstrap durations
Shown as second
cilium.api_limiter.adjustment_factor
(gauge)
[OpenMetrics V1 and V2] Most recent adjustment factor for automatic adjustment
Shown as second
cilium.api_limiter.processed_requests.total
(count)
[OpenMetrics V1] Total number of API requests processed
Shown as request
cilium.api_limiter.processed_requests.count
(count)
[OpenMetrics V2] Total number of API requests processed
Shown as request
cilium.api_limiter.processing_duration.seconds
(gauge)
[OpenMetrics V1 and V2] Mean and estimated processing duration in seconds
Shown as second
cilium.api_limiter.rate_limit
(gauge)
[OpenMetrics V1 and V2] Current rate limiting configuration (limit and burst)
Shown as request
cilium.api_limiter.requests_in_flight
(gauge)
[OpenMetrics V1 and V2] Current and maximum allowed number of requests in flight.
Shown as request
cilium.api_limiter.wait_duration.seconds
(gauge)
[OpenMetrics V1 and V2] Wait duration aggregated per api call
Shown as second
cilium.bpf.map_ops.total
(count)
[OpenMetrics V1] Total BPF map operations performed
Shown as operation
cilium.bpf.map_ops.count
(count)
[OpenMetrics V2] Total BPF map operations performed
Shown as operation
cilium.bpf.map_pressure
(gauge)
[OpenMetrics V1 and V2] Map pressure defined as a ratio of the map usage compared to it
cilium.bpf.maps.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Max memory used by eBPF maps installed in the system
Shown as byte
cilium.bpf.progs.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Max memory used by eBPF programs installed in the system
Shown as byte
cilium.controllers.failing.count
(gauge)
[OpenMetrics V1 and V2] Number of failing controllers
Shown as error
cilium.controllers.runs_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of controller processes duration
Shown as operation
cilium.controllers.runs_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of controller processes duration
Shown as second
cilium.controllers.runs_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of controller processes duration
Shown as second
cilium.controllers.runs.total
(count)
[OpenMetrics V1] Total number of controller runs
Shown as event
cilium.controllers.runs.count
(count)
[OpenMetrics V2] Total number of controller runs
Shown as event
cilium.datapath.conntrack_gc.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of garbage collector process duration
Shown as operation
cilium.datapath.conntrack_gc.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of garbage collector process duration
Shown as second
cilium.datapath.conntrack_gc.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of garbage collector process duration
Shown as second
cilium.datapath.conntrack_gc.entries
(gauge)
[OpenMetrics V1 and V2] The number of alive and deleted conntrack entries
Shown as garbage collection
cilium.datapath.conntrack_gc.key_fallbacks.total
(count)
[OpenMetrics V1] The total number of conntrack entries
Shown as garbage collection
cilium.datapath.conntrack_gc.key_fallbacks.count
(count)
[OpenMetrics V2] The total number of conntrack entries.
Shown as garbage collection
cilium.datapath.conntrack_gc.runs.total
(count)
[OpenMetrics V1] Total number of the conntrack garbage collector process runs
Shown as garbage collection
cilium.datapath.conntrack_gc.runs.count
(count)
[OpenMetrics V2] Total number of the conntrack garbage collector process runs
Shown as garbage collection
cilium.datapath.conntrack_dump.resets.total
(count)
[OpenMetrics V1] Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in progress
cilium.datapath.conntrack_dump.resets.count
(count)
[OpenMetrics V2] Number of conntrack dump resets. Happens when a BPF entry gets removed while dumping the map is in progress
cilium.drop_bytes.total
(count)
[OpenMetrics V1] Total dropped bytes
Shown as byte
cilium.drop_bytes.count
(count)
[OpenMetrics V2] Total dropped bytes
Shown as byte
cilium.drop_count.total
(count)
[OpenMetrics V1] Total dropped packets
Shown as packet
cilium.drop_count.count
(count)
[OpenMetrics V2] Total dropped packets
Shown as packet
cilium.endpoint.regeneration_time_stats.seconds.count
(count)
[OpenMetrics V1 and V2] Count of endpoint regeneration time stats
Shown as operation
cilium.endpoint.regeneration_time_stats.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of endpoint regeneration time stats
Shown as second
cilium.endpoint.regeneration_time_stats.seconds.bucket
(count)
[OpenMetrics V2] Sample of endpoint regeneration time stats
Shown as second
cilium.endpoint.regenerations.total
(count)
[OpenMetrics V1] Count of completed endpoint regenerations
Shown as unit
cilium.endpoint.regenerations.count
(count)
[OpenMetrics V2] Count of completed endpoint regenerations
Shown as unit
cilium.endpoint.state
(gauge)
[OpenMetrics V1 and V2] Count of all endpoints
Shown as unit
cilium.errors_warning.total
(count)
[OpenMetrics V1] Total error warnings
Shown as error
cilium.errors_warning.count
(count)
[OpenMetrics V2] Total error warnings
Shown as error
cilium.event_timestamp
(gauge)
[OpenMetrics V1 and V2] Last timestamp of event received
Shown as time
cilium.forward_bytes.total
(count)
[OpenMetrics V1] Total forwarded bytes
Shown as byte
cilium.forward_bytes.count
(count)
[OpenMetrics V2] Total forwarded bytes
Shown as byte
cilium.forward_count.total
(count)
[OpenMetrics V1] Total forwarded packets
Shown as packet
cilium.forward_count.count
(count)
[OpenMetrics V2] Total forwarded packets
Shown as packet
cilium.fqdn.gc_deletions.total
(count)
[OpenMetrics V1] Total number of FQDNs cleaned in FQDN garbage collector job.
Shown as event
cilium.fqdn.gc_deletions.count
(count)
[OpenMetrics V2] Total number of FQDNs cleaned in FQDN garbage collector job.
Shown as event
cilium.fqdn.active_names
(gauge)
Number of domains inside the DNS cache that have not expired (by TTL), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.active_ips
(gauge)
Number of IPs inside the DNS cache associated with a domain that has not expired (by TTL), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.fqdn.alive_zombie_connections
(gauge)
Number of IPs associated with domains that have expired (by TTL) yet still associated with an active connection (aka zombie), per endpoint. Available in Cilium v1.12+
Shown as item
cilium.ipcache.errors.total
(count)
[OpenMetrics V1] Number of errors interacting with the ipcache
Shown as error
cilium.ipcache.errors.count
(count)
[OpenMetrics V2] Number of errors interacting with the ipcache
Shown as error
cilium.ip_addresses.count
(gauge)
[OpenMetrics V1 and V2] Number of allocated ip_addresses
Shown as unit
cilium.ipam.events.total
(count)
[OpenMetrics V1] Number of IPAM events received by action and datapath family type
Shown as event
cilium.ipam.events.count
(count)
[OpenMetrics V2] Number of IPAM events received by action and datapath family type
Shown as event
cilium.k8s_client.api_latency_time.seconds.count
(count)
[OpenMetrics V1 and V2] Count of processed API call duration
Shown as request
cilium.k8s_client.api_latency_time.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of processed API call duration
Shown as second
cilium.k8s_client.api_latency_time.seconds.bucket
(count)
[OpenMetrics V2] Sample of processed API call duration
Shown as second
cilium.k8s_event.lag.seconds
(gauge)
[OpenMetrics V1 and v2] Lag for Kubernetes events - computed value between receiving a CNI ADD event from kubelet and a Pod event received from kube-api-server
Shown as second
cilium.k8s_terminating.endpoints_events.total
(count)
[OpenMetrics V1] Number of terminating endpoint events received from Kubernetes
Shown as event
cilium.k8s_terminating.endpoints_events.count
(count)
[OpenMetrics V2] Number of terminating endpoint events received from Kubernetes
Shown as event
cilium.kubernetes.events_received.total
(count)
[OpenMetrics V1] Number of Kubernetes received events processed
Shown as event
cilium.kubernetes.events_received.count
(count)
[OpenMetrics V2] Number of Kubernetes received events processed
Shown as event
cilium.kubernetes.events.total
(count)
[OpenMetrics V1] Number of Kubernetes events processed
Shown as event
cilium.kubernetes.events.count
(count)
[OpenMetrics V2] Number of Kubernetes events processed
Shown as event
cilium.nodes.all_datapath_validations.total
(count)
[OpenMetrics V1] Number of validation calls to implement the datapath implementation of a node
Shown as unit
cilium.nodes.all_datapath_validations.count
(count)
[OpenMetrics V2] Number of validation calls to implement the datapath implementation of a node
Shown as unit
cilium.nodes.all_events_received.total
(count)
[OpenMetrics V1] Number of node events received.
Shown as event
cilium.nodes.all_events_received.count
(count)
[OpenMetrics V2] Number of node events received
Shown as event
cilium.nodes.managed.total
(gauge)
[OpenMetrics V1 and V2] Number of nodes managed
Shown as node
cilium.endpoint.count
(gauge)
[OpenMetrics V1 and V2] Total ready endpoints managed by agent
Shown as unit
cilium.identity.count
(gauge)
[OpenMetrics V1 and V2] Number of identities allocate.
Shown as unit
cilium.policy.count
(gauge)
[OpenMetrics V1 and V2] Number of policies currently loaded
Shown as unit
cilium.policy.implementation_delay.bucket
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as second
cilium.policy.implementation_delay.count
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as unit
cilium.policy.implementation_delay.sum
(count)
[OpenMetrics V1 and V2] Time between a policy change and it being fully deployed into the datapath
Shown as second
cilium.policy.import_errors.count
(count)
[OpenMetrics V1 and V2] Number of failed policy imports
Shown as error
cilium.policy.endpoint_enforcement_status
(gauge)
[OpenMetrics V1 and V2] Number of endpoints labeled by policy enforcement status
Shown as unit
cilium.policy.max_revision
(gauge)
[OpenMetrics V1 and V2] Highest policy revision number in the agent
Shown as unit
cilium.policy.regeneration_time_stats.seconds.count
(count)
[OpenMetrics V1 and V2] Policy regeneration time stats count
Shown as operation
cilium.policy.regeneration_time_stats.seconds.sum
(count)
[OpenMetrics V1 and V2] Policy regeneration time stats count
Shown as second
cilium.policy.regeneration_time_stats.seconds.bucket
(count)
[OpenMetrics V2] Policy regeneration time stats sample
Shown as second
cilium.policy.regeneration.total
(count)
[OpenMetrics V1] Total number of successful policy regenerations
Shown as unit
cilium.policy.regeneration.count
(count)
[OpenMetrics V2] Total number of successful policy regenerations
Shown as unit
cilium.process.cpu.seconds.total
(gauge)
[OpenMetrics V1] Process CPU time in seconds
Shown as second
cilium.process.cpu.seconds.count
(count)
[OpenMetrics V2] Process CPU time in seconds
Shown as second
cilium.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Process file descriptor maximum
Shown as file
cilium.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors
Shown as file
cilium.process.resident_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Total resident memory bytes
Shown as byte
cilium.process.start_time.seconds
(gauge)
[OpenMetrics V1 and V2] Processes start time
Shown as second
cilium.process.virtual_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory bytes
Shown as byte
cilium.process.virtual_memory.max.bytes
(gauge)
[OpenMetrics V1 and V2] Maximum virtual memory bytes
Shown as byte
cilium.subprocess.start.total
(count)
[OpenMetrics V1] Number of times that Cilium has started a subprocess
Shown as unit
cilium.subprocess.start.count
(count)
[OpenMetrics V2] Number of times that Cilium has started a subprocess
Shown as unit
cilium.triggers_policy.update_call_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of policy update trigger duration
Shown as operation
cilium.triggers_policy.update_call_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of policy update trigger duration
Shown as second
cilium.triggers_policy.update_call_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of policy update trigger duration
Shown as second
cilium.triggers_policy.update_folds
(gauge)
[OpenMetrics V1 and V2] Number of folds
Shown as unit
cilium.triggers_policy.update.total
(count)
[OpenMetrics V1] Total number of policy update trigger invocations
Shown as unit
cilium.triggers_policy.update.count
(count)
[OpenMetrics V2] Total number of policy update trigger invocations
Shown as unit
cilium.unreachable.health_endpoints
(gauge)
[OpenMetrics V1 and V2] Number of health endpoints that cannot be reached
Shown as unit
cilium.unreachable.nodes
(gauge)
[OpenMetrics V1 and V2] Number of nodes that cannot be reached
Shown as node
cilium.kvstore.operations_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Duration of kvstore operation count
Shown as operation
cilium.kvstore.operations_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Duration of kvstore operation sum
Shown as second
cilium.kvstore.operations_duration.seconds.bucket
(count)
[OpenMetrics V2] Duration of kvstore operation sample
Shown as second
cilium.kvstore.events_queue.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration in seconds of received event was blocked before it could be queued
cilium.kvstore.events_queue.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration in seconds received event was blocked before it could be queued
Shown as second
cilium.kvstore.events_queue.seconds.bucket
(count)
[OpenMetrics V2] Sum of duration in seconds received event was blocked before it could be queued
Shown as second
cilium.kvstore.quorum_errors.total
(count)
[OpenMetrics V1] Number of quorum errors
Shown as error
cilium.kvstore.quorum_errors.count
(count)
[OpenMetrics V2] Number of quorum errors
Shown as error
cilium.kvstore.sync_queue_size
(gauge)
Number of elements queued for synchronization in the kvstore
Shown as item
cilium.kvstore.initial_sync_completed
(gauge)
Whether the initial synchronization from/to the kvstore has completed
cilium.proxy.redirects
(gauge)
Number of redirects installed for endpoints by protocol
cilium.proxy.upstream_reply.seconds.count
(count)
[OpenMetrics V1 and V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.upstream_reply.seconds.sum
(count)
[OpenMetrics V1 and V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.upstream_reply.seconds.bucket
(count)
[OpenMetrics V2] Seconds waited for upstream server to reply to a request labeled by error, protocol and span time
Shown as second
cilium.proxy.datapath.update_timeout.total
(count)
[OpenMetrics V1] Number of total datapath update timeouts due to FQDN IP updates. Available in Cilium 1.10+
Shown as timeout
cilium.proxy.datapath.update_timeout.count
(count)
[OpenMetrics V2] Number of total datapath update timeouts due to FQDN IP updates. Available in Cilium 1.10+
Shown as timeout
cilium.policy.l7.total
(count)
[OpenMetrics V1] Number of total L7 requests/responses by type
Shown as unit
cilium.policy.l7.count
(count)
[OpenMetrics V2] Number of total L7 requests/responses by type
Shown as unit
cilium.policy.l7_denied.total
(count)
[OpenMetrics V1] Number of total L7 denied requests/responses due to policy. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_denied.count
(count)
[OpenMetrics V2] Number of total L7 denied requests/responses due to policy. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_forwarded.total
(count)
[OpenMetrics V1] Number of total L7 forwarded requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_forwarded.count
(count)
[OpenMetrics V2] Number of total L7 forwarded requests/response. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_parse_errors.total
(count)
[OpenMetrics V1] Number of total L7 parse errors. Available in Cilium <= v1.7
Shown as error
cilium.policy.l7_parse_errors.count
(count)
[OpenMetrics V2] Number of total L7 parse errors. Available in Cilium <= v1.7
Shown as error
cilium.policy.l7_received.total
(count)
[OpenMetrics V1] Number of total L7 received requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.policy.l7_received.count
(count)
[OpenMetrics V2] Number of total L7 received requests/responses. Available in Cilium <= v1.7
Shown as unit
cilium.datapath.errors.total
(count)
[OpenMetrics V1] Total number of errors in datapath management. Available in Cilium <= v1.9
Shown as error
cilium.datapath.errors.count
(count)
[OpenMetrics V2] Total number of errors in datapath management. Available in Cilium <= v1.9
Shown as error
cilium.k8s_client.api_calls.count
(count)
[OpenMetrics V1 and V2] Number of API calls made to kube-apiserver. Available in Cilium v1.10+
Shown as request
cilium.operator.process.cpu.seconds
(count)
[OpenMetrics V1] Total user and system CPU time spent in seconds
Shown as second
cilium.operator.process.cpu.seconds.count
(count)
[OpenMetrics V2] Total user and system CPU time spent in seconds
Shown as second
cilium.operator.process.max_fds
(gauge)
[OpenMetrics V1 and V2] Maximum number of open file descriptors
Shown as file
cilium.operator.process.open_fds
(gauge)
[OpenMetrics V1 and V2] Number of open file descriptors
Shown as file
cilium.operator.process.resident_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Resident memory size in bytes
Shown as byte
cilium.operator.process.start_time.seconds
(gauge)
[OpenMetrics V1 and V2] Start time of the process since unix epoch in seconds
Shown as second
cilium.operator.process.virtual_memory.bytes
(gauge)
[OpenMetrics V1 and V2] Virtual memory size in bytes
Shown as byte
cilium.operator.process.virtual_memory_max.bytes
(gauge)
[OpenMetrics V1 and V2] Maximum amount of virtual memory available in bytes
Shown as byte
cilium.operator.eni.deficit_resolver.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of deficit resolver trigger runs
Shown as operation
cilium.operator.eni.deficit_resolver.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of deficit resolver trigger runs
Shown as second
cilium.operator.eni.deficit_resolver.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of deficit resolver trigger runs
Shown as second
cilium.operator.eni.deficit_resolver.folds
(gauge)
[OpenMetrics V1 and V2] Current level of deficit resolver folding
Shown as unit
cilium.operator.eni.deficit_resolver.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of latency between deficit resolver queue and trigger run
Shown as operation
cilium.operator.eni.deficit_resolver.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of latency between deficit resolver queue and trigger run
Shown as second
cilium.operator.eni.deficit_resolver.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of latency between deficit resolver queue and trigger run
Shown as second
cilium.operator.eni.deficit_resolver.queued.total
(gauge)
[OpenMetrics V1] Number of queued deficit resolver triggers
Shown as event
cilium.operator.eni.deficit_resolver.queued.count
(count)
[OpenMetrics V2] Number of queued deficit resolver triggers
Shown as event
cilium.operator.eni.available
(gauge)
[OpenMetrics V2] Number of available IPs per subnet ID. Available in Cilium <= v1.8
Shown as unit
cilium.operator.eni.available.ips_per_subnet
(gauge)
[OpenMetrics V1 and V2] Number of available IPs per subnet ID. Available in Cilium <= v1.8
Shown as unit
cilium.operator.eni.aws_api_duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as request
cilium.operator.eni.aws_api_duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as second
cilium.operator.eni.aws_api_duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with AWS API. Available in Cilium <= v1.8
Shown as second
cilium.operator.eni_ec2.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as request
cilium.operator.eni_ec2.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni_ec2.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ec2_resync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of ec2 resync trigger runs. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of ec2 resync folding. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.ec2_resync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ec2_resync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of latency between ec2 resync queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.ec2_resync.queued.total
(gauge)
[OpenMetrics V1] Number of queued ec2 resync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.ec2_resync.queued.count
(count)
[OpenMetrics V2] Number of queued ec2 resync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.interface_creation_ops
(count)
[OpenMetrics V1] Number of ENIs allocated. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.interface_creation_ops.count
(count)
[OpenMetrics V2] Number of ENIs allocated. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.ips.total
(gauge)
[OpenMetrics V1 and V2] Number of IPs allocated. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.k8s_sync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.k8s_sync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of k8s sync trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of k8s sync folding. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as operation
cilium.operator.eni.k8s_sync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of k8s sync latency between queue and trigger run. Available in Cilium <= v1.9
Shown as second
cilium.operator.eni.k8s_sync.queued.total
(gauge)
[OpenMetrics V1] Number of queued k8s sync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.k8s_sync.queued.count
(count)
[OpenMetrics V2] Number of queued k8s sync triggers. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.nodes.total
(gauge)
[OpenMetrics V1] Number of nodes by category. Available in Cilium <= v1.9
Shown as node
cilium.operator.eni.resync.total
(count)
[OpenMetrics V1] Number of resync operations to synchronize AWS EC2 metadata. Available in Cilium <= v1.9
Shown as unit
cilium.operator.eni.resync.count
(count)
[OpenMetrics V2] Number of resync operations to synchronize AWS EC2 metadata. Available in Cilium <= v1.9
Shown as unit
cilium.operator.ipam.ips
(gauge)
[OpenMetrics V1 and V2] Number of IPs allocated. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.allocation_ops
(count)
[OpenMetrics V1] Count of IP allocation operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.allocation_ops.count
(count)
[OpenMetrics V2] Count of IP allocation operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.release_ops
(count)
[OpenMetrics V1] Count of IP release operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.release_ops.count
(count)
[OpenMetrics V2] Count of IP release operations. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.interface_creation_ops
(count)
[OpenMetrics V1] Count of interfaces allocated. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.interface_creation_ops.count
(count)
[OpenMetrics V2] Count of interfaces allocated. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.available
(gauge)
[OpenMetrics V1 and V2] Number of interfaces with addresses available. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.available.ips_per_subnet
(gauge)
[OpenMetrics V1 and V2] Number of available IPs per subnet ID. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as request
cilium.operator.ipam.deficit_resolver.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of deficit resolver trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.queued.count
(count)
[OpenMetrics V2] Number of queued triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.queued.total
(count)
[OpenMetrics V1] Number of queued triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.folds
(gauge)
[OpenMetrics V1 and V2] Current level of deficit resolver folding. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.deficit_resolver.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.deficit_resolver.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.deficit_resolver.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of deficit resolver latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as request
cilium.operator.ipam.k8s_sync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of K8s sync trigger runs. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.queued.total
(count)
[OpenMetrics V1] Number of queued k8s sync triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.queued.count
(count)
[OpenMetrics V2] Number of queued k8s sync triggers. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of K8s sync folding. Available in Cilium v1.8+
Shown as unit
cilium.operator.ipam.k8s_sync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.k8s_sync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.k8s_sync.latency.seconds.bucket
(count)
[OpenMetrics V2] Count of K8s sync latency between queue and trigger run. Available in Cilium v1.8+
Shown as second
cilium.operator.ipam.nodes
(gauge)
[OpenMetrics V1 and V2] Number of nodes by category. Available in Cilium v1.8+
Shown as node
cilium.operator.ipam.api.duration.seconds.sum
(count)
[OpenMetricsV1 and V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.duration.seconds.count
(count)
[OpenMetricsV1 and V2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.api.duration.seconds.bucket
(count)
[OpenMetricsV2] Duration of interactions with external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.rate_limit.duration.seconds.sum
(count)
[OpenMetricsV1 and V2] Sum of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.api.rate_limit.duration.seconds.count
(count)
[OpenMetricsV1 and V2] Count of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of rate limiting while accessing external IPAM API. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of resync trigger runs. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of resync trigger runs. Available in Cilium v1.9+
Shown as request
cilium.operator.ipam.resync.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of resync trigger runs. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.total
(count)
[OpenMetrics V1] Number of resync operations to synchronize and resolve IP deficit of nodes. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.resync.count
(count)
[OpenMetrics V2] Number of resync operations to synchronize and resolve IP deficit of nodes. Available in Cilium v1.8+
Shown as operation
cilium.operator.ipam.resync.queued.total
(count)
[OpenMetrics V1] Number of IPAM queued triggers. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.queued.count
(count)
[OpenMetrics V2] Number of IPAM queued triggers. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.folds
(gauge)
[OpenMetrics V1 and V2] Current level of resync folding. Available in Cilium v1.9+
Shown as unit
cilium.operator.ipam.resync.latency.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as second
cilium.operator.ipam.resync.latency.seconds.count
(count)
[OpenMetrics V1 and V2] Count of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as operation
cilium.operator.ipam.resync.latency.seconds.bucket
(count)
[OpenMetrics V2] Sample of resync latency between queue and trigger run. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as request
cilium.operator.azure.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.azure.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as request
cilium.operator.azure.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking when interacting with the Azure API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as request
cilium.operator.ec2.api.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of interactions with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.rate_limit.duration.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ec2.api.rate_limit.duration.seconds.count
(count)
[OpenMetrics V1 and V2] Count of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as request
cilium.operator.ec2.api.rate_limit.duration.seconds.bucket
(count)
[OpenMetrics V2] Sample of duration of client-side rate limiter blocking when interacting with the AWS EC2 API. Available in Cilium v1.9+
Shown as second
cilium.operator.ces.queueing_delay.seconds.sum
(count)
[OpenMetrics V1 and V2] Sum of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as second
cilium.operator.ces.queueing_delay.seconds.count
(count)
[OpenMetrics V1 and V2] Count of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as unit
cilium.operator.ces.queueing_delay.seconds.bucket
(count)
[OpenMetrics V2] Sample of CiliumEndpointSlice queueing delay in seconds. Available in Cilium v1.11+
Shown as second
cilium.operator.ces.sync_errors.total
(count)
[OpenMetrics V1] Number of CES sync errors. Available in Cilium v1.11+
Shown as error
cilium.operator.ces.sync_errors.count
(count)
[OpenMetrics V2] Number of CES sync errors. Available in Cilium v1.11+
Shown as error
cilium.operator.identity_gc.entries
(gauge)
[OpenMetrics V1 and V2] The number of alive and deleted identities at the end of a garbage collector run. Available in Cilium v1.11+
Shown as garbage collection
cilium.operator.identity_gc.runs
(gauge)
[OpenMetrics V1 and V2] The number of times identity garbage collector has run. Available in Cilium v1.11+
Shown as garbage collection
cilium.operator.num_ceps_per_ces.sum
(count)
[OpenMetrics V1 and V2] Sum of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.num_ceps_per_ces.count
(count)
[OpenMetrics V1 and V2] Count of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit
cilium.operator.num_ceps_per_ces.bucket
(count)
[OpenMetrics V2] Sample of CEPs batched in a CES. Available in Cilium v1.11+
Shown as unit

Checks de service

cilium.prometheus.health :
Renvoie CRITICAL si l’Agent ne parvient pas à se connecter aux endpoints de métriques. Si ce n’est pas le cas, renvoie OK.

Événements

Cilium n’inclut aucun événement.

Dépannage

Besoin d’aide ? Contactez l’assistance Datadog .