Argo Rollouts

Supported OS: Linux, Windows, macOS

Integration version: 2.2.0

Overview

This check monitors Argo Rollouts through the Datadog Agent.

Setup

Follow the instructions below to install and configure this check for an Agent running in your Kubernetes environment. For guidance on configuration in containerized environments, see the Autodiscovery Integration Templates.

Installation

Starting with Agent release 7.53.0, the Argo Rollouts check is included in the Datadog Agent package. No additional installation is needed in your environment.

This check uses OpenMetrics to collect metrics from the OpenMetrics endpoint that Argo Rollouts exposes. It requires Python 3.

Configuration

The Argo Rollouts controller exposes Prometheus-formatted metrics at /metrics on port 8090. For the Agent to start collecting metrics, the Argo Rollouts pods need to be annotated. For more information about annotations, see the Autodiscovery Integration Templates. Additional configuration options are listed in the sample argo_rollouts.d/conf.yaml.

Note: The listed metrics can only be collected if they are available. Some metrics are generated only when certain actions are performed. For example, the argo_rollouts.rollout.info.replicas.updated metric is exposed only after a replica is updated.

The only parameter required to configure the Argo Rollouts check is:

  • openmetrics_endpoint: set this parameter to the location where the Prometheus-formatted metrics are exposed. The default port is 8090. In containerized environments, use %%host%% so that the host is detected automatically.
apiVersion: v1
kind: Pod
# (...)
metadata:
  name: '<POD_NAME>'
  annotations:
    ad.datadoghq.com/argo-rollouts.checks: |
      {
        "argo_rollouts": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8090/metrics"
            }
          ]
        }
      }      
    # (...)
spec:
  containers:
    - name: 'argo-rollouts'
# (...)
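
For an Agent that reads file-based configuration instead of pod annotations, the same parameter would go in argo_rollouts.d/conf.yaml. The following is a minimal sketch, not a copy of the sample file. <ARGO_ROLLOUTS_HOST> is a placeholder assumed here for illustration; replace it with an address from which port 8090 of the controller is reachable.

```yaml
# Minimal sketch of argo_rollouts.d/conf.yaml (assumed layout; the sample file lists all options).
init_config:

instances:
    # <ARGO_ROLLOUTS_HOST> is a hypothetical placeholder, not a value taken from this page.
  - openmetrics_endpoint: "http://<ARGO_ROLLOUTS_HOST>:8090/metrics"
```

The pod annotation shown above carries the same structure (init_config plus a list of instances) encoded as JSON, so options from the sample file should map across in the same way.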

Log collection

Available for Agent version 6.0 or later

Argo Rollouts logs can be collected from the different Argo Rollouts pods through Kubernetes. Log collection is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.

See the Autodiscovery Integration Templates for guidance on applying the parameters below.

Parameter       Value
<LOG_CONFIG>    {"source": "argo_rollouts", "service": "<SERVICE_NAME>"}
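
Applied as a pod annotation, the log configuration above could look like the following sketch. It assumes the same container name, argo-rollouts, as the metrics example earlier on this page. <POD_NAME> and <SERVICE_NAME> remain placeholders for you to fill in.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: '<POD_NAME>'
  annotations:
    # The annotation value is a JSON list that wraps <LOG_CONFIG>.
    ad.datadoghq.com/argo-rollouts.logs: '[{"source": "argo_rollouts", "service": "<SERVICE_NAME>"}]'
spec:
  containers:
    # This name must match the container identifier used in the annotation key.
    - name: 'argo-rollouts'
```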

Validation

Run the Agent's status subcommand and look for argo_rollouts under the Checks section.

Data collected

Metrics

argo_rollouts.analysis.run.info
(gauge)
Information about analysis run
argo_rollouts.analysis.run.metric.phase
(gauge)
Information on the duration of a specific metric in the Analysis Run
argo_rollouts.analysis.run.metric.type
(gauge)
Information on the type of a specific metric in the Analysis Runs
argo_rollouts.analysis.run.phase
(gauge)
Information on the state of the Analysis Run
argo_rollouts.analysis.run.reconcile.bucket
(count)
The number of observations in the Analysis Run reconciliation performance histogram by upper_bound buckets
argo_rollouts.analysis.run.reconcile.count
(count)
The number of observations in the Analysis Run reconciliation performance histogram
argo_rollouts.analysis.run.reconcile.error.count
(count)
Error occurring during the analysis run
argo_rollouts.analysis.run.reconcile.sum
(count)
The duration sum of all observations in the Analysis Run reconciliation performance histogram
argo_rollouts.controller.clientset.k8s.request.count
(count)
The total number of Kubernetes requests executed during application reconciliation
argo_rollouts.experiment.info
(gauge)
Information about Experiment
argo_rollouts.experiment.phase
(gauge)
Information on the state of the experiment
argo_rollouts.experiment.reconcile.bucket
(count)
The number of observations in the Experiments reconciliation performance histogram by upper_bound buckets
argo_rollouts.experiment.reconcile.count
(count)
The number of observations in the Experiments reconciliation performance histogram
argo_rollouts.experiment.reconcile.error.count
(count)
Error occurring during the experiment
argo_rollouts.experiment.reconcile.sum
(count)
The duration sum of all observations in the Experiments reconciliation performance histogram
argo_rollouts.go.gc.duration.seconds.count
(count)
The summary count of garbage collection cycles in the Argo Rollouts instance
Shown as second
argo_rollouts.go.gc.duration.seconds.quantile
(gauge)
A summary of the pause duration of garbage collection cycles in the Argo Rollouts instance
Shown as second
argo_rollouts.go.gc.duration.seconds.sum
(count)
The sum of the pause duration of garbage collection cycles in the Argo Rollouts instance
Shown as second
argo_rollouts.go.goroutines
(gauge)
The number of goroutines that currently exist in the Argo Rollouts instance
argo_rollouts.go.info
(gauge)
Metric containing the Go version as a tag
argo_rollouts.go.memstats.alloc_bytes
(gauge)
The number of bytes allocated and still in use in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.alloc_bytes.count
(count)
The monotonic count of bytes allocated and still in use in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.buck_hash.sys_bytes
(gauge)
The number of bytes used by the profiling bucket hash table in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.frees.count
(count)
The total number of frees in the Argo Rollouts instance
argo_rollouts.go.memstats.gc.cpu_fraction
(gauge)
The fraction of this program's available CPU time used by the GC since the program started in the Argo Rollouts instance
Shown as fraction
argo_rollouts.go.memstats.gc.sys_bytes
(gauge)
The number of bytes used for garbage collection system metadata in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.heap.alloc_bytes
(gauge)
The number of heap bytes allocated and still in use in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.heap.idle_bytes
(gauge)
The number of heap bytes waiting to be used in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.heap.inuse_bytes
(gauge)
The number of heap bytes that are in use in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.heap.objects
(gauge)
The number of allocated objects in the Argo Rollouts instance
Shown as object
argo_rollouts.go.memstats.heap.released_bytes
(gauge)
The number of heap bytes released to the OS in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.heap.sys_bytes
(gauge)
The number of heap bytes obtained from system in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.lookups.count
(count)
The number of pointer lookups
argo_rollouts.go.memstats.mallocs.count
(count)
The number of mallocs
argo_rollouts.go.memstats.mcache.inuse_bytes
(gauge)
The number of bytes in use by mcache structures in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.mcache.sys_bytes
(gauge)
The number of bytes used for mcache structures obtained from system in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.mspan.inuse_bytes
(gauge)
The number of bytes in use by mspan structures in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.mspan.sys_bytes
(gauge)
The number of bytes used for mspan structures obtained from system in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.next.gc_bytes
(gauge)
The number of heap bytes when next garbage collection takes place in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.other.sys_bytes
(gauge)
The number of bytes used for other system allocations in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.stack.inuse_bytes
(gauge)
The number of bytes in use by the stack allocator in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.stack.sys_bytes
(gauge)
The number of bytes obtained from system for stack allocator in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.memstats.sys_bytes
(gauge)
The number of bytes obtained from system in the Argo Rollouts instance
Shown as byte
argo_rollouts.go.threads
(gauge)
The number of OS threads created in the Argo Rollouts instance
Shown as thread
argo_rollouts.notification.send.bucket
(count)
The number of observations in the Notification send performance histogram by upper_bound buckets
argo_rollouts.notification.send.count
(count)
The number of observations in the Notification send performance histogram
argo_rollouts.notification.send.sum
(count)
The duration sum of all observations in the Notification send performance histogram
argo_rollouts.process.cpu.seconds.count
(count)
The total user and system CPU time spent in seconds in the Argo Rollouts instance
Shown as second
argo_rollouts.process.max_fds
(gauge)
The maximum number of open file descriptors in the Argo Rollouts instance
argo_rollouts.process.open_fds
(gauge)
The number of open file descriptors in the Argo Rollouts instance
argo_rollouts.process.resident_memory.bytes
(gauge)
The resident memory size in bytes in the Argo Rollouts instance
Shown as byte
argo_rollouts.process.start_time.seconds
(gauge)
The start time of the process since unix epoch in seconds in the Argo Rollouts instance
Shown as second
argo_rollouts.process.virtual_memory.bytes
(gauge)
The virtual memory size in bytes in the Argo Rollouts instance
Shown as byte
argo_rollouts.process.virtual_memory.max_bytes
(gauge)
The maximum amount of virtual memory available in bytes in the Argo Rollouts instance
Shown as byte
argo_rollouts.rollout.events.count
(count)
The count of rollout events
argo_rollouts.rollout.info
(gauge)
Information about rollout
argo_rollouts.rollout.info.replicas.available
(gauge)
The number of available replicas per rollout
argo_rollouts.rollout.info.replicas.desired
(gauge)
The number of desired replicas per rollout
argo_rollouts.rollout.info.replicas.unavailable
(gauge)
The number of unavailable replicas per rollout
argo_rollouts.rollout.info.replicas.updated
(gauge)
The number of updated replicas per rollout
argo_rollouts.rollout.phase
(gauge)
Information on the state of the rollout. This metric will soon be deprecated by Argo Rollouts; use argo_rollouts.rollout.info instead
argo_rollouts.rollout.reconcile.bucket
(count)
The number of observations in the Rollout reconciliation performance histogram by upper_bound buckets
argo_rollouts.rollout.reconcile.count
(count)
The number of observations in the Rollout reconciliation performance histogram
argo_rollouts.rollout.reconcile.error.count
(count)
Error occurring during the rollout
argo_rollouts.rollout.reconcile.sum
(count)
The duration sum of all observations in the Rollout reconciliation performance histogram
argo_rollouts.workqueue.adds.count
(count)
The total number of adds handled by workqueue
argo_rollouts.workqueue.depth
(gauge)
The current depth of the workqueue
argo_rollouts.workqueue.longest.running_processor.seconds
(gauge)
The number of seconds the longest running workqueue processor has been running
Shown as second
argo_rollouts.workqueue.queue.duration.seconds.bucket
(count)
The histogram bucket of how long in seconds an item stays in the workqueue before being requested
Shown as second
argo_rollouts.workqueue.queue.duration.seconds.count
(count)
The total number of events in the workqueue duration histogram
argo_rollouts.workqueue.queue.duration.seconds.sum
(count)
The sum of the events counted in the workqueue duration histogram
argo_rollouts.workqueue.retries.count
(count)
The total number of retries handled by workqueue
argo_rollouts.workqueue.unfinished_work.seconds
(gauge)
The number of seconds of work that has been done that is in progress and hasn't been observed by work_duration. Large values indicate stuck threads. One can deduce the number of stuck threads by observing the rate at which this increases
Shown as second
argo_rollouts.workqueue.work.duration.seconds.bucket
(count)
The histogram bucket for time in seconds it takes for processing of an item in the workqueue
Shown as second
argo_rollouts.workqueue.work.duration.seconds.count
(count)
The total number of events in the workqueue item processing duration histogram
argo_rollouts.workqueue.work.duration.seconds.sum
(count)
The sum of events in the workqueue item processing duration histogram

Events

The Argo Rollouts integration does not include any events.

Service checks

argo_rollouts.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the Argo Rollouts OpenMetrics endpoint, otherwise returns OK.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.

Further reading

Additional helpful documentation, links, and articles: