Supported OS: Linux, Windows, macOS

Integration version: 4.0.0

Overview

The Datadog Agent collects many metrics from Consul nodes, including those for:

  • Total Consul peers
  • Service health: for a given service, how many of its nodes are up, passing, warning, or critical?
  • Node health: for a given node, how many of its services are up, passing, warning, or critical?
  • Network coordinates: latencies between and within datacenters
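
To make the service-health and node-health counts above more concrete, here is a minimal Python sketch that tallies check statuses from entries shaped like those returned by Consul's /v1/health/state/any endpoint. The field names Node, ServiceName, and Status follow Consul's HTTP API; the function name and sample data are hypothetical, and this is not the Agent's actual implementation. It counts checks rather than nodes, which coincides with the node count only when each node runs one check per service.

```python
from collections import Counter, defaultdict

def count_statuses(checks, key):
    """Tally check statuses grouped by the given field ("ServiceName" or "Node")."""
    counts = defaultdict(Counter)
    for check in checks:
        group = check.get(key)
        if not group:
            # Node-level checks carry an empty ServiceName; skip them
            # when grouping by service.
            continue
        counts[group][check["Status"]] += 1
    return {group: dict(counter) for group, counter in counts.items()}

# Hypothetical sample data, not real Consul output.
sample_checks = [
    {"Node": "node-a", "ServiceName": "web", "Status": "passing"},
    {"Node": "node-b", "ServiceName": "web", "Status": "critical"},
    {"Node": "node-a", "ServiceName": "db", "Status": "warning"},
    {"Node": "node-b", "ServiceName": "", "Status": "passing"},
]

by_service = count_statuses(sample_checks, "ServiceName")
by_node = count_statuses(sample_checks, "Node")
```

Grouping by ServiceName answers the "for a given service" question, and grouping by Node answers the "for a given node" question.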

The Consul Agent can provide additional metrics through DogStatsD. These metrics relate to the internal health of Consul itself rather than to the services that depend on Consul. There are metrics for:

  • Serf events and member flaps
  • The Raft protocol
  • DNS performance

And many more.

Finally, in addition to metrics, the Datadog Agent also sends a service check for each of Consul's health checks, and an event after each new leader election.

Setup

Installation

The Consul check is included in the Datadog Agent package, so you do not need to install anything else on your Consul nodes.

Configuration

Host

To configure this check for an Agent running on a host:

Metric collection
  1. Edit the consul.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your Consul metrics. See the sample consul.d/conf.yaml for all available configuration options.

    init_config:
    
    instances:
      ## @param url - string - required
      ## Where your Consul HTTP server lives,
      ## point the URL at the leader to get metrics about your Consul cluster.
      ## Use HTTPS instead of HTTP if your Consul setup is configured to do so.
      #
      - url: http://localhost:8500
    
  2. Restart the Agent.

OpenMetrics

Optionally, you can enable the use_prometheus_endpoint configuration option to collect an additional set of metrics from Consul's Prometheus endpoint.

Note: Use either the DogStatsD method or the Prometheus method; do not enable both for the same instance.

  1. Configure Consul to expose metrics on the Prometheus endpoint. Set prometheus_retention_time nested under the top-level telemetry key of the main Consul configuration file:

    {
      ...
      "telemetry": {
        "prometheus_retention_time": "360h"
      },
      ...
    }
    
  2. Edit the consul.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start using the Prometheus endpoint.

    instances:
        - url: <EXAMPLE>
          use_prometheus_endpoint: true
    
  3. Restart the Agent.
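
As a rough illustration of what the Prometheus endpoint returns, the following Python sketch parses a few lines of Prometheus text exposition format into (name, labels, value) tuples. Consul serves this format at /v1/agent/metrics?format=prometheus once prometheus_retention_time is set. The sample text and its values are hypothetical, the parser is deliberately simplified (it ignores optional timestamps and escaped label values), and it is not how the Agent itself scrapes the endpoint.

```python
import re

# One sample line: metric name, optional {labels}, then a numeric value.
SAMPLE_RE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')

def parse_prometheus_text(text):
    """Return a list of (name, labels, value) tuples from exposition text."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            # Skip blank lines and "# HELP" / "# TYPE" comments.
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples

# Hypothetical sample; real output contains many more series.
sample_text = """\
# TYPE consul_raft_apply counter
consul_raft_apply 42
consul_raft_commitTime{quantile="0.5"} 1.25
consul_raft_commitTime_sum 300
consul_raft_commitTime_count 200
"""

samples = parse_prometheus_text(sample_text)
```

The _sum and _count series of a summary correspond to the .sum and .count metric variants reported by the integration, and the quantile label corresponds to the .quantile variants.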

DogStatsD

Instead of using the Prometheus endpoint, you can configure Consul to send the same set of additional metrics to the Agent through DogStatsD.

  1. Configure Consul to send DogStatsD metrics by adding dogstatsd_addr nested under the top-level telemetry key in the main Consul configuration file:

    {
      ...
      "telemetry": {
        "dogstatsd_addr": "127.0.0.1:8125"
      },
      ...
    }
    
  2. Update the Datadog Agent main configuration file datadog.yaml by adding the following settings to ensure metrics are tagged correctly:

    # dogstatsd_mapper_cache_size: 1000  # default to 1000
    dogstatsd_mapper_profiles:
      - name: consul
        prefix: "consul."
        mappings:
          - match: 'consul\.http\.([a-zA-Z]+)\.(.*)'
            match_type: "regex"
            name: "consul.http.request"
            tags:
              method: "$1"
              path: "$2"
          - match: 'consul\.raft\.replication\.appendEntries\.logs\.([0-9a-f-]+)'
            match_type: "regex"
            name: "consul.raft.replication.appendEntries.logs"
            tags:
              peer_id: "$1"
          - match: 'consul\.raft\.replication\.appendEntries\.rpc\.([0-9a-f-]+)'
            match_type: "regex"
            name: "consul.raft.replication.appendEntries.rpc"
            tags:
              peer_id: "$1"
          - match: 'consul\.raft\.replication\.heartbeat\.([0-9a-f-]+)'
            match_type: "regex"
            name: "consul.raft.replication.heartbeat"
            tags:
              peer_id: "$1"
    
  3. Restart the Agent.
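
To show what the mapper profile does to an incoming metric, here is a small Python sketch that applies the same four regex mappings and returns the rewritten metric name with its tags. It assumes each pattern must match the whole metric name and that "$1", "$2" refer to capture groups; the function name is hypothetical and this only approximates the Agent's mapper rather than reproducing it.

```python
import re

# Patterns, names, and tags copied from the dogstatsd_mapper_profiles settings.
MAPPINGS = [
    {
        "match": r"consul\.http\.([a-zA-Z]+)\.(.*)",
        "name": "consul.http.request",
        "tags": {"method": "$1", "path": "$2"},
    },
    {
        "match": r"consul\.raft\.replication\.appendEntries\.logs\.([0-9a-f-]+)",
        "name": "consul.raft.replication.appendEntries.logs",
        "tags": {"peer_id": "$1"},
    },
    {
        "match": r"consul\.raft\.replication\.appendEntries\.rpc\.([0-9a-f-]+)",
        "name": "consul.raft.replication.appendEntries.rpc",
        "tags": {"peer_id": "$1"},
    },
    {
        "match": r"consul\.raft\.replication\.heartbeat\.([0-9a-f-]+)",
        "name": "consul.raft.replication.heartbeat",
        "tags": {"peer_id": "$1"},
    },
]

def map_metric(metric_name):
    """Return (mapped_name, tags); unmatched names pass through unchanged."""
    for mapping in MAPPINGS:
        match = re.fullmatch(mapping["match"], metric_name)
        if match is None:
            continue
        tags = {}
        for tag_name, template in mapping["tags"].items():
            # "$1" is the first capture group, "$2" the second.
            tags[tag_name] = match.group(int(template.lstrip("$")))
        return mapping["name"], tags
    return metric_name, {}

mapped_name, mapped_tags = map_metric("consul.http.GET.v1.kv._")
```

Without the mapping, every HTTP path and every Raft peer ID would produce a distinct metric name; with it, they collapse into one metric each, distinguished by tags.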

Log collection

Available for Agent versions 6.0 and later

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file with:

    logs_enabled: true
    
  2. Edit this configuration block in your consul.d/conf.yaml file to collect Consul logs:

    logs:
      - type: file
        path: /var/log/consul_server.log
        source: consul
        service: myservice
    

    Change the path and service parameter values and configure them for your environment. See the sample consul.d/conf.yaml for all available configuration options.

  3. Restart the Agent.

Containerized

For containerized environments, see the Autodiscovery integration templates for guidance on applying the parameters below.

Metric collection

Parameter            Value
<INTEGRATION_NAME>   consul
<INIT_CONFIG>        blank or {}
<INSTANCE_CONFIG>    {"url": "https://%%host%%:8500"}

Log collection

Available for Agent versions 6.0 and later

Log collection is disabled by default in the Datadog Agent. To enable it, see Kubernetes log collection.

Parameter      Value
<LOG_CONFIG>   {"source": "consul", "service": "<SERVICE_NAME>"}
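
As one possible way to apply these parameters, the Python sketch below builds Kubernetes pod annotations in the ad.datadoghq.com/<container>.* form used by Autodiscovery. The container name and service name passed in are hypothetical placeholders; the %%host%% template variable is left as-is for the Agent to resolve. Treat this as an illustration of how the table values map to annotation strings, not as the only supported approach.

```python
import json

def consul_autodiscovery_annotations(container_name, service_name):
    """Build Autodiscovery pod annotations from the parameter tables."""
    prefix = f"ad.datadoghq.com/{container_name}"
    return {
        # <INTEGRATION_NAME>
        f"{prefix}.check_names": json.dumps(["consul"]),
        # <INIT_CONFIG>: blank or {}
        f"{prefix}.init_configs": json.dumps([{}]),
        # <INSTANCE_CONFIG>
        f"{prefix}.instances": json.dumps([{"url": "https://%%host%%:8500"}]),
        # <LOG_CONFIG>
        f"{prefix}.logs": json.dumps([{"source": "consul", "service": service_name}]),
    }

# "consul" and "consul-server" are hypothetical container and service names.
annotations = consul_autodiscovery_annotations("consul", "consul-server")
```

Each annotation value is a JSON-encoded list, so several checks or log sources can be declared for the same container.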

Validation

Run the Agent's status subcommand and look for consul under the Checks section.

Note: If your Consul nodes have debug logging enabled, the Datadog Agent's regular polling shows up in the Consul log:

2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/status/leader (59.344us) from=127.0.0.1:53768
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/status/peers (62.678us) from=127.0.0.1:53770
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/health/state/any (106.725us) from=127.0.0.1:53772
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/catalog/services (79.657us) from=127.0.0.1:53774
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/health/service/consul (153.917us) from=127.0.0.1:53776
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/coordinate/datacenters (71.778us) from=127.0.0.1:53778
2017/03/27 21:38:12 [DEBUG] http: Request GET /v1/coordinate/nodes (84.95us) from=127.0.0.1:53780
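
If you want to confirm from such a log which endpoints are being polled, a short Python sketch like the following can extract the method, path, duration, and client address from each debug line. The regular expression is an assumption derived only from the sample lines above and may need adjusting for other Consul versions or log formats.

```python
import re

# Pattern inferred from the sample debug lines; other formats may differ.
REQUEST_RE = re.compile(
    r"\[DEBUG\] http: Request (?P<method>[A-Z]+) (?P<path>\S+) "
    r"\((?P<duration>[^)]+)\) from=(?P<client>\S+)"
)

def parse_request_line(line):
    """Return a dict describing one HTTP debug line, or None if it does not match."""
    match = REQUEST_RE.search(line)
    if match is None:
        return None
    return match.groupdict()

line = ("2017/03/27 21:38:12 [DEBUG] http: Request GET "
        "/v1/status/leader (59.344us) from=127.0.0.1:53768")
parsed = parse_request_line(line)
```

Collecting the path field across lines gives the set of endpoints the Agent polls on each check run.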

Consul Agent to DogStatsD

Use netstat to verify that Consul is also sending its metrics:

$ sudo netstat -nup | grep "127.0.0.1:8125.*ESTABLISHED"
udp        0      0 127.0.0.1:53874         127.0.0.1:8125          ESTABLISHED 23176/consul
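
The same check can be scripted. The Python sketch below scans netstat -nup output for established UDP connections to the DogStatsD address and returns the owning process IDs and names. It assumes the seven-column layout shown in the sample output above; the function name is hypothetical.

```python
def dogstatsd_senders(netstat_output, target="127.0.0.1:8125"):
    """Return (pid, program) pairs with an established UDP connection to target."""
    senders = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: proto, recv-q, send-q, local address,
        # foreign address, state, pid/program.
        if len(fields) < 7 or fields[0] != "udp":
            continue
        if fields[4] != target or fields[5] != "ESTABLISHED":
            continue
        pid, _, program = fields[6].partition("/")
        if not pid.isdigit():
            # netstat prints "-" when it cannot see the owning process.
            continue
        senders.append((int(pid), program))
    return senders

sample_output = (
    "udp        0      0 127.0.0.1:53874         "
    "127.0.0.1:8125          ESTABLISHED 23176/consul"
)
senders = dogstatsd_senders(sample_output)
```

An empty result suggests that Consul is not sending to that address, or that netstat was run without sufficient privileges to list process names.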

Data Collected

Metrics

consul.catalog.nodes_critical
(gauge)
[Integration] The number of nodes with service status critical from those registered
Shown as node
consul.catalog.nodes_passing
(gauge)
[Integration] The number of nodes with service status passing from those registered
Shown as node
consul.catalog.nodes_up
(gauge)
[Integration] The number of nodes
Shown as node
consul.catalog.nodes_warning
(gauge)
[Integration] The number of nodes with service status warning from those registered
Shown as node
consul.catalog.services_count
(gauge)
[Integration] Metrics to count the number of services matching criteria like the service tag, node name, or status. To be queried using the sum by aggregator.
Shown as service
consul.catalog.services_critical
(gauge)
[Integration] Total critical services on nodes
Shown as service
consul.catalog.services_passing
(gauge)
[Integration] Total passing services on nodes
Shown as service
consul.catalog.services_up
(gauge)
[Integration] Total services registered on nodes
Shown as service
consul.catalog.services_warning
(gauge)
[Integration] Total warning services on nodes
Shown as service
consul.catalog.total_nodes
(gauge)
[Integration] The number of nodes registered in the consul cluster
Shown as node
consul.client.rpc
(count)
[DogStatsD] [Prometheus] This increments whenever a Consul agent in client mode makes an RPC request to a Consul server. This gives a measure of how much a given agent is loading the Consul servers. This is only generated by agents in client mode, not Consul servers.
Shown as request
consul.client.rpc.failed
(count)
[DogStatsD] [Prometheus] Increments whenever a Consul agent in client mode makes an RPC request to a Consul server and fails
Shown as request
consul.http.request
(gauge)
[DogStatsD] Tracks how long it takes to service the given HTTP request for the given verb and path. Using a DogStatsD mapper as described in the README, the paths are mapped to tags and do not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: http_method:GET, path:v1.kv._
Shown as millisecond
consul.http.request.count
(count)
[Prometheus] A count of how long it takes to service the given HTTP request for the given verb and path. It includes labels for path and method. Path does not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: path=v1.kv._
Shown as millisecond
consul.http.request.quantile
(gauge)
[Prometheus] A quantile of how long it takes to service the given HTTP request for the given verb and path. Includes labels for path and method. Path does not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: path=v1.kv._
Shown as millisecond
consul.http.request.sum
(count)
[Prometheus] The sum of how long it takes to service the given HTTP request for the given verb and path. Includes labels for path and method. Path does not include details like service or key names. For these paths, an underscore is present as a placeholder, for example: path=v1.kv._
Shown as millisecond
consul.memberlist.degraded.probe
(gauge)
[DogStatsD] [Prometheus] This metric counts the number of times the Consul agent has performed failure detection on another agent at a slower probe rate. The agent uses its own health metric as an indicator to perform this action. If its health score is low, it means that the node is healthy, and vice versa.
consul.memberlist.gossip.95percentile
(gauge)
[DogStatsD] The p95 for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.avg
(gauge)
[DogStatsD] The avg for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.memberlist.gossip
consul.memberlist.gossip.max
(gauge)
[DogStatsD] The max for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.median
(gauge)
[DogStatsD] The median for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.quantile
(gauge)
[Prometheus] The quantile for the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.gossip.sum
(count)
[DogStatsD] [Prometheus] The sum of the number of gossips (messages) broadcasted to a set of randomly selected nodes.
Shown as message
consul.memberlist.health.score
(gauge)
[DogStatsD] [Prometheus] This metric describes a node's perception of its own health based on how well it is meeting the soft real-time requirements of the protocol. This metric ranges from 0 to 8, where 0 indicates "totally healthy". For more details see section IV of the Lifeguard paper: https://arxiv.org/pdf/1707.00788.pdf
consul.memberlist.msg.alive
(count)
[DogStatsD] [Prometheus] This metric counts the number of alive Consul agents that the agent has mapped out so far, based on the message information given by the network layer.
consul.memberlist.msg.dead
(count)
[DogStatsD] [Prometheus] This metric counts the number of times a Consul agent has marked another agent to be a dead node.
Shown as message
consul.memberlist.msg.suspect
(count)
[DogStatsD] [Prometheus] The number of times a Consul agent suspects another as failed while probing during gossip protocol
consul.memberlist.probenode.95percentile
(gauge)
[DogStatsD] The p95 for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.avg
(gauge)
[DogStatsD] The avg for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.memberlist.probenode
consul.memberlist.probenode.max
(gauge)
[DogStatsD] The max for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.median
(gauge)
[DogStatsD] The median for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.quantile
(gauge)
[Prometheus] The quantile for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.probenode.sum
(count)
[DogStatsD] [Prometheus] The sum for the time taken to perform a single round of failure detection on a select Consul agent.
Shown as node
consul.memberlist.pushpullnode.95percentile
(gauge)
[DogStatsD] The p95 for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.avg
(gauge)
[DogStatsD] The avg for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.memberlist.pushpullnode
consul.memberlist.pushpullnode.max
(gauge)
[DogStatsD] The max for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.median
(gauge)
[DogStatsD] The median for the number of Consul agents that have exchanged state with this agent.
Shown as node
consul.memberlist.pushpullnode.quantile
(gauge)
[Prometheus] The quantile for the number of Consul agents that have exchanged state with this agent.
consul.memberlist.pushpullnode.sum
(count)
[DogStatsD] [Prometheus] The sum for the number of Consul agents that have exchanged state with this agent.
consul.memberlist.tcp.accept
(count)
[DogStatsD] [Prometheus] This metric counts the number of times a Consul agent has accepted an incoming TCP stream connection.
Shown as connection
consul.memberlist.tcp.connect
(count)
[DogStatsD] [Prometheus] This metric counts the number of times a Consul agent has initiated a push/pull sync with another agent.
Shown as connection
consul.memberlist.tcp.sent
(count)
[DogStatsD] [Prometheus] This metric measures the total number of bytes sent by a Consul agent through the TCP protocol
Shown as byte
consul.memberlist.udp.received
(count)
[DogStatsD] [Prometheus] This metric measures the total number of bytes received by a Consul agent through the UDP protocol.
Shown as byte
consul.memberlist.udp.sent
(count)
[DogStatsD] [Prometheus] This metric measures the total number of bytes sent by a Consul agent through the UDP protocol.
Shown as byte
consul.net.node.latency.max
(gauge)
[Integration] Maximum latency from this node to all others
Shown as millisecond
consul.net.node.latency.median
(gauge)
[Integration] Median latency from this node to all others
Shown as millisecond
consul.net.node.latency.min
(gauge)
[Integration] Minimum latency from this node to all others
Shown as millisecond
consul.net.node.latency.p25
(gauge)
[Integration] P25 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p75
(gauge)
[Integration] P75 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p90
(gauge)
[Integration] P90 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p95
(gauge)
[Integration] P95 latency from this node to all others
Shown as millisecond
consul.net.node.latency.p99
(gauge)
[Integration] P99 latency from this node to all others
Shown as millisecond
consul.peers
(gauge)
[Integration] The number of peers in the peer set
consul.raft.apply
(count)
[DogStatsD] [Prometheus] The number of raft transactions occurring
Shown as transaction
consul.raft.commitTime.95percentile
(gauge)
[DogStatsD] The p95 time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.avg
(gauge)
[DogStatsD] The average time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.count
(count)
[DogStatsD] [Prometheus] The number of samples of raft.commitTime
consul.raft.commitTime.max
(gauge)
[DogStatsD] The max time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.median
(gauge)
[DogStatsD] The median time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.quantile
(gauge)
[Prometheus] The quantile time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.commitTime.sum
(count)
[DogStatsD] [Prometheus] The sum of the time it takes to commit a new entry to the raft log on the leader
Shown as millisecond
consul.raft.leader.dispatchLog.95percentile
(gauge)
[DogStatsD] The p95 time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.avg
(gauge)
[DogStatsD] The average time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.count
(count)
[DogStatsD] [Prometheus] The number of samples of raft.leader.dispatchLog
consul.raft.leader.dispatchLog.max
(gauge)
[DogStatsD] The max time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.median
(gauge)
[DogStatsD] The median time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.quantile
(gauge)
[Prometheus] The quantile time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.dispatchLog.sum
(count)
[DogStatsD] [Prometheus] The sum of the time it takes for the leader to write log entries to disk
Shown as millisecond
consul.raft.leader.lastContact.95percentile
(gauge)
[DogStatsD] The p95 time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.avg
(gauge)
[DogStatsD] The average time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.count
(count)
[DogStatsD] [Prometheus] The number of samples of raft.leader.lastContact
consul.raft.leader.lastContact.max
(gauge)
[DogStatsD] The max time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.median
(gauge)
[DogStatsD] The median time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.quantile
(gauge)
[Prometheus] The quantile time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.leader.lastContact.sum
(count)
[DogStatsD] [Prometheus] The sum of the time elapsed since the leader was last able to check its lease with followers
Shown as millisecond
consul.raft.replication.appendEntries.logs
(count)
[DogStatsD] [Prometheus] Measures the number of logs replicated to an agent, to bring it up to speed with the leader's logs.
Shown as entry
consul.raft.replication.appendEntries.rpc.count
(count)
[DogStatsD] [Prometheus] The count of the time taken by the append entries RPC to replicate the log entries of a leader agent onto its follower agent(s)
Shown as millisecond
consul.raft.replication.appendEntries.rpc.quantile
(gauge)
[Prometheus] The quantile of the time taken by the append entries RPC to replicate the log entries of a leader agent onto its follower agent(s)
Shown as millisecond
consul.raft.replication.appendEntries.rpc.sum
(count)
[DogStatsD] [Prometheus] The sum of the time taken by the append entries RPC to replicate the log entries of a leader agent onto its follower agent(s)
Shown as millisecond
consul.raft.replication.heartbeat.count
(count)
[DogStatsD] [Prometheus] The count of the time taken to invoke appendEntries on a peer.
Shown as millisecond
consul.raft.replication.heartbeat.quantile
(gauge)
[Prometheus] The quantile of the time taken to invoke appendEntries on a peer.
Shown as millisecond
consul.raft.replication.heartbeat.sum
(count)
[DogStatsD] [Prometheus] The sum of the time taken to invoke appendEntries on a peer.
Shown as millisecond
consul.raft.state.candidate
(count)
[DogStatsD] [Prometheus] The number of initiated leader elections
Shown as event
consul.raft.state.leader
(count)
[DogStatsD] [Prometheus] The number of completed leader elections
Shown as event
consul.runtime.gc_pause_ns.95percentile
(gauge)
[DogStatsD] The p95 for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.avg
(gauge)
[DogStatsD] The avg for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.runtime.gc_pause_ns
consul.runtime.gc_pause_ns.max
(gauge)
[DogStatsD] The max for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.median
(gauge)
[DogStatsD] The median for the number of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.quantile
(gauge)
[Prometheus] The quantile of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.runtime.gc_pause_ns.sum
(count)
[DogStatsD] [Prometheus] The sum of nanoseconds consumed by stop-the-world garbage collection (GC) pauses since Consul started.
Shown as nanosecond
consul.serf.coordinate.adjustment_ms.95percentile
(gauge)
[DogStatsD] The p95 in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.avg
(gauge)
[DogStatsD] The avg in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.serf.coordinate.adjustment_ms
consul.serf.coordinate.adjustment_ms.max
(gauge)
[DogStatsD] The max in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.median
(gauge)
[DogStatsD] The median in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.quantile
(gauge)
[Prometheus] The quantile in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.coordinate.adjustment_ms.sum
(count)
[DogStatsD] [Prometheus] The sum in milliseconds for the node coordinate adjustment
Shown as millisecond
consul.serf.events
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent processes a serf event
Shown as event
consul.serf.member.failed
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent is marked dead. This can be an indicator of overloaded agents, network problems, or configuration errors where agents cannot connect to each other on the required ports.
consul.serf.member.flap
(count)
[DogStatsD] [Prometheus] The number of times a Consul agent is marked dead and then quickly recovers
consul.serf.member.join
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent processes a join event
Shown as event
consul.serf.member.left
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent leaves the cluster.
consul.serf.member.update
(count)
[DogStatsD] [Prometheus] This increments when a Consul agent updates.
consul.serf.msgs.received.95percentile
(gauge)
[DogStatsD] The p95 for the number of serf messages received
Shown as message
consul.serf.msgs.received.avg
(gauge)
[DogStatsD] The avg for the number of serf messages received
Shown as message
consul.serf.msgs.received.count
(count)
[DogStatsD] [Prometheus] The count of serf messages received
consul.serf.msgs.received.max
(gauge)
[DogStatsD] The max for the number of serf messages received
Shown as message
consul.serf.msgs.received.median
(gauge)
[DogStatsD] The median for the number of serf messages received
Shown as message
consul.serf.msgs.received.quantile
(gauge)
[Prometheus] The quantile for the number of serf messages received
Shown as message
consul.serf.msgs.received.sum
(count)
[DogStatsD] [Prometheus] The sum for the number of serf messages received
Shown as message
consul.serf.msgs.sent.95percentile
(gauge)
[DogStatsD] The p95 for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.avg
(gauge)
[DogStatsD] The avg for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.count
(count)
[DogStatsD] [Prometheus] The count of serf messages sent
consul.serf.msgs.sent.max
(gauge)
[DogStatsD] The max for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.median
(gauge)
[DogStatsD] The median for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.quantile
(gauge)
[Prometheus] The quantile for the number of serf messages sent
Shown as message
consul.serf.msgs.sent.sum
(count)
[DogStatsD] [Prometheus] The sum of the number of serf messages sent
Shown as message
consul.serf.queue.event.95percentile
(gauge)
[DogStatsD] The p95 for the size of the serf event queue
consul.serf.queue.event.avg
(gauge)
[DogStatsD] The avg size of the serf event queue
consul.serf.queue.event.count
(count)
[DogStatsD] [Prometheus] The number of items in the serf event queue
consul.serf.queue.event.max
(gauge)
[DogStatsD] The max size of the serf event queue
consul.serf.queue.event.median
(gauge)
[DogStatsD] The median size of the serf event queue
consul.serf.queue.event.quantile
(gauge)
[Prometheus] The quantile for the size of the serf event queue
consul.serf.queue.intent.95percentile
(gauge)
[DogStatsD] The p95 for the size of the serf intent queue
consul.serf.queue.intent.avg
(gauge)
[DogStatsD] The avg size of the serf intent queue
consul.serf.queue.intent.count
(count)
[DogStatsD] [Prometheus] The number of items in the serf intent queue
consul.serf.queue.intent.max
(gauge)
[DogStatsD] The max size of the serf intent queue
consul.serf.queue.intent.median
(gauge)
[DogStatsD] The median size of the serf intent queue
consul.serf.queue.intent.quantile
(gauge)
[Prometheus] The quantile for the size of the serf intent queue
consul.serf.queue.query.95percentile
(gauge)
[DogStatsD] The p95 for the size of the serf query queue
consul.serf.queue.query.avg
(gauge)
[DogStatsD] The avg size of the serf query queue
consul.serf.queue.query.count
(count)
[DogStatsD] [Prometheus] The number of items in the serf query queue
consul.serf.queue.query.max
(gauge)
[DogStatsD] The max size of the serf query queue
consul.serf.queue.query.median
(gauge)
[DogStatsD] The median size of the serf query queue
consul.serf.queue.query.quantile
(gauge)
[Prometheus] The quantile for the size of the serf query queue
consul.serf.snapshot.appendline.95percentile
(gauge)
[DogStatsD] The p95 of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.avg
(gauge)
[DogStatsD] The avg of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.serf.snapshot.appendline
consul.serf.snapshot.appendline.max
(gauge)
[DogStatsD] The max of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.median
(gauge)
[DogStatsD] The median of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.appendline.quantile
(gauge)
[Prometheus] The quantile of the time taken by the Consul agent to append an entry into the existing log.
Shown as millisecond
consul.serf.snapshot.compact.95percentile
(gauge)
[DogStatsD] The p95 of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.avg
(gauge)
[DogStatsD] The avg of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.count
(count)
[DogStatsD] [Prometheus] The number of samples of consul.serf.snapshot.compact
consul.serf.snapshot.compact.max
(gauge)
[DogStatsD] The max of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.median
(gauge)
[DogStatsD] The median of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
consul.serf.snapshot.compact.quantile
(gauge)
[Prometheus] The quantile of the time taken by the Consul agent to compact a log. This operation occurs only when the snapshot becomes large enough to justify the compaction.
Shown as millisecond
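
Many timing metrics above are reported as .avg only through DogStatsD, while the Prometheus method reports .sum and .count. Under the assumption that both values cover the same interval, a mean can be derived by dividing one by the other, as in this minimal Python sketch (the function name and the example numbers are hypothetical):

```python
def mean_from_summary(sum_delta, count_delta):
    """Mean of a timing metric over an interval, from its .sum and .count deltas.

    Returns None when no samples were recorded in the interval.
    """
    if count_delta <= 0:
        return None
    return sum_delta / count_delta

# Hypothetical values for consul.raft.commitTime.sum and .count over one interval.
mean_commit_ms = mean_from_summary(300.0, 200)
```

Here 300 ms of total commit time across 200 samples gives a mean of 1.5 ms per commit.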

See Consul's telemetry documentation for a description of the metrics the Consul Agent sends to DogStatsD.

See Consul's network coordinates documentation for details on how the network latency metrics are calculated.
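
For orientation, the following Python sketch ports the distance calculation described in Consul's network coordinates documentation: the estimated round-trip time between two nodes is the Euclidean distance between their coordinate vectors plus both heights, with the adjustment terms applied only if the result stays positive. The field names Vec, Height, and Adjustment follow Consul's coordinate JSON; the sample coordinates are hypothetical, and whether the Agent computes latencies in exactly this way is an assumption.

```python
import math

def estimated_rtt_seconds(a, b):
    """Estimate the round-trip time in seconds between two Consul coordinates."""
    sumsq = sum((x - y) ** 2 for x, y in zip(a["Vec"], b["Vec"]))
    rtt = math.sqrt(sumsq) + a["Height"] + b["Height"]
    # Apply the adjustment terms only if they keep the estimate positive.
    adjusted = rtt + a["Adjustment"] + b["Adjustment"]
    return adjusted if adjusted > 0.0 else rtt

# Hypothetical two-dimensional coordinates; real ones have more dimensions.
node_a = {"Vec": [0.0, 0.0], "Height": 0.001, "Adjustment": 0.0}
node_b = {"Vec": [0.003, 0.004], "Height": 0.001, "Adjustment": 0.0}

rtt_ms = estimated_rtt_seconds(node_a, node_b) * 1000.0
```

With these sample values the vector distance is 0.005 s and the two heights add 0.002 s, for an estimate of about 7 ms. Statistics such as the minimum, median, and maximum over all other nodes would then correspond to the consul.net.node.latency.* metrics.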

Events

consul.new_leader:
The Datadog Agent emits an event when the Consul cluster elects a new leader, and tags it with prev_consul_leader, curr_consul_leader, and consul_datacenter.
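
The logic behind this event can be pictured as comparing the leader address seen on two consecutive polls. The Python sketch below is a hypothetical illustration of that comparison; the payload shape and function name are assumptions, and only the three tag keys come from the description above.

```python
def new_leader_event(previous_leader, current_leader, datacenter):
    """Return an event payload if the leader changed between two polls, else None."""
    if previous_leader is None or not current_leader:
        # First poll, or no leader currently known: nothing to compare.
        return None
    if current_leader == previous_leader:
        return None
    return {
        "event_type": "consul.new_leader",
        "tags": [
            f"prev_consul_leader:{previous_leader}",
            f"curr_consul_leader:{current_leader}",
            f"consul_datacenter:{datacenter}",
        ],
    }

# Hypothetical leader addresses and datacenter name.
event = new_leader_event("10.0.0.1:8300", "10.0.0.2:8300", "dc1")
```

No event is produced on the first poll or while the leader stays the same, so one election yields one event.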

Service Checks

consul.check
Returns OK if the service is up, WARNING if there is an issue and CRITICAL when down.
Statuses: ok, warning, critical, unknown

consul.up
Returns OK if the consul server is up, CRITICAL otherwise.
Statuses: ok, critical

consul.can_connect
Returns OK if the Agent can make HTTP requests to consul, CRITICAL otherwise.
Statuses: ok, critical

consul.prometheus.health
Returns CRITICAL if the check cannot access the metrics endpoint, otherwise returns OK.
Statuses: ok, critical
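
As an illustration of how consul.check statuses relate to Consul's own health states, this Python sketch maps Consul check states to Datadog service check status codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The mapping table is an assumption based on the descriptions above, not a copy of the integration's code.

```python
# Datadog service check status codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Assumed mapping from Consul health check states.
CONSUL_TO_SERVICE_CHECK = {
    "passing": OK,
    "warning": WARNING,
    "critical": CRITICAL,
}

def service_check_status(consul_status):
    """Map a Consul check state to a service check status, defaulting to UNKNOWN."""
    return CONSUL_TO_SERVICE_CHECK.get(consul_status, UNKNOWN)

statuses = [service_check_status(s) for s in ("passing", "warning", "critical", "maintenance")]
```

Any state outside the table falls back to UNKNOWN, which matches the fourth status listed for consul.check.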

Troubleshooting

Need help? Contact Datadog support.

Further Reading

Additional helpful links, articles, and documentation: