Istio

Agent Check

Supported OS: Linux, Mac OS, Windows

Overview

Use the Datadog Agent to monitor how well Istio is performing.

  • Collect metrics on which apps are making which kinds of requests
  • Look at how applications are using bandwidth
  • Understand Istio’s resource consumption

Setup

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.

Installation

Istio is included in the Datadog Agent. Install the Datadog Agent on your Istio servers or in your cluster and point it at Istio.

Configuration

Edit the istio.d/conf.yaml file (in the conf.d/ folder at the root of your Agent’s configuration directory) to connect to Istio. See the sample istio.d/conf.yaml for all available configuration options.

Metric Collection

Add one of the configuration blocks below to your istio.d/conf.yaml file to start gathering Istio metrics for your version of Istio:

  1. To monitor the istiod deployment in Istio v1.5+, use the following configuration:

    init_config:
        
    instances:
      - istiod_endpoint: http://istiod.istio-system:8080/metrics

To monitor Istio mesh metrics, continue to use istio_mesh_endpoint. Istio mesh metrics are now only available from istio-proxy containers, which are supported out of the box through Autodiscovery; see istio.d/auto_conf.yaml.

  2. To monitor Istio versions v1.4 or earlier, use the following configuration:

    init_config:
    
    instances:
      - istio_mesh_endpoint: http://istio-telemetry.istio-system:42422/metrics
        mixer_endpoint: http://istio-telemetry.istio-system:15014/metrics
        galley_endpoint: http://istio-galley.istio-system:15014/metrics
        pilot_endpoint: http://istio-pilot.istio-system:15014/metrics
        citadel_endpoint: http://istio-citadel.istio-system:15014/metrics
        send_histograms_buckets: true

Each of the endpoints is optional, but at least one must be configured. See the Istio documentation to learn more about the Prometheus adapter.

Note: The connectionID Prometheus label is excluded.
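The "at least one endpoint" rule can be checked before deploying a configuration. The following is a minimal sketch, not the Agent's own validation logic: the function names are hypothetical, the option names are taken from the configuration blocks above, and applying the same rule to istiod_endpoint is an assumption.

```python
# Endpoint option names copied from the configuration blocks above.
ENDPOINT_KEYS = (
    "istiod_endpoint",
    "istio_mesh_endpoint",
    "mixer_endpoint",
    "galley_endpoint",
    "pilot_endpoint",
    "citadel_endpoint",
)


def configured_endpoints(instance):
    """Return the endpoint options that are set to a non-empty value."""
    return [key for key in ENDPOINT_KEYS if instance.get(key)]


def validate_instance(instance):
    """Hypothetical pre-flight check: raise if no endpoint is configured."""
    found = configured_endpoints(instance)
    if not found:
        raise ValueError("at least one Istio endpoint must be configured")
    return found
```

Here `instance` is one entry of the `instances` list, already parsed into a dictionary by whichever YAML loader you use.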

Disable sidecar injection

If you are installing the Datadog Agent in a container, Datadog recommends that you first disable Istio’s sidecar injection.

Add the sidecar.istio.io/inject: "false" annotation to the datadog-agent DaemonSet:

...
spec:
  ...
  template:
    metadata:
      annotations:
        sidecar.istio.io/inject: "false"
    ...

This can also be done with the kubectl patch command.

kubectl patch daemonset datadog-agent -p '{"spec":{"template":{"metadata":{"annotations":{"sidecar.istio.io/inject":"false"}}}}}'
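If you generate the patch in a script rather than typing it by hand, the JSON body can be built programmatically to avoid quoting mistakes. This is a small sketch; the function name `sidecar_injection_patch` is hypothetical, and the structure mirrors the patch shown above.

```python
import json


def sidecar_injection_patch(enabled=False):
    """Build the patch body that sets the sidecar injection annotation."""
    # Kubernetes annotation values must be strings, not booleans.
    value = "true" if enabled else "false"
    return {
        "spec": {
            "template": {
                "metadata": {
                    "annotations": {"sidecar.istio.io/inject": value}
                }
            }
        }
    }


# Compact separators reproduce the single-line form used with kubectl.
patch = json.dumps(sidecar_injection_patch(), separators=(",", ":"))
print(patch)
```

The printed string can be passed as the argument to `-p` in the kubectl patch command.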

Log collection

Istio produces two types of logs: Envoy access logs, which are collected with the Envoy integration, and Istio logs.

Available for Agent versions >6.0

See the Autodiscovery Integration Templates for guidance on applying the parameters below. Collecting logs is disabled by default in the Datadog Agent. To enable it, see the Kubernetes log collection documentation.

Parameter       Value
<LOG_CONFIG>    {"source": "istio", "service": "<SERVICE_NAME>"}
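As an illustration of how the <LOG_CONFIG> value is typically applied in Kubernetes, the sketch below builds a pod annotation from it. The annotation key format `ad.datadoghq.com/<CONTAINER_NAME>.logs` and the list wrapping come from Datadog's Autodiscovery documentation rather than from this page, and the container and service names are placeholders.

```python
import json


def log_annotation(container_name, service_name):
    """Return the (key, value) pair for an Autodiscovery log annotation."""
    key = f"ad.datadoghq.com/{container_name}.logs"
    # Autodiscovery expects a JSON list of log configurations.
    value = json.dumps([{"source": "istio", "service": service_name}])
    return key, value


# "discovery" and "istiod" are placeholder names for illustration only.
key, value = log_annotation("discovery", "istiod")
print(f"{key}: '{value}'")
```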

Validation

Run the Agent’s info subcommand and look for istio under the Checks section.

Data Collected

Metrics

istio.mesh.request.count
(gauge)
The number of requests
Shown as request
istio.mesh.request.duration.count
(gauge)
Count of request durations
Shown as request
istio.mesh.request.duration.sum
(gauge)
Sum of request durations
Shown as millisecond
istio.mesh.request.size.count
(gauge)
Count of request sizes
Shown as request
istio.mesh.request.size.sum
(gauge)
Sum of request sizes
Shown as request
istio.mesh.response.size.count
(gauge)
Count of response sizes
Shown as response
istio.mesh.response.size.sum
(gauge)
Sum of response sizes
Shown as response
istio.mixer.adapter.dispatch_count
(gauge)
Total number of adapter dispatches handled by Mixer
Shown as operation
istio.mixer.adapter.dispatch_duration.count
(gauge)
Count of durations for adapter dispatches handled by Mixer
Shown as operation
istio.mixer.adapter.dispatch_duration.sum
(gauge)
Sum of durations for adapter dispatches handled by Mixer
Shown as operation
istio.mixer.adapter.old_dispatch_count
(gauge)
Total number of adapter dispatches handled by Mixer.
Shown as operation
istio.mixer.adapter.old_dispatch_duration.count
(gauge)
Count of times for adapter dispatches handled by Mixer.
Shown as operation
istio.mixer.adapter.old_dispatch_duration.sum
(gauge)
Sum of times for adapter dispatches handled by Mixer.
Shown as operation
istio.mixer.config.resolve_actions.count
(gauge)
Count of actions resolved by Mixer.
Shown as operation
istio.mixer.config.resolve_actions.sum
(gauge)
Sum of actions resolved by Mixer.
Shown as operation
istio.mixer.config.resolve_count
(gauge)
Number of config resolves handled by Mixer
Shown as operation
istio.mixer.config.resolve_duration.count
(gauge)
Seconds per config resolve
Shown as second
istio.mixer.config.resolve_duration.sum
(gauge)
Sum of times for config resolves handled by Mixer.
Shown as second
istio.mixer.config.resolve_rules.count
(gauge)
Number of rules resolved by Mixer
Shown as item
istio.mixer.config.resolve_rules.sum
(gauge)
Sum of rules resolved by Mixer.
Shown as item
istio.mixer.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
Shown as second
istio.mixer.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
Shown as second
istio.mixer.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
Shown as second
istio.mixer.go.goroutines
(gauge)
Number of goroutines that currently exist.
Shown as thread
istio.mixer.go.info
(gauge)
Information about the Go environment.
istio.mixer.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
Shown as byte
istio.mixer.go.memstats.alloc_bytes_total
(gauge)
Total number of bytes allocated even if freed.
Shown as byte
istio.mixer.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
Shown as byte
istio.mixer.go.memstats.frees_total
(gauge)
Total number of frees.
istio.mixer.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
Shown as percent
istio.mixer.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
Shown as byte
istio.mixer.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
Shown as byte
istio.mixer.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
Shown as byte
istio.mixer.go.memstats.heap_inuse_bytes
(gauge)
Number of heap bytes that are in use
Shown as byte
istio.mixer.go.memstats.heap_objects
(gauge)
Number of objects in the heap
Shown as object
istio.mixer.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last gc
Shown as byte
istio.mixer.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
Shown as byte
istio.mixer.go.memstats.last_gc_time_seconds
(gauge)
Time of the last garbage collection, in seconds since the Unix epoch
Shown as second
istio.mixer.go.memstats.lookups_total
(gauge)
Number of lookups
Shown as operation
istio.mixer.go.memstats.mallocs_total
(gauge)
Number of mallocs
Shown as operation
istio.mixer.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
Shown as byte
istio.mixer.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
Shown as byte
istio.mixer.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
Shown as byte
istio.mixer.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
Shown as byte
istio.mixer.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
Shown as byte
istio.mixer.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
Shown as byte
istio.mixer.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
Shown as byte
istio.mixer.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
Shown as byte
istio.mixer.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
Shown as byte
istio.mixer.go.threads
(gauge)
Number of OS threads created.
Shown as thread
istio.mixer.grpc.server.handled_total
(gauge)
Total number of fully handled requests, with responses
Shown as request
istio.mixer.grpc.server.handling_seconds.count
(gauge)
Count of response latency (seconds) of gRPC that had been application-level handled by the server.
Shown as second
istio.mixer.grpc.server.handling_seconds.sum
(gauge)
Sum of response latency (seconds) of gRPC that had been application-level handled by the server.
Shown as second
istio.mixer.grpc.server.msg_received_total
(gauge)
Total number of RPC stream messages received on the server.
Shown as message
istio.mixer.grpc.server.msg_sent_total
(gauge)
Total number of messages sent
Shown as message
istio.mixer.grpc.server.started_total
(gauge)
Total number of RPCs started on the server.
istio.mixer.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
Shown as second
istio.mixer.process.max_fds
(gauge)
Maximum number of open file descriptors.
Shown as file
istio.mixer.process.open_fds
(gauge)
Number of open file descriptors.
Shown as file
istio.mixer.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
Shown as byte
istio.mixer.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
Shown as second
istio.mixer.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
Shown as byte
istio.mixer.grpc_io_server.completed_rpcs
(gauge)
Count of RPCs by method and status.
istio.mixer.grpc_io_server.received_bytes_per_rpc
(gauge)
Distribution of received bytes per RPC, by method.
Shown as byte
istio.mixer.grpc_io_server.sent_bytes_per_rpc
(gauge)
Distribution of total sent bytes per RPC, by method.
Shown as byte
istio.mixer.grpc_io_server.server_latency
(gauge)
Distribution of server latency in milliseconds, by method.
istio.mixer.config.attributes_total
(gauge)
The number of known attributes in the current config.
istio.mixer.config.handler_configs_total
(gauge)
The number of known handlers in the current config.
istio.mixer.config.instance_configs_total
(gauge)
The number of known instances in the current config.
istio.mixer.config.rule_configs_total
(gauge)
The number of known rules in the current config.
istio.mixer.dispatcher.destinations_per_request
(gauge)
Number of handlers dispatched per request by Mixer.
istio.mixer.dispatcher.instances_per_request
(gauge)
Number of instances created per request by Mixer.
istio.mixer.handler.daemons_total
(gauge)
The current number of active daemon routines in a given adapter environment.
istio.mixer.handler.new_handlers_total
(gauge)
The number of handlers that were newly created during config transition.
istio.mixer.mcp_sink.reconnections
(gauge)
The number of times the sink has reconnected.
istio.mixer.mcp_sink.request_acks_total
(gauge)
The number of request acks received by the source.
istio.mixer.runtime.dispatches_total
(gauge)
Total number of adapter dispatches handled by Mixer.
Shown as operation
istio.mixer.runtime.dispatch_duration_seconds
(gauge)
Duration in seconds for adapter dispatches handled by Mixer.
Shown as second
istio.pilot.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
Shown as second
istio.pilot.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
Shown as second
istio.pilot.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
Shown as second
istio.pilot.go.goroutines
(gauge)
Number of goroutines that currently exist.
Shown as thread
istio.pilot.go.info
(gauge)
Information about the Go environment.
istio.pilot.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
Shown as byte
istio.pilot.go.memstats.alloc_bytes_total
(gauge)
Total number of bytes allocated even if freed.
Shown as byte
istio.pilot.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
Shown as byte
istio.pilot.go.memstats.frees_total
(gauge)
Total number of frees.
istio.pilot.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
Shown as percent
istio.pilot.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
Shown as byte
istio.pilot.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
Shown as byte
istio.pilot.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
Shown as byte
istio.pilot.go.memstats.heap_inuse_bytes
(gauge)
Number of heap bytes that are in use
Shown as byte
istio.pilot.go.memstats.heap_objects
(gauge)
Number of objects in the heap
Shown as object
istio.pilot.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last gc
Shown as byte
istio.pilot.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
Shown as byte
istio.pilot.go.memstats.last_gc_time_seconds
(gauge)
Time of the last garbage collection, in seconds since the Unix epoch
Shown as second
istio.pilot.go.memstats.lookups_total
(gauge)
Number of lookups
Shown as operation
istio.pilot.go.memstats.mallocs_total
(gauge)
Number of mallocs
Shown as operation
istio.pilot.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
Shown as byte
istio.pilot.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
Shown as byte
istio.pilot.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
Shown as byte
istio.pilot.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
Shown as byte
istio.pilot.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
Shown as byte
istio.pilot.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
Shown as byte
istio.pilot.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
Shown as byte
istio.pilot.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
Shown as byte
istio.pilot.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
Shown as byte
istio.pilot.go.threads
(gauge)
Number of OS threads created.
Shown as thread
istio.pilot.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
Shown as second
istio.pilot.process.max_fds
(gauge)
Maximum number of open file descriptors.
Shown as file
istio.pilot.process.open_fds
(gauge)
Number of open file descriptors.
Shown as file
istio.pilot.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
Shown as byte
istio.pilot.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
Shown as second
istio.pilot.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
Shown as byte
istio.pilot.conflict.inbound_listener
(gauge)
Number of conflicting inbound listeners.
istio.pilot.conflict.outbound_listener.http_over_current_tcp
(gauge)
Number of conflicting wildcard http listeners with current wildcard tcp listener.
istio.pilot.conflict.outbound_listener.tcp_over_current_http
(gauge)
Number of conflicting wildcard tcp listeners with current wildcard http listener.
istio.pilot.conflict.outbound_listener.tcp_over_current_tcp
(gauge)
Number of conflicting tcp listeners with current tcp listener.
istio.pilot.destrule_subsets
(gauge)
Duplicate subsets across destination rules for same host.
istio.pilot.duplicate_envoy_clusters
(gauge)
Duplicate envoy clusters caused by service entries with same hostname.
istio.pilot.eds_no_instances
(gauge)
Number of clusters without instances.
istio.pilot.endpoint_not_ready
(gauge)
Endpoint found in unready state.
istio.pilot.invalid_out_listeners
(gauge)
Number of invalid outbound listeners.
istio.pilot.mcp_sink.reconnections
(count)
The number of times the sink has reconnected.
istio.pilot.mcp_sink.recv_failures_total
(count)
The number of recv failures in the source.
istio.pilot.mcp_sink.request_acks_total
(count)
The number of request acks received by the source.
istio.pilot.no_ip
(gauge)
Pods not found in the endpoint table, possibly invalid.
istio.pilot.proxy_convergence_time
(gauge)
Delay between config change and all proxies converging.
Shown as second
istio.pilot.rds_expired_nonce
(count)
Total number of RDS messages with an expired nonce.
istio.pilot.services
(gauge)
Total services known to pilot.
istio.pilot.total_xds_internal_errors
(count)
Total number of internal XDS errors in pilot.
istio.pilot.total_xds_rejects
(count)
Total number of XDS responses from pilot rejected by proxy.
istio.pilot.virt_services
(gauge)
Total virtual services known to pilot.
istio.pilot.vservice_dup_domain
(gauge)
Virtual services with dup domains.
istio.pilot.xds
(gauge)
Number of endpoints connected to this pilot using XDS.
istio.pilot.xds.eds_instances
(gauge)
Instances for each cluster, as of last push.
istio.pilot.xds.push.context_errors
(count)
Number of errors (timeouts) initiating push context.
istio.pilot.xds.push.timeout
(count)
Pilot push timeout, will retry.
istio.pilot.xds.push.timeout_failures
(count)
Pilot push timeout failures after repeated attempts.
istio.pilot.xds.pushes
(count)
Pilot build and send errors for lds, rds, cds and eds.
istio.pilot.xds.write_timeout
(count)
Pilot XDS response write timeouts.
istio.galley.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
Shown as second
istio.galley.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
Shown as second
istio.galley.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
Shown as second
istio.galley.go.goroutines
(gauge)
Number of goroutines that currently exist.
Shown as thread
istio.galley.go.info
(gauge)
Information about the Go environment.
istio.galley.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
Shown as byte
istio.galley.go.memstats.alloc_bytes_total
(gauge)
Total number of bytes allocated even if freed.
Shown as byte
istio.galley.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
Shown as byte
istio.galley.go.memstats.frees_total
(gauge)
Total number of frees.
istio.galley.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
Shown as percent
istio.galley.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
Shown as byte
istio.galley.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
Shown as byte
istio.galley.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
Shown as byte
istio.galley.go.memstats.heap_inuse_bytes
(gauge)
Number of heap bytes that are in use
Shown as byte
istio.galley.go.memstats.heap_objects
(gauge)
Number of objects in the heap
Shown as object
istio.galley.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last gc
Shown as byte
istio.galley.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
Shown as byte
istio.galley.go.memstats.last_gc_time_seconds
(gauge)
Time of the last garbage collection, in seconds since the Unix epoch
Shown as second
istio.galley.go.memstats.lookups_total
(gauge)
Number of lookups
Shown as operation
istio.galley.go.memstats.mallocs_total
(gauge)
Number of mallocs
Shown as operation
istio.galley.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
Shown as byte
istio.galley.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
Shown as byte
istio.galley.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
Shown as byte
istio.galley.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
Shown as byte
istio.galley.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
Shown as byte
istio.galley.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
Shown as byte
istio.galley.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
Shown as byte
istio.galley.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
Shown as byte
istio.galley.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
Shown as byte
istio.galley.go.threads
(gauge)
Number of OS threads created.
Shown as thread
istio.galley.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
Shown as second
istio.galley.process.max_fds
(gauge)
Maximum number of open file descriptors.
Shown as file
istio.galley.process.open_fds
(gauge)
Number of open file descriptors.
Shown as file
istio.galley.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
Shown as byte
istio.galley.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
Shown as second
istio.galley.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
Shown as byte
istio.galley.endpoint_no_pod
(gauge)
Endpoints without an associated pod.
istio.galley.mcp_source.clients_total
(gauge)
The number of streams currently connected.
istio.galley.runtime_processor.event_span_duration_milliseconds
(gauge)
The duration between each incoming event.
Shown as millisecond
istio.galley.runtime_processor.events_processed_total
(gauge)
The number of events that have been processed.
istio.galley.runtime_processor.snapshot_events_total
(gauge)
The number of events per snapshot.
istio.galley.runtime_processor.snapshot_lifetime_duration_milliseconds
(gauge)
The duration of each snapshot.
Shown as millisecond
istio.galley.runtime_processor.snapshots_published_total
(count)
The number of snapshots that have been published.
istio.galley.runtime_state_type_instances_total
(gauge)
The number of type instances per type URL.
istio.galley.runtime_strategy.on_change_total
(count)
The number of times the strategy's onChange has been called.
istio.galley.runtime_strategy.timer_max_time_reached_total
(count)
The number of times the max time has been reached.
istio.galley.runtime_strategy.quiesce_reached_total
(count)
The number of times a quiesce has been reached.
istio.galley.runtime_strategy.timer_resets_total
(count)
The number of times the timer has been reset.
istio.galley.source_kube.dynamic_converter_success_total
(count)
The number of times a dynamic kubernetes source successfully converted a resource.
istio.galley.source_kube.event_success_total
(count)
The number of times a kubernetes source successfully handled an event.
istio.galley.validation.cert_key_updates
(count)
Galley validation webhook certificate updates.
istio.galley.validation.config_load
(count)
k8s webhook configuration (re)loads.
istio.galley.validation.config_update
(count)
k8s webhook configuration updates.
istio.galley.validation.passed
(count)
Resource is valid.
istio.citadel.secret_controller.csr_err_count
(count)
The number of errors that occurred when creating the CSR.
istio.citadel.secret_controller.secret_deleted_cert_count
(count)
The number of certificates recreated due to secret deletion (service account still exists).
istio.citadel.secret_controller.svc_acc_created_cert_count
(count)
The number of certificates created due to service account creation.
istio.citadel.secret_controller.svc_acc_deleted_cert_count
(count)
The number of certificates deleted due to service account deletion.
istio.citadel.server.authentication_failure_count
(count)
The number of authentication failures.
Shown as error
istio.citadel.server.citadel_root_cert_expiry_timestamp
(gauge)
The Unix timestamp, in seconds, when the Citadel root cert will expire. Set to a negative value in case of an internal error.
Shown as second
istio.citadel.server.csr_count
(count)
The number of CSRs received by Citadel server.
istio.citadel.server.csr_parsing_err_count
(count)
The number of errors that occurred when parsing the CSR.
Shown as error
istio.citadel.server.id_extraction_err_count
(count)
The number of errors that occurred when extracting the ID from the CSR.
Shown as error
istio.citadel.server.success_cert_issuance_count
(count)
The number of certificate issuances that have succeeded.
istio.citadel.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
Shown as second
istio.citadel.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
Shown as second
istio.citadel.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
Shown as second
istio.citadel.go.goroutines
(gauge)
Number of goroutines that currently exist.
Shown as thread
istio.citadel.go.info
(gauge)
Information about the Go environment.
istio.citadel.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
Shown as byte
istio.citadel.go.memstats.alloc_bytes_total
(count)
Total number of bytes allocated even if freed.
Shown as byte
istio.citadel.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
Shown as byte
istio.citadel.go.memstats.frees_total
(count)
Total number of frees.
istio.citadel.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
Shown as percent
istio.citadel.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
Shown as byte
istio.citadel.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
Shown as byte
istio.citadel.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
Shown as byte
istio.citadel.go.memstats.heap_inuse_bytes
(gauge)
Number of heap bytes that are in use
Shown as byte
istio.citadel.go.memstats.heap_objects
(gauge)
Number of objects in the heap
Shown as object
istio.citadel.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last gc
Shown as byte
istio.citadel.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
Shown as byte
istio.citadel.go.memstats.last_gc_time_seconds
(gauge)
Time of the last garbage collection, in seconds since the Unix epoch
Shown as second
istio.citadel.go.memstats.lookups_total
(count)
Number of lookups
Shown as operation
istio.citadel.go.memstats.mallocs_total
(count)
Number of mallocs
Shown as operation
istio.citadel.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
Shown as byte
istio.citadel.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
Shown as byte
istio.citadel.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
Shown as byte
istio.citadel.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
Shown as byte
istio.citadel.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
Shown as byte
istio.citadel.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
Shown as byte
istio.citadel.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
Shown as byte
istio.citadel.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
Shown as byte
istio.citadel.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
Shown as byte
istio.citadel.go.threads
(gauge)
Number of OS threads created.
Shown as thread
istio.citadel.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
Shown as second
istio.citadel.process.max_fds
(gauge)
Maximum number of open file descriptors.
Shown as file
istio.citadel.process.open_fds
(gauge)
Number of open file descriptors.
Shown as file
istio.citadel.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
Shown as byte
istio.citadel.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
Shown as second
istio.citadel.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
Shown as byte
istio.galley.validation.config_update_error
(count)
K8s webhook configuration update error
Shown as error
istio.citadel.server.root_cert_expiry_timestamp
(gauge)
The unix timestamp (in seconds) when Citadel root cert will expire. Negative in case of internal error
Shown as second
istio.galley.validation.failed
(count)
Count of failed resource validations
istio.pilot.conflict.outbound_listener.http_over_https
(gauge)
Number of conflicting HTTP listeners with well known HTTPS ports
istio.pilot.inbound_updates
(count)
Total number of updates received by pilot
istio.pilot.k8s.cfg_events
(count)
Events from k8s config
Shown as event
istio.pilot.k8s.reg_events
(count)
Events from k8s registry
Shown as event
istio.pilot.proxy_queue_time.count
(count)
Count of observations of the time a proxy spends in the push queue before being dequeued
istio.pilot.proxy_queue_time.sum
(gauge)
Sum of observed values of the time a proxy spends in the push queue before being dequeued
istio.pilot.push.triggers
(count)
Total number of times a push was triggered
Shown as event
istio.pilot.xds.eds_all_locality_endpoints
(gauge)
Network endpoints for each cluster (across all localities) as of last push. Zero endpoints is an error
istio.pilot.xds.push.time.count
(count)
Count of observations of the total time Pilot takes to push
istio.pilot.xds.push.time.sum
(gauge)
Sum of observed values of the total time Pilot takes to push
istio.process.virtual_memory_max_bytes
(gauge)
Maximum amount of virtual memory available in bytes
Shown as byte
istio.sidecar_injection.requests_total
(count)
Total number of sidecar injection requests
Shown as request
istio.sidecar_injection.success_total
(count)
Total number of successful sidecar injection requests
Shown as request
istio.mesh.request.duration.milliseconds.sum
(gauge)
Total sum of observed values for duration of requests in ms
Shown as millisecond
istio.mesh.request.duration.milliseconds.count
(gauge)
Total count of observed values for duration of requests
istio.mesh.tcp.connections_closed.total
(gauge)
Total closed connections
istio.mesh.tcp.connections_opened.total
(gauge)
Total opened connections
istio.mesh.tcp.received_bytes.total
(gauge)
Size of total bytes received during request in case of a TCP connection
Shown as byte
istio.mesh.tcp.send_bytes.total
(gauge)
Size of total bytes sent during response in case of a TCP connection
Shown as byte
istio.mesh.request.count.total
(count)
The number of requests as monotonic count
Shown as request
istio.mesh.request.duration.milliseconds.count.total
(count)
Total count of observed values for duration of requests as monotonic count
istio.mesh.request.duration.milliseconds.sum.total
(count)
Total sum of observed values for duration of requests as monotonic count
istio.mesh.request.size.count.total
(count)
Count of observed request sizes as monotonic count
istio.mesh.request.size.sum.total
(count)
Sum of observed request sizes as monotonic count
istio.mesh.response.size.count.total
(count)
Count of observed response size as monotonic count
istio.mesh.response.size.sum.total
(count)
Sum of observed response size as monotonic count
istio.mesh.tcp.connections_closed.total.total
(count)
Total closed connections as monotonic count
istio.mesh.tcp.connections_opened.total.total
(count)
Total opened connections as monotonic count
istio.mesh.tcp.received_bytes.total.total
(count)
Size of total bytes received during request in case of a TCP connection as monotonic count
Shown as byte
istio.mesh.tcp.send_bytes.total.total
(count)
Size of total bytes sent during response in case of a TCP connection as monotonic count
Shown as byte
istio.mesh.request.duration.count.total
(count)
Count of request durations as monotonic count
Shown as request
istio.mesh.request.duration.sum.total
(count)
Sum of request durations as monotonic count
Shown as millisecond
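Many of the metrics above come in .sum and .count pairs. One common use is to derive an average, for example the mean request duration over an interval from istio.mesh.request.duration.sum and istio.mesh.request.duration.count. The sketch below assumes both values are cumulative, as Prometheus histograms expose them; the function name is hypothetical, and how you obtain the two samples is left out.

```python
def mean_over_interval(sum_prev, count_prev, sum_now, count_now):
    """Average observed value between two samples of a cumulative sum/count pair.

    Returns None when no new observations arrived or when a counter went
    backwards (for example, after the monitored process restarted).
    """
    delta_count = count_now - count_prev
    delta_sum = sum_now - sum_prev
    if delta_count <= 0 or delta_sum < 0:
        return None
    return delta_sum / delta_count


# 600 ms of additional request time spread over 20 additional requests.
print(mean_over_interval(1200.0, 40, 1800.0, 60))
```

Returning None rather than zero keeps "no traffic" distinguishable from "very fast requests" in downstream checks.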

Events

The Istio check does not include any events.

Service Checks

For Istio versions 1.5 or higher:

istio.prometheus.health: Returns CRITICAL if the Agent cannot reach the metrics endpoints, OK otherwise.

For all other versions of Istio:

istio.pilot.prometheus.health: Returns CRITICAL if the Agent cannot reach the metrics endpoints, OK otherwise.

istio.galley.prometheus.health: Returns CRITICAL if the Agent cannot reach the metrics endpoints, OK otherwise.

istio.citadel.prometheus.health: Returns CRITICAL if the Agent cannot reach the metrics endpoints, OK otherwise.
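The behavior of these service checks can be approximated with a simple reachability probe. This is an illustrative sketch only, not the Agent's implementation: the function name and the `opener` parameter are hypothetical (the latter exists so the probe can be exercised without a network), and treating a non-2xx response as CRITICAL is an assumption.

```python
import urllib.error
import urllib.request


def prometheus_health(url, timeout=5.0, opener=urllib.request.urlopen):
    """Return "OK" if the metrics endpoint answers, "CRITICAL" otherwise."""
    try:
        with opener(url, timeout=timeout) as response:
            reachable = 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        reachable = False
    return "OK" if reachable else "CRITICAL"
```

For example, `prometheus_health("http://istiod.istio-system:8080/metrics")` probes the istiod endpoint used in the configuration above.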

Troubleshooting

Need help? Contact Datadog support.

Further Reading

Additional helpful documentation, links, and articles: