Istio

Agent Check

Supported OS: Linux, Mac OS, Windows

Overview

Use the Datadog Agent to monitor how well Istio is performing.

  • Collect metrics on which applications are making which kinds of requests
  • See how applications are using bandwidth
  • Understand Istio’s resource consumption

Setup

Installation

The Istio check is included in the Datadog Agent. Install the Agent on your Istio servers or in your cluster, then point it at Istio.

Configuration

Connect the Agent

Edit the istio.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to connect it to Istio. See the sample istio.d/conf.yaml for all available configuration options:

init_config:

instances:
  - istio_mesh_endpoint: http://istio-telemetry.istio-system:42422/metrics
    mixer_endpoint: http://istio-telemetry.istio-system:15014/metrics
    galley_endpoint: http://istio-galley.istio-system:15014/metrics
    pilot_endpoint: http://istio-pilot.istio-system:15014/metrics
    send_histograms_buckets: true

The first two endpoints, istio_mesh_endpoint and mixer_endpoint, are required for the check to work. See the Istio documentation to learn more about the Prometheus adapter.
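The requirement that the first two endpoints in the sample (istio_mesh_endpoint and mixer_endpoint) be present can be checked before the Agent is restarted. The following is a minimal sketch, not the Agent's own validation logic: the helper name missing_or_invalid_endpoints and the plain dictionary standing in for a parsed instance are assumptions made for illustration.

```python
from urllib.parse import urlparse

# Keys the configuration notes describe as required (the first two endpoints).
REQUIRED_ENDPOINTS = ("istio_mesh_endpoint", "mixer_endpoint")


def missing_or_invalid_endpoints(instance):
    """Return the required endpoint keys that are absent or not http(s) URLs.

    Hypothetical helper for illustration; the Agent does not expose it.
    """
    problems = []
    for key in REQUIRED_ENDPOINTS:
        url = instance.get(key)
        parsed = urlparse(url) if isinstance(url, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(key)
    return problems


# A dictionary mirroring the sample instance from istio.d/conf.yaml.
instance = {
    "istio_mesh_endpoint": "http://istio-telemetry.istio-system:42422/metrics",
    "mixer_endpoint": "http://istio-telemetry.istio-system:15014/metrics",
}
print(missing_or_invalid_endpoints(instance))  # prints []
```

An empty list means both required keys are present and look like URLs. It does not confirm that the endpoints are reachable from the Agent.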

Validation

Run the Agent’s status subcommand (info on Agent v5) and look for istio under the Checks section.

Data Collected

Metrics

istio.mesh.request.count
(gauge)
The number of requests
shown as request
istio.mesh.request.duration.count
(gauge)
count of request durations
shown as request
istio.mesh.request.duration.sum
(gauge)
sum of request durations
shown as millisecond
istio.mesh.request.size.count
(gauge)
count of request sizes
shown as request
istio.mesh.request.size.sum
(gauge)
sum of request sizes
shown as request
istio.mesh.response.size.count
(gauge)
count of response sizes
shown as response
istio.mesh.response.size.sum
(gauge)
sum of response sizes
shown as response
istio.mixer.adapter.dispatch_count
(gauge)
Total number of adapter dispatches handled by Mixer
shown as operation
istio.mixer.adapter.dispatch_duration.count
(gauge)
Count of durations for adapter dispatches handled by Mixer
shown as operation
istio.mixer.adapter.dispatch_duration.sum
(gauge)
Sum of durations for adapter dispatches handled by Mixer
shown as operation
istio.mixer.adapter.old_dispatch_count
(gauge)
Total number of adapter dispatches handled by Mixer.
shown as operation
istio.mixer.adapter.old_dispatch_duration.count
(gauge)
Count of times for adapter dispatches handled by Mixer.
shown as operation
istio.mixer.adapter.old_dispatch_duration.sum
(gauge)
Sum of times for adapter dispatches handled by Mixer.
shown as operation
istio.mixer.config.resolve_actions.count
(gauge)
Count of actions resolved by Mixer.
shown as operation
istio.mixer.config.resolve_actions.sum
(gauge)
Sum of actions resolved by Mixer.
shown as operation
istio.mixer.config.resolve_count
(gauge)
Number of config resolves handled by Mixer
shown as operation
istio.mixer.config.resolve_duration.count
(gauge)
Seconds per config resolve
shown as second
istio.mixer.config.resolve_duration.sum
(gauge)
Sum of times for config resolves handled by Mixer.
shown as second
istio.mixer.config.resolve_rules.count
(gauge)
Number of rules resolved by Mixer
shown as item
istio.mixer.config.resolve_rules.sum
(gauge)
Sum of rules resolved by Mixer.
shown as item
istio.mixer.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
shown as second
istio.mixer.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
shown as second
istio.mixer.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
shown as second
istio.mixer.go.goroutines
(gauge)
Number of goroutines that currently exist.
shown as thread
istio.mixer.go.info
(gauge)
Information about the Go environment.
istio.mixer.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
shown as byte
istio.mixer.go.memstats.alloc_bytes_total
(gauge)
Total number of bytes allocated even if freed.
shown as byte
istio.mixer.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
shown as byte
istio.mixer.go.memstats.frees_total
(gauge)
Total number of frees.
istio.mixer.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
shown as percent
istio.mixer.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
shown as byte
istio.mixer.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
shown as byte
istio.mixer.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
shown as byte
istio.mixer.go.memstats.heap_inuse_bytes
(gauge)
Number of bytes in the heap
shown as byte
istio.mixer.go.memstats.heap_objects
(gauge)
Number of objects in the heap
shown as object
istio.mixer.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last GC
shown as byte
istio.mixer.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
shown as byte
istio.mixer.go.memstats.last_gc_time_seconds
(gauge)
Length of last GC
shown as second
istio.mixer.go.memstats.lookups_total
(gauge)
Number of lookups
shown as operation
istio.mixer.go.memstats.mallocs_total
(gauge)
Number of mallocs
shown as operation
istio.mixer.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
shown as byte
istio.mixer.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
shown as byte
istio.mixer.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
shown as byte
istio.mixer.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
shown as byte
istio.mixer.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
shown as byte
istio.mixer.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
shown as byte
istio.mixer.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
shown as byte
istio.mixer.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
shown as byte
istio.mixer.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
shown as byte
istio.mixer.go.threads
(gauge)
Number of OS threads created.
shown as thread
istio.mixer.grpc.server.handled_total
(gauge)
Total number of fully handled requests, with responses
shown as request
istio.mixer.grpc.server.handling_seconds.count
(gauge)
Count of response latency (seconds) of gRPC that had been application-level handled by the server.
shown as second
istio.mixer.grpc.server.handling_seconds.sum
(gauge)
Sum of response latency (seconds) of gRPC that had been application-level handled by the server.
shown as second
istio.mixer.grpc.server.msg_received_total
(gauge)
Total number of RPC stream messages received on the server.
shown as message
istio.mixer.grpc.server.msg_sent_total
(gauge)
Total number of messages sent
shown as message
istio.mixer.grpc.server.started_total
(gauge)
Total number of RPCs started on the server.
istio.mixer.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
shown as second
istio.mixer.process.max_fds
(gauge)
Maximum number of open file descriptors.
shown as file
istio.mixer.process.open_fds
(gauge)
Number of open file descriptors.
shown as file
istio.mixer.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
shown as byte
istio.mixer.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
shown as second
istio.mixer.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
shown as byte
istio.mixer.grpc_io_server.completed_rpcs
(gauge)
Count of RPCs by method and status.
istio.mixer.grpc_io_server.received_bytes_per_rpc
(gauge)
Distribution of received bytes per RPC, by method.
shown as byte
istio.mixer.grpc_io_server.sent_bytes_per_rpc
(gauge)
Distribution of total sent bytes per RPC, by method.
shown as byte
istio.mixer.grpc_io_server.server_latency
(gauge)
Distribution of server latency in milliseconds, by method.
istio.mixer.config.attributes_total
(gauge)
The number of known attributes in the current config.
istio.mixer.config.handler_configs_total
(gauge)
The number of known handlers in the current config.
istio.mixer.config.instance_configs_total
(gauge)
The number of known instances in the current config.
istio.mixer.config.rule_configs_total
(gauge)
The number of known rules in the current config.
istio.mixer.dispatcher.destinations_per_request
(gauge)
Number of handlers dispatched per request by Mixer.
istio.mixer.dispatcher.instances_per_request
(gauge)
Number of instances created per request by Mixer.
istio.mixer.handler.daemons_total
(gauge)
The current number of active daemon routines in a given adapter environment.
istio.mixer.handler.new_handlers_total
(gauge)
The number of handlers that were newly created during config transition.
istio.mixer.mcp_sink.reconnections
(gauge)
The number of times the sink has reconnected.
istio.mixer.mcp_sink.request_acks_total
(gauge)
The number of request acks received by the source.
istio.mixer.runtime.dispatches_total
(gauge)
Total number of adapter dispatches handled by Mixer.
shown as operation
istio.mixer.runtime.dispatch_duration_seconds
(gauge)
Duration in seconds for adapter dispatches handled by Mixer.
shown as second
istio.pilot.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
shown as second
istio.pilot.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
shown as second
istio.pilot.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
shown as second
istio.pilot.go.goroutines
(gauge)
Number of goroutines that currently exist.
shown as thread
istio.pilot.go.info
(gauge)
Information about the Go environment.
istio.pilot.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
shown as byte
istio.pilot.go.memstats.alloc_bytes_total
(gauge)
Total number of bytes allocated even if freed.
shown as byte
istio.pilot.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
shown as byte
istio.pilot.go.memstats.frees_total
(gauge)
Total number of frees.
istio.pilot.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
shown as percent
istio.pilot.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
shown as byte
istio.pilot.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
shown as byte
istio.pilot.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
shown as byte
istio.pilot.go.memstats.heap_inuse_bytes
(gauge)
Number of bytes in the heap
shown as byte
istio.pilot.go.memstats.heap_objects
(gauge)
Number of objects in the heap
shown as object
istio.pilot.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last GC
shown as byte
istio.pilot.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
shown as byte
istio.pilot.go.memstats.last_gc_time_seconds
(gauge)
Length of last GC
shown as second
istio.pilot.go.memstats.lookups_total
(gauge)
Number of lookups
shown as operation
istio.pilot.go.memstats.mallocs_total
(gauge)
Number of mallocs
shown as operation
istio.pilot.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
shown as byte
istio.pilot.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
shown as byte
istio.pilot.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
shown as byte
istio.pilot.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
shown as byte
istio.pilot.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
shown as byte
istio.pilot.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
shown as byte
istio.pilot.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
shown as byte
istio.pilot.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
shown as byte
istio.pilot.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
shown as byte
istio.pilot.go.threads
(gauge)
Number of OS threads created.
shown as thread
istio.pilot.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
shown as second
istio.pilot.process.max_fds
(gauge)
Maximum number of open file descriptors.
shown as file
istio.pilot.process.open_fds
(gauge)
Number of open file descriptors.
shown as file
istio.pilot.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
shown as byte
istio.pilot.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
shown as second
istio.pilot.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
shown as byte
istio.pilot.conflict.inbound_listener
(gauge)
Number of conflicting inbound listeners.
istio.pilot.conflict.outbound_listener.http_over_current_tcp
(gauge)
Number of conflicting wildcard http listeners with current wildcard tcp listener.
istio.pilot.conflict.outbound_listener.tcp_over_current_http
(gauge)
Number of conflicting wildcard tcp listeners with current wildcard http listener.
istio.pilot.conflict.outbound_listener.tcp_over_current_tcp
(gauge)
Number of conflicting tcp listeners with current tcp listener.
istio.pilot.destrule_subsets
(gauge)
Duplicate subsets across destination rules for same host.
istio.pilot.duplicate_envoy_clusters
(gauge)
Duplicate envoy clusters caused by service entries with same hostname.
istio.pilot.eds_no_instances
(gauge)
Number of clusters without instances.
istio.pilot.endpoint_not_ready
(gauge)
Endpoint found in unready state.
istio.pilot.invalid_out_listeners
(gauge)
Number of invalid outbound listeners.
istio.pilot.mcp_sink.reconnections
(gauge)
The number of times the sink has reconnected.
istio.pilot.mcp_sink.recv_failures_total
(gauge)
The number of recv failures in the source.
istio.pilot.mcp_sink.request_acks_total
(gauge)
The number of request acks received by the source.
istio.pilot.no_ip
(gauge)
Pods not found in the endpoint table, possibly invalid.
istio.pilot.proxy_convergence_time
(gauge)
Delay between config change and all proxies converging.
shown as second
istio.pilot.rds_expired_nonce
(gauge)
Total number of RDS messages with an expired nonce.
istio.pilot.services
(gauge)
Total services known to pilot.
istio.pilot.total_xds_internal_errors
(gauge)
Total number of internal XDS errors in pilot.
istio.pilot.total_xds_rejects
(gauge)
Total number of XDS responses from pilot rejected by proxy.
istio.pilot.virt_services
(gauge)
Total virtual services known to pilot.
istio.pilot.vservice_dup_domain
(gauge)
Virtual services with duplicate domains.
istio.pilot.xds
(gauge)
Number of endpoints connected to this pilot using XDS.
istio.pilot.xds.eds_instances
(gauge)
Instances for each cluster, as of last push.
istio.pilot.xds.push.context_errors
(gauge)
Number of errors (timeouts) initiating push context.
istio.pilot.xds.push.timeout
(gauge)
Pilot push timeout, will retry.
istio.pilot.xds.push.timeout_failures
(gauge)
Pilot push timeout failures after repeated attempts.
istio.pilot.xds.pushes
(gauge)
Pilot build and send errors for lds, rds, cds and eds.
istio.pilot.xds.write_timeout
(gauge)
Pilot XDS response write timeouts.
istio.galley.go.gc_duration_seconds.count
(gauge)
Count of the GC invocation durations.
shown as second
istio.galley.go.gc_duration_seconds.quantile
(gauge)
Quantile of the GC invocation durations.
shown as second
istio.galley.go.gc_duration_seconds.sum
(gauge)
Sum of the GC invocation durations.
shown as second
istio.galley.go.goroutines
(gauge)
Number of goroutines that currently exist.
shown as thread
istio.galley.go.info
(gauge)
Information about the Go environment.
istio.galley.go.memstats.alloc_bytes
(gauge)
Number of bytes allocated and still in use.
shown as byte
istio.galley.go.memstats.alloc_bytes_total
(gauge)
Total number of bytes allocated even if freed.
shown as byte
istio.galley.go.memstats.buck_hash_sys_bytes
(gauge)
Number of bytes used by the profiling bucket hash table.
shown as byte
istio.galley.go.memstats.frees_total
(gauge)
Total number of frees.
istio.galley.go.memstats.gc_cpu_fraction
(gauge)
CPU taken up by GC
shown as percent
istio.galley.go.memstats.gc_sys_bytes
(gauge)
Number of bytes used for garbage collection system metadata.
shown as byte
istio.galley.go.memstats.heap_alloc_bytes
(gauge)
Bytes allocated to the heap
shown as byte
istio.galley.go.memstats.heap_idle_bytes
(gauge)
Number of idle bytes in the heap
shown as byte
istio.galley.go.memstats.heap_inuse_bytes
(gauge)
Number of bytes in the heap
shown as byte
istio.galley.go.memstats.heap_objects
(gauge)
Number of objects in the heap
shown as object
istio.galley.go.memstats.heap_released_bytes
(gauge)
Number of bytes released to the system in the last GC
shown as byte
istio.galley.go.memstats.heap_sys_bytes
(gauge)
Number of bytes used by the heap
shown as byte
istio.galley.go.memstats.last_gc_time_seconds
(gauge)
Length of last GC
shown as second
istio.galley.go.memstats.lookups_total
(gauge)
Number of lookups
shown as operation
istio.galley.go.memstats.mallocs_total
(gauge)
Number of mallocs
shown as operation
istio.galley.go.memstats.mcache_inuse_bytes
(gauge)
Number of bytes in use by mcache structures.
shown as byte
istio.galley.go.memstats.mcache_sys_bytes
(gauge)
Number of bytes used for mcache structures obtained from system.
shown as byte
istio.galley.go.memstats.mspan_inuse_bytes
(gauge)
Number of bytes in use by mspan structures.
shown as byte
istio.galley.go.memstats.mspan_sys_bytes
(gauge)
Number of bytes used for mspan structures obtained from system.
shown as byte
istio.galley.go.memstats.next_gc_bytes
(gauge)
Number of heap bytes when next garbage collection will take place
shown as byte
istio.galley.go.memstats.other_sys_bytes
(gauge)
Number of bytes used for other system allocations
shown as byte
istio.galley.go.memstats.stack_inuse_bytes
(gauge)
Number of bytes in use by the stack allocator
shown as byte
istio.galley.go.memstats.stack_sys_bytes
(gauge)
Number of bytes obtained from system for stack allocator
shown as byte
istio.galley.go.memstats.sys_bytes
(gauge)
Number of bytes obtained from system
shown as byte
istio.galley.go.threads
(gauge)
Number of OS threads created.
shown as thread
istio.galley.process.cpu_seconds_total
(gauge)
Total user and system CPU time spent in seconds.
shown as second
istio.galley.process.max_fds
(gauge)
Maximum number of open file descriptors.
shown as file
istio.galley.process.open_fds
(gauge)
Number of open file descriptors.
shown as file
istio.galley.process.resident_memory_bytes
(gauge)
Resident memory size in bytes.
shown as byte
istio.galley.process.start_time_seconds
(gauge)
Start time of the process since unix epoch in seconds.
shown as second
istio.galley.process.virtual_memory_bytes
(gauge)
Virtual memory size in bytes.
shown as byte
istio.galley.endpoint_no_pod
(gauge)
Endpoints without an associated pod.
istio.galley.mcp_source.clients_total
(gauge)
The number of streams currently connected.
istio.galley.mcp_source.message_size_bytes
(gauge)
Size of messages received from clients.
shown as byte
istio.galley.mcp_source.request_acks_total
(gauge)
The number of request acks received by the source.
istio.galley.runtime_processor.event_span_duration_milliseconds
(gauge)
The duration between each incoming event.
shown as millisecond
istio.galley.runtime_processor.events_processed_total
(gauge)
The number of events that have been processed.
istio.galley.runtime_processor.snapshot_events_total
(gauge)
The number of events per snapshot.
istio.galley.runtime_processor.snapshot_lifetime_duration_milliseconds
(gauge)
The duration of each snapshot.
shown as millisecond
istio.galley.runtime_processor.snapshots_published_total
(gauge)
The number of snapshots that have been published.
istio.galley.runtime_state_type_instances_total
(gauge)
The number of type instances per type URL.
istio.galley.runtime_strategy.on_change_total
(gauge)
The number of times the strategy's onChange has been called.
istio.galley.runtime_strategy.timer_max_time_reached_total
(gauge)
The number of times the max time has been reached.
istio.galley.runtime_strategy.quiesce_reached_total
(gauge)
The number of times a quiesce has been reached.
istio.galley.runtime_strategy.timer_resets_total
(gauge)
The number of times the timer has been reset.
istio.galley.source_kube.dynamic_converter_success_total
(gauge)
The number of times a dynamic kubernetes source successfully converted a resource.
istio.galley.source_kube.event_success_total
(gauge)
The number of times a kubernetes source successfully handled an event.
istio.galley.validation.cert_key_updates
(gauge)
Galley validation webhook certificate updates.
istio.galley.validation.config_load
(gauge)
k8s webhook configuration (re)loads.
istio.galley.validation.config_update
(gauge)
k8s webhook configuration updates.
istio.galley.validation.passed
(gauge)
Resource is valid.
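Several metrics above come in .count and .sum pairs, for example istio.mesh.request.duration.count and istio.mesh.request.duration.sum. A common way to read such a pair is to divide the change in the sum by the change in the count over an interval, which gives the average value per observation in that interval. The sketch below assumes both values are cumulative totals, as is conventional for Prometheus histogram and summary series. The function name mean_between and the sample numbers are invented for illustration.

```python
def mean_between(prev, curr):
    """Average value per observation between two (sum, count) samples.

    Returns None when no new observations arrived or when either value
    decreased, which would suggest the underlying counters were reset.
    """
    prev_sum, prev_count = prev
    curr_sum, curr_count = curr
    delta_count = curr_count - prev_count
    delta_sum = curr_sum - prev_sum
    if delta_count <= 0 or delta_sum < 0:
        return None
    return delta_sum / delta_count


# Made-up samples of (duration.sum in milliseconds, duration.count).
earlier = (1200.0, 40)
later = (1800.0, 60)
print(mean_between(earlier, later))  # prints 30.0
```

With these numbers, 20 new requests added 600 ms of total duration, so the average over the interval is 30 ms per request.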

Events

The Istio check does not include any events.

Service Checks

The Istio check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.
