Amazon Managed Streaming for Kafka

Overview

Amazon Managed Streaming for Kafka (MSK) is a fully managed service that makes it easy to build and run applications that use Apache Kafka to process streaming data.

Enable this integration to see all your MSK metrics in Datadog.

Setup

Installation

If you haven’t already, set up the Amazon Web Services integration first.

Metric collection

  1. In the AWS integration tile, ensure that MSK is checked under metric collection.

  2. Install the Datadog - Amazon MSK integration.

Log collection

Enable logging

Configure Amazon MSK to send logs either to an S3 bucket or to CloudWatch.

Note: If you log to an S3 bucket, make sure that amazon_msk is set as the Target prefix.
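
As an illustration, the logging destinations described above could be expressed as a broker-log configuration dictionary. This is a minimal sketch, not an official procedure: the helper name `build_msk_logging_info` and the bucket name are hypothetical, and the dictionary shape is assumed to follow the `LoggingInfo` structure of the Amazon MSK API (for example, as accepted by the boto3 `kafka` client). Verify the field names against the AWS documentation before using it.

```python
def build_msk_logging_info(s3_bucket=None, log_group=None, s3_prefix="amazon_msk"):
    """Build a LoggingInfo-style dict for MSK broker logs (hypothetical helper).

    Assumption: the structure mirrors the MSK API's LoggingInfo/BrokerLogs shape.
    """
    if not s3_bucket and not log_group:
        raise ValueError("Provide an S3 bucket, a CloudWatch log group, or both.")

    broker_logs = {}
    if s3_bucket:
        # The prefix defaults to amazon_msk, as required by the note above.
        broker_logs["S3"] = {"Enabled": True, "Bucket": s3_bucket, "Prefix": s3_prefix}
    if log_group:
        broker_logs["CloudWatchLogs"] = {"Enabled": True, "LogGroup": log_group}
    return {"BrokerLogs": broker_logs}


# "my-msk-logs-bucket" is a placeholder bucket name.
logging_info = build_msk_logging_info(s3_bucket="my-msk-logs-bucket")
print(logging_info)
```

The resulting dictionary is only data; applying it to a cluster (for instance through an MSK monitoring update call) is left out here because it needs the cluster ARN and current cluster version.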

Send logs to Datadog

  1. If you haven’t already, set up the Datadog log collection AWS Lambda function.
  2. Once the Lambda function is installed, manually add a trigger on the S3 bucket or CloudWatch log group that contains your Amazon MSK logs in the AWS console.
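
The trigger is added manually in the AWS console, but the same settings could also be scripted. The sketch below is an alternative to the console steps, not the documented procedure: it only builds the request arguments, and the helper names, the filter name, and the ARN are hypothetical. The shapes are assumed to match the arguments of boto3's `put_bucket_notification_configuration` (S3) and `put_subscription_filter` (CloudWatch Logs).

```python
def s3_trigger_config(forwarder_arn, prefix="amazon_msk"):
    """Notification configuration that invokes the forwarder for new MSK log objects."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": forwarder_arn,
                "Events": ["s3:ObjectCreated:*"],
                # Only objects under the amazon_msk prefix trigger the function.
                "Filter": {"Key": {"FilterRules": [{"Name": "prefix", "Value": prefix}]}},
            }
        ]
    }


def cloudwatch_trigger_args(forwarder_arn, log_group, filter_name="datadog-msk-logs"):
    """Keyword arguments for a CloudWatch Logs subscription filter (hypothetical name)."""
    return {
        "logGroupName": log_group,
        "filterName": filter_name,
        "filterPattern": "",  # an empty pattern matches every log event
        "destinationArn": forwarder_arn,
    }


# Placeholder ARN for the Datadog log collection Lambda function.
arn = "arn:aws:lambda:us-east-1:123456789012:function:datadog-forwarder"
print(s3_trigger_config(arn))
print(cloudwatch_trigger_args(arn, "/aws/msk/broker-logs"))
```

Two caveats apply if these arguments are sent to AWS: setting a bucket notification configuration replaces the bucket's existing configuration, and the Lambda function must separately grant S3 or CloudWatch Logs permission to invoke it.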

Data Collected

Metrics

aws.kafka.zookeeper_request_latency_ms_mean
(gauge)
Mean latency in milliseconds for ZooKeeper requests from broker.
aws.kafka.active_controller_count
(gauge)
Only one controller per cluster should be active at any given time.
aws.kafka.global_partition_count
(gauge)
Total number of partitions across all brokers in the cluster.
aws.kafka.global_topic_count
(gauge)
Total number of topics across all brokers in the cluster.
aws.kafka.offline_partitions_count
(gauge)
Total number of partitions that are offline in the cluster.
aws.kafka.swap_used
(gauge)
The size in bytes of swap memory that is in use for the broker.
Shown as byte
aws.kafka.swap_free
(gauge)
The size in bytes of swap memory that is available for the broker.
Shown as byte
aws.kafka.memory_used
(gauge)
The size in bytes of memory that is in use for the broker.
Shown as byte
aws.kafka.memory_buffered
(gauge)
The size in bytes of buffered memory for the broker.
Shown as byte
aws.kafka.memory_free
(gauge)
The size in bytes of memory that is free and available for the broker.
Shown as byte
aws.kafka.memory_cached
(gauge)
The size in bytes of cached memory for the broker.
Shown as byte
aws.kafka.cpu_user
(gauge)
The percentage of CPU in user space.
Shown as percent
aws.kafka.cpu_system
(gauge)
The percentage of CPU in kernel space.
Shown as percent
aws.kafka.cpu_idle
(gauge)
The percentage of CPU idle time.
Shown as percent
aws.kafka.root_disk_used
(gauge)
The percentage of the root disk used by the broker.
Shown as percent
aws.kafka.kafka_app_logs_disk_used
(gauge)
The percentage of disk space used for application logs.
Shown as percent
aws.kafka.kafka_data_logs_disk_used
(gauge)
The percentage of disk space used for data logs.
Shown as percent
aws.kafka.network_rx_errors
(count)
The number of network receive errors for the broker.
aws.kafka.network_tx_errors
(count)
The number of network transmit errors for the broker.
aws.kafka.network_rx_dropped
(count)
The number of dropped receive packets.
aws.kafka.network_tx_dropped
(count)
The number of dropped transmit packets.
aws.kafka.network_rx_packets
(count)
The number of packets received by the broker.
aws.kafka.network_tx_packets
(count)
The number of packets transmitted by the broker.
aws.kafka.messages_in_per_sec
(gauge)
The number of incoming messages per second for the broker.
aws.kafka.network_processor_avg_idle_percent
(gauge)
The average percentage of the time the network processors are idle.
aws.kafka.request_handler_avg_idle_percent
(gauge)
The average percentage of the time the request handler threads are idle.
aws.kafka.leader_count
(gauge)
The number of leader replicas.
aws.kafka.partition_count
(gauge)
The number of partitions for the broker.
aws.kafka.produce_local_time_ms_mean
(gauge)
The mean time in milliseconds that the produce request is processed at the leader.
Shown as millisecond
aws.kafka.produce_message_conversions_time_ms_mean
(gauge)
The mean time in milliseconds spent on message format conversions.
Shown as millisecond
aws.kafka.produce_request_queue_time_ms_mean
(gauge)
The mean time in milliseconds that request messages spend in the queue.
Shown as millisecond
aws.kafka.produce_response_queue_time_ms_mean
(gauge)
The mean time in milliseconds that response messages spend in the queue.
Shown as millisecond
aws.kafka.produce_response_send_time_ms_mean
(gauge)
The mean time in milliseconds spent on sending response messages.
Shown as millisecond
aws.kafka.produce_total_time_ms_mean
(gauge)
The mean produce time in milliseconds.
Shown as millisecond
aws.kafka.request_bytes_mean
(gauge)
The mean number of request bytes for the broker.
aws.kafka.under_minlsr_partition_count
(gauge)
The number of partitions below the minimum in-sync replica count (under minIsr) for the broker.
aws.kafka.under_replicated_partitions
(gauge)
The number of under-replicated partitions for the broker.
aws.kafka.bytes_in_per_sec
(gauge)
The number of bytes per second received from clients.
Shown as byte
aws.kafka.bytes_out_per_sec
(gauge)
The number of bytes per second sent to clients.
Shown as byte
aws.kafka.messages_in_per_sec
(gauge)
The number of messages received from clients per second.
aws.kafka.fetch_message_conversions_per_sec
(gauge)
The number of fetch message conversions per second for the broker.
aws.kafka.produce_message_conversions_per_sec
(gauge)
The number of produce message conversions per second for the broker.
aws.kafka.fetch_consumer_total_time_ms_mean
(gauge)
The mean total time in milliseconds that consumers spend on fetching data from the broker.
Shown as millisecond
aws.kafka.fetch_follower_total_time_ms_mean
(gauge)
The mean total time in milliseconds that followers spend on fetching data from the broker.
Shown as millisecond
aws.kafka.fetch_consumer_request_queue_time_ms_mean
(gauge)
The mean time in milliseconds that the consumer request waits in the request queue.
Shown as millisecond
aws.kafka.fetch_follower_request_queue_time_ms_mean
(gauge)
The mean time in milliseconds that the follower request waits in the request queue.
Shown as millisecond
aws.kafka.fetch_consumer_local_time_ms_mean
(gauge)
The mean time in milliseconds that the consumer request is processed at the leader.
Shown as millisecond
aws.kafka.fetch_follower_local_time_ms_mean
(gauge)
The mean time in milliseconds that the follower request is processed at the leader.
Shown as millisecond
aws.kafka.fetch_consumer_response_queue_time_ms_mean
(gauge)
The mean time in milliseconds that the consumer request waits in the response queue.
Shown as millisecond
aws.kafka.fetch_follower_response_queue_time_ms_mean
(gauge)
The mean time in milliseconds that the follower request waits in the response queue.
Shown as millisecond
aws.kafka.consumer_response_send_time_ms_mean
(gauge)
The mean time in milliseconds for the consumer to send a response.
Shown as millisecond
aws.kafka.fetch_follower_response_send_time_ms_mean
(gauge)
The mean time in milliseconds for the follower to send a response.
Shown as millisecond
aws.kafka.produce_throttle_time
(gauge)
The average produce throttle time in milliseconds.
Shown as millisecond
aws.kafka.produce_throttle_byte_rate
(gauge)
The number of throttled bytes per second.
aws.kafka.produce_throttle_queue_size
(gauge)
The number of messages in the throttle queue.
aws.kafka.fetch_throttle_time
(gauge)
The average fetch throttle time in milliseconds.
Shown as millisecond
aws.kafka.fetch_throttle_byte_rate
(gauge)
The number of throttled bytes per second.
aws.kafka.fetch_throttle_queue_size
(gauge)
The number of messages in the throttle queue.
aws.kafka.request_throttle_time
(gauge)
The average request throttle time in milliseconds.
Shown as millisecond
aws.kafka.request_time
(gauge)
The average time spent in broker network and I/O threads to process requests.
aws.kafka.request_throttle_queue_size
(gauge)
The number of messages in the throttle queue.
aws.kafka.request_exempt_from_throttle_time
(gauge)
The average time spent in broker network and I/O threads to process requests that are exempt from throttling.
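
Some of the metrics above carry an implied healthy value, such as exactly one active controller per cluster. The sketch below shows one way such expectations could be checked against the latest metric values. It is illustrative only: the function name and input format are hypothetical, the values are made up, and treating any offline or under-replicated partition as a problem is an assumption rather than something this integration defines.

```python
def msk_health_issues(metrics):
    """Return a list of human-readable issues for a dict of latest metric values."""
    issues = []

    controllers = metrics.get("aws.kafka.active_controller_count")
    # Only one controller per cluster should be active at any given time.
    if controllers is not None and controllers != 1:
        issues.append(f"active_controller_count is {controllers}, expected 1")

    # Assumption: any non-zero value for these metrics is worth flagging.
    for name in (
        "aws.kafka.offline_partitions_count",
        "aws.kafka.under_replicated_partitions",
    ):
        value = metrics.get(name)
        if value is not None and value > 0:
            issues.append(f"{name} is {value}, expected 0")

    return issues


# Made-up sample values for illustration.
sample = {
    "aws.kafka.active_controller_count": 1,
    "aws.kafka.offline_partitions_count": 0,
    "aws.kafka.under_replicated_partitions": 3,
}
for issue in msk_health_issues(sample):
    print(issue)
```

In practice, conditions like these are more naturally expressed as monitors on the corresponding metrics; the function only makes the expected values explicit.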

Events

The Amazon MSK integration does not include any events.

Service Checks

The Amazon MSK integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.