Confluent Cloud

Overview

The Confluent Cloud integration is not supported for the Datadog site.

Connect Datadog with Confluent Cloud to view Kafka cluster metrics by topic and Kafka connector metrics. You can create monitors and dashboards with these metrics.

Setup

Installation

Install the integration with the Datadog Confluent Cloud integration tile.

Configuration

  1. In the integration tile, navigate to the Configuration tab.
  2. Click + Add API Key to enter your Confluent Cloud API Key and API Secret.
  3. Click Save. Datadog searches for accounts associated with those credentials.
  4. Add your Confluent Cloud Cluster ID or Connector ID. Datadog crawls the Confluent Cloud metrics and loads them within minutes.

API Key and Secret

To create your Confluent Cloud API Key and Secret, see Add the MetricsViewer role to a new service account in the UI.

Cluster ID

To find your Confluent Cloud Cluster ID:

  1. In Confluent Cloud, navigate to Environment Overview and select the cluster you want to monitor.
  2. In the left-hand navigation, click Cluster overview > Cluster settings.
  3. Under Identification, copy the Cluster ID beginning with lkc.

Connector ID

To find your Confluent Cloud Connector ID:

  1. In Confluent Cloud, navigate to Environment Overview and select the cluster you want to monitor.
  2. In the left-hand navigation, click Data integration > Connectors.
  3. Under Connectors, copy the Connector ID beginning with lcc.
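
Before pasting an ID into the integration tile, it can help to confirm which kind of ID you copied. The following is a minimal sketch that relies only on the prefixes described above (lkc for clusters, lcc for connectors). The function name classify_confluent_id and the sample IDs are hypothetical, and the check does not verify that the ID exists in your Confluent Cloud account.

```python
def classify_confluent_id(resource_id: str) -> str:
    """Classify a Confluent Cloud ID by its prefix.

    Returns "cluster" for IDs beginning with lkc and "connector" for IDs
    beginning with lcc. Only the prefix is checked; the rest of the ID
    format is not validated (hypothetical helper, not part of any SDK).
    """
    value = resource_id.strip()  # tolerate whitespace picked up when copying
    if value.startswith("lkc"):
        return "cluster"
    if value.startswith("lcc"):
        return "connector"
    raise ValueError(f"Unrecognized Confluent Cloud ID: {resource_id!r}")


if __name__ == "__main__":
    # Sample IDs are made up for illustration.
    for sample in ("lkc-abc123", "lcc-xyz789"):
        print(sample, "->", classify_confluent_id(sample))
```

An ID that starts with neither prefix raises a ValueError, which is a cue to recheck the Cluster settings or Connectors page.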

Dashboards

After configuring the integration, see the out-of-the-box Confluent Cloud dashboard for an overview of Kafka cluster and connector metrics.

By default, all metrics collected across Confluent Cloud are displayed.

Data Collected

Metrics

confluent_cloud.kafka.received_bytes
(count)
The delta count of bytes received from the network. Each sample is the number of bytes received since the previous data point. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.kafka.sent_bytes
(count)
The delta count of bytes sent over the network. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.kafka.received_records
(count)
The delta count of records received. Each sample is the number of records received since the previous data point. The count is sampled every 60 seconds.
Shown as record
confluent_cloud.kafka.sent_records
(count)
The delta count of records sent. Each sample is the number of records sent since the previous data point. The count is sampled every 60 seconds.
Shown as record
confluent_cloud.kafka.retained_bytes
(gauge)
The current count of bytes retained by the cluster. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.kafka.active_connection_count
(gauge)
The count of active authenticated connections.
Shown as connection
confluent_cloud.kafka.consumer_lag_offsets
(gauge)
The lag between a group member's committed offset and the partition's high watermark. This metric is tagged with the Kafka ID, consumer group ID, and topic ID.
confluent_cloud.custom.kafka.consumer_lag_offsets
(gauge)
The lag between a group member's committed offset and the partition's high watermark. This metric is tagged with the Kafka ID, consumer group ID, topic, consumer group member ID, client ID, and partition. Enabling this metric generates custom metrics, which are billable. Each unique combination of tags counts as a single custom metric.
confluent_cloud.kafka.request_count
(count)
The delta count of requests received over the network. Each sample is the number of requests received since the previous data point. The count is sampled every 60 seconds.
Shown as request
confluent_cloud.kafka.partition_count
(gauge)
The number of partitions.
confluent_cloud.kafka.successful_authentication_count
(count)
The delta count of successful authentications. Each sample is the number of successful authentications since the previous data point. The count is sampled every 60 seconds.
Shown as attempt
confluent_cloud.kafka.cluster_link_destination_response_bytes
(count)
The delta count of cluster linking response bytes from all request types. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.kafka.cluster_link_source_response_bytes
(count)
The delta count of cluster linking source response bytes from all request types. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.kafka.cluster_active_link_count
(gauge)
The current count of active cluster links. The count is sampled every 60 seconds. The implicit time aggregation for this metric is MAX.
confluent_cloud.kafka.cluster_load_percent
(gauge)
A measure of the utilization of the cluster. The value is between 0.0 and 1.0.
Shown as percent
confluent_cloud.connect.sent_records
(count)
The delta count of the total number of records sent from the transformations and written to Kafka for the source connector. Each sample is the number of records sent since the previous data point. The count is sampled every 60 seconds.
Shown as record
confluent_cloud.connect.received_records
(count)
The delta count of the total number of records received by the sink connector. Each sample is the number of records received since the previous data point. The count is sampled every 60 seconds.
Shown as record
confluent_cloud.connect.sent_bytes
(count)
The delta count of total bytes sent from the transformations and written to Kafka for the source connector. Each sample is the number of bytes sent since the previous data point. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.connect.received_bytes
(count)
The delta count of total bytes received by the sink connector. Each sample is the number of bytes received since the previous data point. The count is sampled every 60 seconds.
Shown as byte
confluent_cloud.connect.dead_letter_queue_records
(count)
The delta count of dead letter queue records written to Kafka for the sink connector. The count is sampled every 60 seconds.
Shown as record
confluent_cloud.ksql.streaming_unit_count
(gauge)
The count of Confluent Streaming Units (CSUs) for this KSQL instance. The count is sampled every 60 seconds. The implicit time aggregation for this metric is MAX.
Shown as unit
confluent_cloud.ksql.query_saturation
(gauge)
The maximum saturation for a given ksqlDB query across all nodes. Returns a value between 0 and 1. A value close to 1 indicates that ksqlDB query processing is bottlenecked on available resources.
Shown as unit
confluent_cloud.ksql.task_stored_bytes
(gauge)
The size of a given task's state stores in bytes.
Shown as byte
confluent_cloud.ksql.storage_utilization
(gauge)
The total storage utilization for a given ksqlDB application.
Shown as unit
confluent_cloud.schema_registry.schema_count
(gauge)
The number of registered schemas.
confluent_cloud.schema_registry.request_count
(count)
The delta count of requests received by the schema registry server. Each sample is the number of requests received since the previous data point. The count is sampled every 60 seconds.
confluent_cloud.schema_registry.schema_operations_count
(count)
The delta count of schema-related operations. Each sample is the number of requests received since the previous data point. The count is sampled every 60 seconds.
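
To make the metric descriptions above concrete, the following sketch shows three small calculations: converting a delta count sampled every 60 seconds into a per-second rate, computing consumer lag from a high watermark and a committed offset, and counting unique tag combinations to estimate how many custom metrics confluent_cloud.custom.kafka.consumer_lag_offsets would generate. All function names, tag keys, and sample values are hypothetical; this is an illustration of the arithmetic, not Datadog's or Confluent's implementation.

```python
SAMPLE_INTERVAL_SECONDS = 60  # delta counts are sampled every 60 seconds


def per_second_rate(delta_count, interval_seconds=SAMPLE_INTERVAL_SECONDS):
    """Convert one delta-count sample into an average per-second rate."""
    return delta_count / interval_seconds


def consumer_lag(high_watermark, committed_offset):
    """Lag between a partition's high watermark and a committed offset."""
    return high_watermark - committed_offset


def count_custom_metrics(tag_sets):
    """Count unique tag combinations; each one is a single custom metric."""
    return len({frozenset(tags.items()) for tags in tag_sets})


if __name__ == "__main__":
    # 6000 bytes received in one 60-second sample.
    print(per_second_rate(6000))
    # High watermark at offset 1500, group member committed offset 1200.
    print(consumer_lag(1500, 1200))
    # Made-up tag sets; the first two are identical and count once.
    observed = [
        {"topic": "orders", "partition": "0", "client_id": "app-1"},
        {"topic": "orders", "partition": "0", "client_id": "app-1"},
        {"topic": "orders", "partition": "1", "client_id": "app-1"},
    ]
    print(count_custom_metrics(observed))
```

Because every distinct combination of tag values counts separately, a consumer group reading many partitions with many clients can produce a large number of custom metrics.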

Events

The Confluent Cloud integration does not include any events.

Service Checks

The Confluent Cloud integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.