
Overview
This Agent integration collects message offset metrics from your Kafka consumers. This check fetches the highwater offsets from the Kafka brokers, consumer offsets that are stored in Kafka (or Zookeeper for old-style consumers), and then calculates consumer lag (which is the difference between the broker offset and the consumer offset).
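For example, if a partition's highwater offset on the broker is 1050 and the consumer group's committed offset for that partition is 1000, the check reports a consumer lag of 50 messages for that partition.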
Note:
- This integration ensures that consumer offsets are checked before broker offsets; in the worst case, consumer lag may be a little overstated. Checking these offsets in the reverse order can understate consumer lag to the point of having negative values, which is a dire scenario usually indicating messages are being skipped.
- If you want to collect JMX metrics from your Kafka brokers or Java-based consumers/producers, see the Kafka Broker integration.
Setup
Installation
The Agent’s Kafka consumer check is included in the Datadog Agent package. No additional installation is needed on your Kafka nodes.
Configuration
Containerized
Configure this check on a container running the Kafka Consumer.
See the Autodiscovery Integration Templates for guidance on applying the parameters below.
In Kubernetes, if a single consumer is running across many containers, you can set up this check as a Cluster Check to avoid having multiple checks collect the same metrics.
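For reference, the following is a minimal sketch of such a Cluster Check configuration, deployed through the Datadog Cluster Agent; the broker address and consumer group name are illustrative placeholders:

```yaml
# Sketch of a Cluster Check configuration (for example, in the Cluster
# Agent's conf.d/kafka_consumer.d/conf.yaml). Values are placeholders.
cluster_check: true
init_config:
instances:
  - kafka_connect_str: server:9092
    consumer_groups:
      my_consumer_group: {}
```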
| Parameter | Value |
|---|---|
| `<INTEGRATION_NAME>` | `kafka_consumer` |
| `<INIT_CONFIG>` | blank or `{}` |
| `<INSTANCE_CONFIG>` | `{"kafka_connect_str": "<KAFKA_CONNECT_STR>", "consumer_groups": {"<CONSUMER_NAME>": {}}}` For example: `{"kafka_connect_str": "server:9092", "consumer_groups": {"my_consumer_group": {}}}` |
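As an illustration, these template values can be applied as Autodiscovery annotations on the pod running your consumer. This is a sketch assuming a container named `kafka-consumer`; adjust names and addresses to your deployment:

```yaml
# Sketch: Autodiscovery annotations in the pod spec's metadata.
# "kafka-consumer" is an assumed container name.
ad.datadoghq.com/kafka-consumer.check_names: '["kafka_consumer"]'
ad.datadoghq.com/kafka-consumer.init_configs: '[{}]'
ad.datadoghq.com/kafka-consumer.instances: |
  [
    {
      "kafka_connect_str": "server:9092",
      "consumer_groups": {"my_consumer_group": {}}
    }
  ]
```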
Host
Configure this check on a host running the Kafka Consumer.
Avoid having multiple Agents running with the same check configuration, as this puts additional pressure on your Kafka cluster.
- Edit the `kafka_consumer.d/conf.yaml` file in the `conf.d/` folder at the root of your Agent’s configuration directory. See the sample kafka_consumer.d/conf.yaml for all available configuration options. A minimal setup is shown below; a more selective sketch follows these steps.
```yaml
instances:
  - kafka_connect_str: <KAFKA_CONNECT_STR>
    consumer_groups:
      # Monitor all topics for consumer <CONSUMER_NAME>
      <CONSUMER_NAME>: {}
```
- Restart the Agent.
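If you prefer to monitor only specific topics or partitions for a group rather than everything it consumes, the `consumer_groups` mapping can be narrowed. A sketch, with assumed group, topic, and partition values:

```yaml
# Sketch: restrict collection to particular topics and partitions.
# Group, topic, and partition values are illustrative placeholders.
instances:
  - kafka_connect_str: <KAFKA_CONNECT_STR>
    consumer_groups:
      my_consumer_group:
        my_topic: [0, 1]    # only partitions 0 and 1 of my_topic
        other_topic: []     # an empty list monitors all partitions
```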
Validation
- Run the Agent’s status subcommand and look for `kafka_consumer` under the Checks section.
- Ensure the metric `kafka.consumer_lag` is generated for the appropriate `consumer_group`.
Data Collected
Metrics
| Metric | Description |
|---|---|
| `kafka.broker_offset` (gauge) | Current message offset on broker. Shown as offset. |
| `kafka.consumer_lag` (gauge) | Lag in messages between consumer and broker. Shown as offset. |
| `kafka.consumer_offset` (gauge) | Current message offset on consumer. Shown as offset. |
| `kafka.estimated_consumer_lag` (gauge) | Lag in seconds between consumer and broker. This metric is provided through Data Streams Monitoring. Additional charges may apply. Shown as second. |
Kafka messages
This integration is used by Data Streams Monitoring to retrieve messages from Kafka on demand.
Events
`consumer_lag`: The Datadog Agent emits an event when the value of the `consumer_lag` metric goes below 0, tagging it with `topic`, `partition`, and `consumer_group`.
Service Checks
The Kafka consumer check does not include any service checks.
Troubleshooting
Kerberos GSSAPI Authentication
Depending on your Kafka cluster’s Kerberos setup, you may need to configure the following (a combined instance configuration sketch appears after the systemd example at the end of this section):
- Kafka client configured for the Datadog Agent to connect to the Kafka broker. The Kafka client should be added as a Kerberos principal and added to a Kerberos keytab, and it should have a valid Kerberos ticket.
- TLS certificate to authenticate a secure connection to the Kafka broker.
  - If a JKS keystore is used, a certificate needs to be exported from the keystore, and the file path should be configured with the applicable `tls_cert` and/or `tls_ca_cert` options.
  - If a private key is required to authenticate the certificate, it should be configured with the `tls_private_key` option. If applicable, the private key password should be configured with the `tls_private_key_password` option.
- `KRB5_CLIENT_KTNAME` environment variable pointing to the Kafka client’s Kerberos keytab location if it differs from the default path (for example, `KRB5_CLIENT_KTNAME=/etc/krb5.keytab`).
- `KRB5CCNAME` environment variable pointing to the Kafka client’s Kerberos credentials ticket cache if it differs from the default path (for example, `KRB5CCNAME=/tmp/krb5cc_xxx`).
- If the Datadog Agent is unable to access the environment variables, configure them in a Datadog Agent service configuration override file for your operating system. The procedure for modifying the Datadog Agent service unit file may vary across Linux distributions. For example, in a Linux `systemd` environment:
Linux Systemd Example
- Configure the environment variables in an environment file, for example `/path/to/environment/file`:

```
KRB5_CLIENT_KTNAME=/etc/krb5.keytab
KRB5CCNAME=/tmp/krb5cc_xxx
```

- Create a Datadog Agent service configuration override file: `sudo systemctl edit datadog-agent.service`
- Configure the following in the override file:

```
[Service]
EnvironmentFile=/path/to/environment/file
```

- Run the following commands to reload the systemd daemon and restart the Datadog Agent service:

```
sudo systemctl daemon-reload
sudo systemctl restart datadog-agent.service
sudo service datadog-agent restart
```
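Putting these pieces together, a Kerberos-enabled instance might look like the following sketch. The option names come from the sample `kafka_consumer.d/conf.yaml`; the broker address, paths, and password are assumptions to adapt to your environment:

```yaml
# Sketch: kafka_consumer instance authenticating over SASL_SSL with GSSAPI.
# All paths, addresses, and secrets below are illustrative placeholders.
instances:
  - kafka_connect_str: server:9092
    security_protocol: SASL_SSL
    sasl_mechanism: GSSAPI
    sasl_kerberos_service_name: kafka
    tls_cert: /path/to/tls_cert.pem
    tls_ca_cert: /path/to/tls_ca_cert.pem
    tls_private_key: /path/to/tls_private_key.pem
    tls_private_key_password: <PRIVATE_KEY_PASSWORD>
    consumer_groups:
      my_consumer_group: {}
```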
Further Reading