Monitor your primary and standby HDFS NameNodes to know when your cluster falls into a precarious state: when only one NameNode remains, or when it’s time to add more capacity to the cluster. This Agent check collects metrics for remaining capacity, corrupt/missing blocks, dead DataNodes, filesystem load, under-replicated blocks, total volume failures (across all DataNodes), and more.
Use this check (hdfs_namenode) and its counterpart check (hdfs_datanode) instead of the older two-in-one check (hdfs), which is deprecated.
Setup
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
Installation
The HDFS NameNode check is included in the Datadog Agent package, so you don’t need to install anything else on your NameNodes.
Configuration
Connect the Agent
Host
To configure this check for an Agent running on a host:
init_config:

instances:
  ## @param hdfs_namenode_jmx_uri - string - required
  ## The HDFS NameNode check retrieves metrics from the HDFS NameNode's JMX
  ## interface via HTTP(S) (not a JMX remote connection). This check must be installed on
  ## a HDFS NameNode. The HDFS NameNode JMX URI is composed of the NameNode's hostname and port.
  ##
  ## The hostname and port can be found in the hdfs-site.xml conf file under
  ## the property dfs.namenode.http-address
  ## https://hadoop.apache.org/docs/r3.1.3/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml
  #
  - hdfs_namenode_jmx_uri: http://localhost:9870
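Because the check reads the NameNode's JMX interface over plain HTTP(S), you can sanity-check the URI before pointing the Agent at it. The following is a minimal sketch, not the Agent's own implementation. It assumes the NameNode serves its JMX data as JSON under the /jmx path with a top-level "beans" list, and it uses the FSNamesystemState bean as an example query. The helper names (build_jmx_url, extract_bean, fetch_bean) are hypothetical and chosen for illustration only.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Example bean to query; assumed to be exposed by the NameNode's JMX servlet.
FS_STATE_BEAN = "Hadoop:service=NameNode,name=FSNamesystemState"


def build_jmx_url(jmx_uri, bean):
    """Build the HTTP JMX query URL from the configured hdfs_namenode_jmx_uri."""
    return f"{jmx_uri.rstrip('/')}/jmx?{urlencode({'qry': bean})}"


def extract_bean(payload, bean):
    """Return the bean dict with the given name from a parsed JMX response, or None."""
    for candidate in payload.get("beans", []):
        if candidate.get("name") == bean:
            return candidate
    return None


def fetch_bean(jmx_uri, bean=FS_STATE_BEAN, timeout=5):
    """Fetch one bean over HTTP(S); raises on connection or HTTP errors."""
    with urlopen(build_jmx_url(jmx_uri, bean), timeout=timeout) as resp:
        return extract_bean(json.load(resp), bean)


if __name__ == "__main__":
    # Same value as hdfs_namenode_jmx_uri in the configuration above.
    state = fetch_bean("http://localhost:9870")
    if state is None:
        print("Bean not found; check the URI and the NameNode version.")
    else:
        # Print every numeric attribute the bean reports.
        for key, value in sorted(state.items()):
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                print(f"{key} = {value}")
```

If the script cannot connect, confirm the hostname and port against the dfs.namenode.http-address property in hdfs-site.xml before troubleshooting the Agent itself.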