---
title: Cloudera
description: Cloudera
breadcrumbs: Docs > Integrations > Cloudera
---

# Cloudera
**Integration version:** 3.5.0
## Overview{% #overview %}

This integration monitors your [Cloudera Data Platform](https://www.cloudera.com/products/cloudera-data-platform.html) through the Datadog Agent, allowing you to submit metrics and service checks on the health of your Cloudera Data Hub clusters, hosts, and roles.

**Minimum Agent version:** 7.42.0

## Setup{% #setup %}

Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the [Autodiscovery Integration Templates](https://docs.datadoghq.com/agent/kubernetes/integrations/) for guidance on applying these instructions.

### Installation{% #installation %}

The Cloudera check is included in the [Datadog Agent](https://app.datadoghq.com/account/settings/agent/latest) package. No additional installation is needed on your server.

### Configuration{% #configuration %}

#### Requirements{% #requirements %}

The Cloudera check requires version 7 of Cloudera Manager.

#### Prepare Cloudera Manager{% #prepare-cloudera-manager %}

As a best practice, Datadog recommends creating a machine user with read-only access to limit the permissions granted to the Datadog Agent.

1. In Cloudera Data Platform, navigate to the Management Console and click on the **User Management** tab. 

1. Click on **Actions**, then **Create Machine User** to create the machine user that queries the Cloudera Manager through the Datadog Agent. 

1. If the workload password hasn't been set, click on **Set Workload Password** after the user is created.

{% tab title="Host" %}
#### Host{% #host %}

1. Edit the `cloudera.d/conf.yaml` file in the `conf.d/` folder at the root of your Agent's configuration directory to start collecting your Cloudera cluster and host data. See the [sample cloudera.d/conf.yaml](https://github.com/DataDog/integrations-core/blob/master/cloudera/datadog_checks/cloudera/data/conf.yaml.example) for all available configuration options. **Note**: The `api_url` should end with the API version.

   ```yaml
   init_config:
   
      ## @param workload_username - string - required
      ## The Workload username. This value can be found in the `User Management` tab of the Management 
      ## Console in the `Workload User Name`.
      #
      workload_username: <WORKLOAD_USERNAME>
   
      ## @param workload_password - string - required
      ## The Workload password. This value can be found in the `User Management` tab of the Management 
      ## Console in the `Workload Password`.
      #
      workload_password: <WORKLOAD_PASSWORD>
   
   ## Every instance is scheduled independently of the others.
   #
   instances:
   
      ## @param api_url - string - required
      ## The URL endpoint for the Cloudera Manager API. This can be found under the Endpoints tab for 
      ## your Data Hub to monitor. 
      ##
      ## Note: The version of the Cloudera Manager API needs to be appended at the end of the URL. 
      ## For example, using v48 of the API for Data Hub `cluster_1` should result with a URL similar 
      ## to the following:
      ## `https://cluster1.cloudera.site/cluster_1/cdp-proxy-api/cm-api/v48`
      #
      - api_url: <API_URL>
   ```

1. [Restart the Agent](https://docs.datadoghq.com/agent/guide/agent-commands/#start-stop-and-restart-the-agent) to start collecting and sending Cloudera Data Hub cluster data to Datadog.

{% /tab %}

{% tab title="Containerized" %}
#### Containerized{% #containerized %}

For containerized environments, see the [Autodiscovery Integration Templates](https://docs.datadoghq.com/agent/kubernetes/integrations/) for guidance on applying the parameters below.

| Parameter            | Value                                                                                      |
| -------------------- | ------------------------------------------------------------------------------------------ |
| `<INTEGRATION_NAME>` | `cloudera`                                                                                 |
| `<INIT_CONFIG>`      | `{"workload_username": "<WORKLOAD_USERNAME>", "workload_password": "<WORKLOAD_PASSWORD>"}` |
| `<INSTANCE_CONFIG>`  | `{"api_url": "<API_URL>"}`                                                                 |

{% /tab %}
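The `api_url` requirement and the JSON parameters above can be checked before they are deployed. The following is a minimal sketch, not part of the integration: the function names `has_api_version` and `autodiscovery_parameters` are hypothetical, and the version pattern (`/v` followed by digits) is an assumption based on the `v48` example in the sample configuration.

```python
import json
import re


def has_api_version(api_url: str) -> bool:
    """Return True if the URL ends with a version segment such as /v48.

    Assumption: API versions always look like "v<digits>".
    """
    return re.search(r"/v\d+/?$", api_url) is not None


def autodiscovery_parameters(username: str, password: str, api_url: str) -> dict:
    """Build the three template parameters as JSON strings (hypothetical helper)."""
    if not has_api_version(api_url):
        raise ValueError("api_url should end with the API version, for example /v48")
    return {
        "<INTEGRATION_NAME>": "cloudera",
        # json.dumps guarantees balanced, double-quoted JSON.
        "<INIT_CONFIG>": json.dumps(
            {"workload_username": username, "workload_password": password}
        ),
        "<INSTANCE_CONFIG>": json.dumps({"api_url": api_url}),
    }


if __name__ == "__main__":
    url = "https://cluster1.cloudera.site/cluster_1/cdp-proxy-api/cm-api/v48"
    for key, value in autodiscovery_parameters("<WORKLOAD_USERNAME>", "<WORKLOAD_PASSWORD>", url).items():
        print(key, value)
```

Generating the JSON with `json.dumps` rather than writing it by hand avoids quoting mistakes in the template values.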

#### Clusters Discovery{% #clusters-discovery %}

Use the `clusters` configuration option to control how your clusters are discovered. It accepts the following parameters:

- {% dl %}
  
  {% dt %}
`limit`
  {% /dt %}

  {% dd %}
  Maximum number of items to be autodiscovered. **Default value**: `None` (all clusters will be processed)
    {% /dd %}

    {% /dl %}
- {% dl %}
  
  {% dt %}
`include`
  {% /dt %}

  {% dd %}
  Mapping of regular expression keys and component config values to autodiscover. **Default value**: empty map
    {% /dd %}

    {% /dl %}
- {% dl %}
  
  {% dt %}
`exclude`
  {% /dt %}

  {% dd %}
  List of regular expressions with the patterns of components to exclude from autodiscovery. **Default value**: empty list
    {% /dd %}

    {% /dl %}
- {% dl %}
  
  {% dt %}
`interval`
  {% /dt %}

  {% dd %}
  Validity time in seconds of the last list of clusters obtained through the endpoint. **Default value**: `None` (no cache used)
    {% /dd %}

    {% /dl %}

**Examples**:

Process a maximum of `5` clusters with names that start with `my_cluster`:

```yaml
clusters:
  limit: 5
  include:
    - 'my_cluster.*'
```

Process a maximum of `20` clusters and exclude those with names that start with `tmp_`:

```yaml
clusters:
  limit: 20
  include:
    - '.*'
  exclude:
    - 'tmp_.*'
```
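To make the interaction between `limit`, `include`, and `exclude` concrete, here is a minimal sketch of one way such a filter could behave. It is not the Agent's implementation: the function `select_clusters` is hypothetical, and it assumes that patterns are matched from the start of the cluster name, that `exclude` takes precedence over `include`, and that `limit` is applied after filtering.

```python
import re
from typing import Iterable, List, Optional, Sequence


def select_clusters(
    names: Iterable[str],
    include: Sequence[str],
    exclude: Sequence[str] = (),
    limit: Optional[int] = None,
) -> List[str]:
    """Return the cluster names that would be processed (illustrative only)."""
    selected: List[str] = []
    for name in names:
        # Assumption: an exclude match always wins over an include match.
        if any(re.match(pattern, name) for pattern in exclude):
            continue
        if not any(re.match(pattern, name) for pattern in include):
            continue
        selected.append(name)
        # Assumption: the limit counts clusters that passed the filters.
        if limit is not None and len(selected) >= limit:
            break
    return selected


if __name__ == "__main__":
    clusters = ["my_cluster_1", "tmp_scratch", "prod", "my_cluster_2"]
    print(select_clusters(clusters, include=["my_cluster.*"], limit=5))
    print(select_clusters(clusters, include=[".*"], exclude=["tmp_.*"], limit=20))
```

With the hypothetical cluster names above, the first call mirrors the first YAML example and the second call mirrors the `tmp_` exclusion example.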

#### Custom Queries{% #custom-queries %}

You can configure the Cloudera integration to collect custom metrics that are not collected by default by running custom timeseries queries. These queries use [the tsquery language](https://docs.cloudera.com/cloudera-manager/7.9.0/monitoring-and-diagnostics/topics/cm-tsquery-syntax.html) to retrieve data from Cloudera Manager.

**Example**:

Collect JVM garbage collection rate and JVM free memory with `cloudera_jvm` as a custom tag:

```yaml
custom_queries:
- query: select last(jvm_gc_rate) as jvm_gc_rate, last(jvm_free_memory) as jvm_free_memory
  tags: cloudera_jvm
```

Note: These queries can take advantage of metric expressions, resulting in queries such as `total_cpu_user + total_cpu_system`, `1000 * jvm_gc_time_ms / jvm_gc_count`, and `max(total_cpu_user)`. When using metric expressions, make sure to also include aliases for the metrics; otherwise, the metric names may be incorrectly formatted. For example, `SELECT last(jvm_gc_count)` results in the metric `cloudera.<CATEGORY>.last_jvm_gc_count`. Appending an alias, as in `SELECT last(jvm_gc_count) as jvm_gc_count`, generates the metric `cloudera.<CATEGORY>.jvm_gc_count` instead.
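The effect of an alias on the resulting metric name can be illustrated with a short sketch. This is not the integration's code: the function `metric_name` and the category value `role` are hypothetical, and the normalization rule (runs of non-alphanumeric characters collapse to a single underscore) is an assumption chosen only because it reproduces the `last_jvm_gc_count` example above.

```python
import re
from typing import Optional


def metric_name(category: str, expression: str, alias: Optional[str] = None) -> str:
    """Derive a Datadog metric name from a tsquery expression (illustrative only)."""
    raw = alias if alias else expression
    # Assumed normalization: collapse anything that is not a letter or digit
    # into one underscore, then trim underscores at both ends.
    normalized = re.sub(r"[^A-Za-z0-9]+", "_", raw).strip("_")
    return f"cloudera.{category}.{normalized}"


if __name__ == "__main__":
    print(metric_name("role", "last(jvm_gc_count)"))
    print(metric_name("role", "last(jvm_gc_count)", alias="jvm_gc_count"))
    print(metric_name("role", "1000 * jvm_gc_time_ms / jvm_gc_count"))
```

Under this assumed rule, an unaliased arithmetic expression produces a long, hard-to-read name, which is why an explicit alias is the safer choice.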

### Validation{% #validation %}

[Run the Agent's status subcommand](https://docs.datadoghq.com/agent/guide/agent-commands/#agent-status-and-information) and look for `cloudera` under the Checks section.

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric | Description |
| ------ | ----------- |
| **cloudera.cluster.cpu\_percent\_across\_hosts** (gauge) | Percent of the Host CPU Usage metric computed across all this entity's descendant Host entities *Shown as percent* |
| **cloudera.cluster.total\_bytes\_receive\_rate\_across\_network\_interfaces** (gauge) | The sum of the Bytes Received metric computed across all this entity's descendant Network Interface entities *Shown as byte* |
| **cloudera.cluster.total\_bytes\_transmit\_rate\_across\_network\_interfaces** (gauge) | The sum of the Bytes Transmitted metric computed across all this entity's descendant Network Interface entities *Shown as byte* |
| **cloudera.cluster.total\_read\_bytes\_rate\_across\_disks** (gauge) | The sum of the Disk Bytes Read metric computed across all this entity's descendant Disk entities *Shown as byte* |
| **cloudera.cluster.total\_write\_bytes\_rate\_across\_disks** (gauge) | The sum of the Disk Bytes Written metric computed across all this entity's descendant Disk entities *Shown as byte* |
| **cloudera.disk.await\_read\_time** (gauge) | The average disk await read time of the entity *Shown as millisecond* |
| **cloudera.disk.await\_time** (gauge) | The average disk await time of the entity *Shown as millisecond* |
| **cloudera.disk.await\_write\_time** (gauge) | The average disk await write time of the entity *Shown as millisecond* |
| **cloudera.disk.service\_time** (gauge) | The average disk service time of the entity *Shown as millisecond* |
| **cloudera.host.alerts\_rate** (gauge) | The number of alerts per second *Shown as event* |
| **cloudera.host.cpu\_iowait\_rate** (gauge) | Total CPU iowait time |
| **cloudera.host.cpu\_irq\_rate** (gauge) | Total CPU IRQ time |
| **cloudera.host.cpu\_nice\_rate** (gauge) | Total CPU nice time |
| **cloudera.host.cpu\_soft\_irq\_rate** (gauge) | Total CPU soft IRQ time |
| **cloudera.host.cpu\_steal\_rate** (gauge) | Stolen time, which is the time spent in other operating systems when running in a virtualized environment |
| **cloudera.host.cpu\_system\_rate** (gauge) | Total System CPU |
| **cloudera.host.cpu\_user\_rate** (gauge) | Total CPU user time |
| **cloudera.host.events\_critical\_rate** (gauge) | The number of critical events |
| **cloudera.host.events\_important\_rate** (gauge) | The number of important events |
| **cloudera.host.health\_bad\_rate** (gauge) | Percentage of Time with Bad Health |
| **cloudera.host.health\_concerning\_rate** (gauge) | Percentage of Time with Concerning Health |
| **cloudera.host.health\_disabled\_rate** (gauge) | Percentage of Time with Disabled Health |
| **cloudera.host.health\_good\_rate** (gauge) | Percentage of Time with Good Health |
| **cloudera.host.health\_unknown\_rate** (gauge) | Percentage of Time with Unknown Health |
| **cloudera.host.load\_1** (gauge) | Load Average over 1 minute |
| **cloudera.host.load\_15** (gauge) | Load Average over 15 minutes |
| **cloudera.host.load\_5** (gauge) | Load Average over 5 minutes |
| **cloudera.host.num\_cores** (gauge) | Total number of cores |
| **cloudera.host.num\_physical\_cores** (gauge) | Total number of physical cores |
| **cloudera.host.physical\_memory\_buffers** (gauge) | The amount of physical memory devoted to temporary storage for raw disk blocks *Shown as byte* |
| **cloudera.host.physical\_memory\_cached** (gauge) | The amount of physical memory used for files read from the disk. This is commonly referred to as the pagecache *Shown as byte* |
| **cloudera.host.physical\_memory\_total** (gauge) | The total physical memory available *Shown as byte* |
| **cloudera.host.physical\_memory\_used** (gauge) | The total amount of memory being used, excluding buffers and cache *Shown as byte* |
| **cloudera.host.swap\_out\_rate** (gauge) | Memory swapped out to disk *Shown as page* |
| **cloudera.host.swap\_used** (gauge) | Swap used *Shown as byte* |
| **cloudera.host.total\_bytes\_receive\_rate\_across\_network\_interfaces** (gauge) | The sum of the Bytes Received metric computed across all this entity's descendant Network Interface entities *Shown as byte* |
| **cloudera.host.total\_bytes\_transmit\_rate\_across\_network\_interfaces** (gauge) | The sum of the Bytes Transmitted metric computed across all this entity's descendant Network Interface entities *Shown as byte* |
| **cloudera.host.total\_phys\_mem\_bytes** (gauge) | Total physical memory in bytes *Shown as byte* |
| **cloudera.host.total\_read\_bytes\_rate\_across\_disks** (gauge) | The sum of the Disk Bytes Read metric computed across all this entity's descendant Disk entities *Shown as byte* |
| **cloudera.host.total\_read\_ios\_rate\_across\_disks** (gauge) | The sum of the Disk Reads metric computed across all this entity's descendant Disk entities *Shown as operation* |
| **cloudera.host.total\_write\_bytes\_rate\_across\_disks** (gauge) | The sum of the Disk Bytes Written metric computed across all this entity's descendant Disk entities *Shown as byte* |
| **cloudera.host.total\_write\_ios\_rate\_across\_disks** (gauge) | The sum of the Disk Writes metric computed across all this entity's descendant Disk entities *Shown as operation* |
| **cloudera.role.cpu\_system\_rate** (gauge) | Total System CPU |
| **cloudera.role.cpu\_user\_rate** (gauge) | Total CPU user time |
| **cloudera.role.mem\_rss** (gauge) | Resident memory used *Shown as byte* |

### Events{% #events %}

The Cloudera integration collects events emitted by the `/events` endpoint of the Cloudera Manager API. Event levels are mapped as follows:

| Cloudera        | Datadog |
| --------------- | ------- |
| `UNKNOWN`       | `error` |
| `INFORMATIONAL` | `info`  |
| `IMPORTANT`     | `info`  |
| `CRITICAL`      | `error` |
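The mapping in the table above is small enough to express as a lookup. The sketch below is illustrative rather than the integration's code: the names `EVENT_LEVELS` and `datadog_alert_type` are hypothetical, and because the table does not say how unlisted levels are handled, the function simply raises `KeyError` for them.

```python
# Cloudera event level -> Datadog event level, copied from the table above.
EVENT_LEVELS = {
    "UNKNOWN": "error",
    "INFORMATIONAL": "info",
    "IMPORTANT": "info",
    "CRITICAL": "error",
}


def datadog_alert_type(cloudera_level: str) -> str:
    """Translate a Cloudera event level into its Datadog counterpart.

    Raises KeyError for levels that are not listed in the table.
    """
    return EVENT_LEVELS[cloudera_level.upper()]


if __name__ == "__main__":
    for level in EVENT_LEVELS:
        print(level, "->", datadog_alert_type(level))
```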

### Service Checks{% #service-checks %}

**cloudera.can\_connect**

Returns `OK` if the check is able to connect to the Cloudera Manager API and collect metrics, `CRITICAL` otherwise.

*Statuses: ok, critical*

**cloudera.cluster.health**

Returns `OK` if the cluster is in good health or is starting, `WARNING` if the cluster is stopping or the health is concerning, `CRITICAL` if the cluster is down or in bad health, and `UNKNOWN` otherwise.

*Statuses: ok, critical, warning, unknown*

**cloudera.host.health**

Returns `OK` if the host is in good health or is starting, `WARNING` if the host is stopping or the health is concerning, `CRITICAL` if the host is down or in bad health, and `UNKNOWN` otherwise.

*Statuses: ok, critical, warning, unknown*
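The status rules for `cloudera.cluster.health` and `cloudera.host.health` can be summarized as a lookup with a default. This is a minimal sketch, not the integration's implementation: the entity status strings (`GOOD_HEALTH`, `STARTING`, and so on) are assumptions about what the Cloudera Manager API reports, and `health_status` is a hypothetical helper. The numeric values follow the usual Datadog service check convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
from enum import Enum


class Status(Enum):
    OK = 0
    WARNING = 1
    CRITICAL = 2
    UNKNOWN = 3


# Assumed Cloudera entity status strings, grouped by the rules described above.
_STATUS_BY_ENTITY_STATUS = {
    "GOOD_HEALTH": Status.OK,
    "STARTING": Status.OK,
    "STOPPING": Status.WARNING,
    "CONCERNING_HEALTH": Status.WARNING,
    "DOWN": Status.CRITICAL,
    "BAD_HEALTH": Status.CRITICAL,
}


def health_status(entity_status: str) -> Status:
    """Map a cluster or host status to a service check status.

    Anything not listed above falls through to UNKNOWN.
    """
    return _STATUS_BY_ENTITY_STATUS.get(entity_status, Status.UNKNOWN)


if __name__ == "__main__":
    for value in ("GOOD_HEALTH", "STOPPING", "DOWN", "SOMETHING_ELSE"):
        print(value, "->", health_status(value).name)
```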

## Troubleshooting{% #troubleshooting %}

### Collecting metrics of Datadog integrations on Cloudera hosts{% #collecting-metrics-of-datadog-integrations-on-cloudera-hosts %}

To install the Datadog Agent on a Cloudera host, make sure that the security group associated with the host allows SSH access. Then, use the [root user `cloudbreak`](https://docs.cloudera.com/data-hub/cloud/access-clusters/topics/mc-accessing-cluster-via-ssh.html) to access the host with the SSH key generated during environment creation:

```
sudo ssh -i "/path/to/key.pem" cloudbreak@<HOST_IP_ADDRESS>
```

The workload username and password can be used to access Cloudera hosts through SSH, although only the `cloudbreak` user can install the Datadog Agent. Trying to use any user that is not `cloudbreak` may result in the following error:

```
<NON_CLOUDBREAK_USER> is not allowed to run sudo on <CLOUDERA_HOSTNAME>.  This incident will be reported.
```

### Config errors when collecting Datadog metrics{% #config-errors-when-collecting-datadog-metrics %}

You may see something similar to the following in the Agent status when collecting metrics from your Cloudera host:

```
  Config Errors
  ==============
    zk
    --
      open /etc/datadog-agent/conf.d/zk.d/conf.yaml: permission denied
```

To fix this, change the ownership of the `conf.yaml` file to `dd-agent`:

```
[cloudbreak@<CLOUDERA_HOSTNAME> ~]$ sudo chown -R dd-agent:dd-agent /etc/datadog-agent/conf.d/zk.d/conf.yaml
```

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).

## Further Reading{% #further-reading %}

- [Gain visibility into your Cloudera clusters with Datadog](https://www.datadoghq.com/blog/cloudera-integration-announcement/)
