Ambari

Agent Check

Supported OS: Linux, Mac OS

Overview

This check monitors Ambari through the Datadog Agent.

Setup

Installation

The Ambari check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

  1. To start collecting your Ambari performance data, edit the ambari.d/conf.yaml file in the conf.d/ folder at the root of your Agent’s configuration directory. See the sample ambari.d/conf.yaml for all available configuration options.

  2. Restart the Agent.

Log Collection

To enable log collection in the Datadog Agent, set logs_enabled to true in datadog.yaml:

    logs_enabled: true

Next, uncomment the logs section at the bottom of ambari.d/conf.yaml and set path to the location of your Ambari log files.

    logs:
      - type: file
        path: /var/log/ambari-server/ambari-alerts.log
        source: ambari
        service: ambari
        log_processing_rules:
          - type: multi_line
            name: new_log_start_with_date
            pattern: \d{4}\-(0?[1-9]|1[012])\-(0?[1-9]|[12][0-9]|3[01])  # 2019-04-22 15:47:00,999
...
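
To see what the multi_line rule does, the following Python sketch applies the same pattern to a few lines and groups them into events. It assumes the pattern is tested against the beginning of each line; the helper name group_multiline and the sample lines are hypothetical and only illustrate the grouping, not the Agent's actual implementation.

```python
import re

# Pattern copied from the log_processing_rules entry in the configuration.
NEW_LOG_START = re.compile(r"\d{4}\-(0?[1-9]|1[012])\-(0?[1-9]|[12][0-9]|3[01])")


def group_multiline(lines):
    """Group raw lines into events.

    A line that starts with a date opens a new event; any other line is
    appended to the event before it (assumed behavior, for illustration).
    """
    events = []
    for line in lines:
        if NEW_LOG_START.match(line) or not events:
            events.append(line)
        else:
            events[-1] += "\n" + line
    return events


# Hypothetical sample lines, not taken from a real Ambari log.
sample = [
    "2019-04-22 15:47:00,999 [CRITICAL] example alert text",
    "    continuation line one",
    "    continuation line two",
    "2019-04-22 15:48:01,120 [OK] another example alert",
]

events = group_multiline(sample)
for event in events:
    print(repr(event))
```

With this sample, the two indented lines are attached to the first dated line, so the four raw lines become two events.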

Validation

Run the Agent’s status subcommand and look for ambari under the Checks section.

Data Collected

For every host in every cluster, this integration collects the following system metrics:

  • boottime
  • cpu
  • disk
  • memory
  • load
  • network
  • process

If service metrics collection is enabled with collect_service_metrics, this integration also collects, for each whitelisted service component, the metrics whose headers appear in the whitelist.
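
As a rough illustration of header-based whitelisting, the Python sketch below keeps only the metric groups whose header is in an allowed set. The function name filter_by_headers, the payload shape, and the metric names are all hypothetical; this is not the check's code, and real Ambari responses differ.

```python
def filter_by_headers(component_metrics, allowed_headers):
    """Keep only top-level metric groups whose header is whitelisted.

    Returns a flat dict keyed by "header.metric_name".
    """
    selected = {}
    for header, values in component_metrics.items():
        if header not in allowed_headers:
            continue  # header is not whitelisted, skip the whole group
        for name, value in values.items():
            selected[f"{header}.{name}"] = value
    return selected


# Hypothetical payload for one service component.
component_metrics = {
    "cpu": {"cpu_idle": 93.5, "cpu_user": 4.1},
    "jvm": {"threads_running": 42},
    "rpc": {"queue_time_avg": 0.2},
}

kept = filter_by_headers(component_metrics, allowed_headers={"cpu", "jvm"})
print(kept)
```

Here the rpc group is dropped because its header is not in the allowed set, while every metric under cpu and jvm is kept.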

Metrics

ambari.boottime (gauge)
Host boot time.
Shown as millisecond.

ambari.cpu.cpu_idle (gauge)
Host idle CPU.
Shown as percent.

ambari.cpu.cpu_nice (gauge)
Host nice CPU.
Shown as percent.

ambari.cpu.cpu_num (gauge)
Number of CPUs on the host.

ambari.cpu.cpu_system (gauge)
Host system CPU.
Shown as percent.

ambari.cpu.cpu_user (gauge)
Host user CPU.
Shown as percent.

ambari.cpu.cpu_wio (gauge)
Host CPU waiting for IO.
Shown as percent.

ambari.disk.disk_free (gauge)
Free disk space.
Shown as byte.

ambari.disk.disk_total (gauge)
Total disk size.
Shown as byte.

ambari.disk.read_bytes (gauge)
Read bytes.
Shown as byte.

ambari.disk.read_count (gauge)
Read count.

ambari.disk.read_time (gauge)
Disk read time.
Shown as millisecond.

ambari.disk.write_bytes (gauge)
Written bytes.
Shown as byte.

ambari.disk.write_count (gauge)
Written count.

ambari.disk.write_time (gauge)
Disk write time.
Shown as millisecond.

ambari.load_fifteen (gauge)
Load fifteen.
Shown as percent.

ambari.load_five (gauge)
Load five.
Shown as percent.

ambari.load_one (gauge)
Load one.
Shown as percent.

ambari.memory.mem_cached (gauge)
Cached memory.
Shown as byte.

ambari.memory.mem_free (gauge)
Free memory.
Shown as byte.

ambari.memory.mem_shared (gauge)
Shared memory.
Shown as byte.

ambari.memory.mem_total (gauge)
Total memory.
Shown as byte.

ambari.memory.swap_free (gauge)
Free swap.
Shown as byte.

ambari.memory.swap_total (gauge)
Total swap.
Shown as byte.

ambari.network.bytes_in (gauge)
Network bytes in.
Shown as byte.

ambari.network.bytes_out (gauge)
Network bytes out.
Shown as byte.

ambari.network.pkts_in (gauge)
Network packets in.
Shown as byte.

ambari.network.pkts_out (gauge)
Network packets out.
Shown as byte.

ambari.process.proc_run (gauge)
Process run.

ambari.process.proc_total (gauge)
Process total.

Service Checks

ambari.can_connect:
Returns OK if the cluster is reachable, CRITICAL otherwise.

ambari.state:
Returns OK if the service is installed or running, WARNING if the service is stopping or uninstalling, or CRITICAL if the service is uninstalled or stopped. For a complete enumeration, see this file.
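
The mapping described for ambari.state can be sketched in Python as a simple lookup. The state names below are hypothetical placeholders derived from the wording above, not the check's real enumeration, and falling back to UNKNOWN for unlisted states is an assumption of this sketch. The numeric values follow the usual Datadog service check statuses (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
# Datadog service check status codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Hypothetical state names based on the description of ambari.state;
# the complete enumeration lives in the check's source.
STATE_TO_STATUS = {
    "INSTALLED": OK,
    "RUNNING": OK,
    "STOPPING": WARNING,
    "UNINSTALLING": WARNING,
    "UNINSTALLED": CRITICAL,
    "STOPPED": CRITICAL,
}


def status_for(state):
    """Translate a service state string into a service check status.

    States missing from the table map to UNKNOWN (assumption of this sketch).
    """
    return STATE_TO_STATUS.get(state.upper(), UNKNOWN)


for state in ("running", "stopping", "stopped", "something_else"):
    print(state, status_for(state))
```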

Events

The Ambari check does not include any events.

Troubleshooting

Need help? Contact Datadog support.

