Map Reduce


Supported OS: Linux, Windows, macOS

Integration version: 7.0.0


Overview

Get metrics from the MapReduce service in real time to:

Visualize and monitor MapReduce states.
Be notified about MapReduce failovers and events.

Setup

Installation

The Mapreduce check is included in the Datadog Agent package, so you don’t need to install anything else on your servers.

Configuration

Host

To configure this check for an Agent running on a host:

Edit the mapreduce.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to point to your server and port and to set the masters to monitor. See the sample mapreduce.d/conf.yaml for all available configuration options.
Restart the Agent.
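
As a rough illustration of the edit described above, a minimal mapreduce.d/conf.yaml might look like the sketch below. The `resourcemanager_uri` and `cluster_name` parameters are the ones this page lists for containerized setups; the host, port, and cluster name values are placeholders chosen for illustration, not defaults you should rely on. Check the sample mapreduce.d/conf.yaml for the authoritative list of options.

```yaml
# Minimal sketch of mapreduce.d/conf.yaml (values are assumptions).
init_config:

instances:
    # Address of the YARN ResourceManager to query.
    # "localhost:8088" is a placeholder; replace it with your server and port.
  - resourcemanager_uri: http://localhost:8088
    # Name used to tag metrics from this cluster (hypothetical value).
    cluster_name: <MAPREDUCE_CLUSTER_NAME>
```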

Log collection

Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
```
logs_enabled: true
```

Uncomment and edit the logs configuration block in your mapreduce.d/conf.yaml file. Change the type, path, and service parameter values based on your environment. See the sample mapreduce.d/conf.yaml for all available configuration options.

```
logs:
  - type: file
    path: <LOG_FILE_PATH>
    source: mapreduce
    service: <SERVICE_NAME>
    # To handle multi line that starts with yyyy-mm-dd use the following pattern
    # log_processing_rules:
    #   - type: multi_line
    #     pattern: \d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}
    #     name: new_log_start_with_date
```

Restart the Agent.

Containerized

For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.

Parameter	Value
`<INTEGRATION_NAME>`	`mapreduce`
`<INIT_CONFIG>`	blank or `{}`
`<INSTANCE_CONFIG>`	`{"resourcemanager_uri": "https://%%host%%:8088", "cluster_name":"<MAPREDUCE_CLUSTER_NAME>"}`
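
For example, with Docker Compose the parameters above could be applied as Autodiscovery labels on the container that runs the ResourceManager. This is only a sketch under assumptions: the service name and image are hypothetical placeholders, and the same values can be supplied through other Autodiscovery mechanisms (such as Kubernetes pod annotations) instead.

```yaml
# Sketch of a docker-compose.yaml fragment (service name and image are placeholders).
services:
  resourcemanager:
    image: <IMAGE_NAME>
    labels:
      # <INTEGRATION_NAME>
      com.datadoghq.ad.check_names: '["mapreduce"]'
      # <INIT_CONFIG>
      com.datadoghq.ad.init_configs: '[{}]'
      # <INSTANCE_CONFIG>; %%host%% is resolved by the Agent at runtime
      com.datadoghq.ad.instances: '[{"resourcemanager_uri": "https://%%host%%:8088", "cluster_name": "<MAPREDUCE_CLUSTER_NAME>"}]'
```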

Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Docker Log Collection.

Then, set log integrations as Docker labels:

```
LABEL "com.datadoghq.ad.logs"='[{"source": "mapreduce", "service": "<SERVICE_NAME>"}]'
```

Validation

Run the Agent’s status subcommand and look for mapreduce under the Checks section.

Data Collected

Metrics


mapreduce.job.counter.map_counter_value (rate)	Counter value of map tasks. Shown as task.
mapreduce.job.counter.reduce_counter_value (rate)	Counter value of reduce tasks. Shown as task.
mapreduce.job.counter.total_counter_value (rate)	Counter value of all tasks. Shown as task.
mapreduce.job.elapsed_time.95percentile (gauge)	95th percentile elapsed time since the application started. Shown as millisecond.
mapreduce.job.elapsed_time.avg (gauge)	Average elapsed time since the application started. Shown as millisecond.
mapreduce.job.elapsed_time.count (rate)	Number of times the elapsed time was sampled.
mapreduce.job.elapsed_time.max (gauge)	Max elapsed time since the application started. Shown as millisecond.
mapreduce.job.elapsed_time.median (gauge)	Median elapsed time since the application started. Shown as millisecond.
mapreduce.job.failed_map_attempts (rate)	Number of failed map attempts. Shown as task.
mapreduce.job.failed_reduce_attempts (rate)	Number of failed reduce attempts. Shown as task.
mapreduce.job.killed_map_attempts (rate)	Number of killed map attempts. Shown as task.
mapreduce.job.killed_reduce_attempts (rate)	Number of killed reduce attempts. Shown as task.
mapreduce.job.map.task.elapsed_time.95percentile (gauge)	95th percentile elapsed time of all map tasks. Shown as millisecond.
mapreduce.job.map.task.elapsed_time.avg (gauge)	Average elapsed time of all map tasks. Shown as millisecond.
mapreduce.job.map.task.elapsed_time.count (rate)	Number of times the map task elapsed time was sampled.
mapreduce.job.map.task.elapsed_time.max (gauge)	Max elapsed time of all map tasks. Shown as millisecond.
mapreduce.job.map.task.elapsed_time.median (gauge)	Median elapsed time of all map tasks. Shown as millisecond.
mapreduce.job.maps_completed (rate)	Number of completed maps. Shown as task.
mapreduce.job.maps_pending (rate)	Number of pending maps. Shown as task.
mapreduce.job.maps_running (rate)	Number of running maps. Shown as task.
mapreduce.job.maps_total (rate)	Total number of maps. Shown as task.
mapreduce.job.new_map_attempts (rate)	Number of new map attempts. Shown as task.
mapreduce.job.new_reduce_attempts (rate)	Number of new reduce attempts. Shown as task.
mapreduce.job.reduce.task.elapsed_time.95percentile (gauge)	95th percentile elapsed time of all reduce tasks. Shown as millisecond.
mapreduce.job.reduce.task.elapsed_time.avg (gauge)	Average elapsed time of all reduce tasks. Shown as millisecond.
mapreduce.job.reduce.task.elapsed_time.count (rate)	Number of times the reduce task elapsed time was sampled.
mapreduce.job.reduce.task.elapsed_time.max (gauge)	Max elapsed time of all reduce tasks. Shown as millisecond.
mapreduce.job.reduce.task.elapsed_time.median (gauge)	Median elapsed time of all reduce tasks. Shown as millisecond.
mapreduce.job.reduces_completed (rate)	Number of completed reduces. Shown as task.
mapreduce.job.reduces_pending (rate)	Number of pending reduces. Shown as task.
mapreduce.job.reduces_running (rate)	Number of running reduces. Shown as task.
mapreduce.job.reduces_total (rate)	Total number of reduces. Shown as task.
mapreduce.job.running_map_attempts (rate)	Number of running map attempts. Shown as task.
mapreduce.job.running_reduce_attempts (rate)	Number of running reduce attempts. Shown as task.
mapreduce.job.successful_map_attempts (rate)	Number of successful map attempts. Shown as task.
mapreduce.job.successful_reduce_attempts (rate)	Number of successful reduce attempts. Shown as task.

Events

The Mapreduce check does not include any events.

Service Checks

mapreduce.resource_manager.can_connect

Returns CRITICAL if the Agent is unable to connect to the Resource Manager. Returns OK otherwise.

Statuses: ok, critical

mapreduce.application_master.can_connect

Returns CRITICAL if the Agent is unable to connect to the Application Master. Returns OK otherwise.

Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.
