Map Reduce

Supported OS: Linux, Mac OS, Windows

Integration version: 3.1.0

MapReduce dashboard

Overview

Collect metrics from the MapReduce service in real time to:

  • Visualize and monitor MapReduce states
  • Be notified about MapReduce failovers and events

Setup

Installation

The MapReduce check is included in the Datadog Agent package, so you don't need to install anything else on your servers.

Configuration

Host

To configure this check for an Agent running on a host:

  1. Edit the mapreduce.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to specify your server and port and to set the masters to monitor. See the sample mapreduce.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
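
As a rough illustration of step 1, a minimal mapreduce.d/conf.yaml might look like the sketch below. The resourcemanager_uri and cluster_name keys are the instance parameters this integration uses in its containerized configuration. The http://localhost:8088 address is an assumption (a ResourceManager reachable on the Agent host, port 8088), so substitute your own. The sample mapreduce.d/conf.yaml remains the reference for the full list of options.

```yaml
init_config:

instances:
  # Assumed address: ResourceManager on the same host as the Agent, port 8088.
  - resourcemanager_uri: http://localhost:8088
    # Placeholder: replace with a name that identifies your cluster.
    cluster_name: <MAPREDUCE_CLUSTER_NAME>
```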

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. Uncomment the logs configuration block in your mapreduce.d/conf.yaml file and edit it. Change the type, path, and service parameter values to match your environment. See the sample mapreduce.d/conf.yaml for all available configuration options.

    logs:
      - type: file
        path: <LOG_FILE_PATH>
        source: mapreduce
        service: <SERVICE_NAME>
        # To handle multi-line logs that start with a yyyy-mm-dd date, use the following pattern
        # log_processing_rules:
        #   - type: multi_line
        #     pattern: \d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}
        #     name: new_log_start_with_date
    
  3. Restart the Agent.

Containerized

For containerized environments, see the Autodiscovery Integration Templates documentation for guidance on applying the parameters below.

Parameter            Value
<INTEGRATION_NAME>   mapreduce
<INIT_CONFIG>        blank or {}
<INSTANCE_CONFIG>    {"resourcemanager_uri": "https://%%host%%:8088", "cluster_name":"<MAPREDUCE_CLUSTER_NAME>"}
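
One way to apply these parameters is through Docker labels on the container that runs the ResourceManager. The sketch below uses a Docker Compose service and assumes the Docker label form of Autodiscovery (com.datadoghq.ad.check_names, com.datadoghq.ad.init_configs, com.datadoghq.ad.instances). The service name resourcemanager and the <IMAGE> placeholder are hypothetical. Check the Autodiscovery documentation for the form that matches your orchestrator.

```yaml
services:
  resourcemanager:            # hypothetical service name
    image: <IMAGE>            # placeholder: your ResourceManager image
    labels:
      # Integration name, init config, and instance config from the table above.
      com.datadoghq.ad.check_names: '["mapreduce"]'
      com.datadoghq.ad.init_configs: '[{}]'
      com.datadoghq.ad.instances: '[{"resourcemanager_uri": "https://%%host%%:8088", "cluster_name": "<MAPREDUCE_CLUSTER_NAME>"}]'
```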

Log collection

Log collection is disabled by default in the Datadog Agent. To enable it, see Docker Log Collection.

Then, set log integrations as Docker labels:

LABEL "com.datadoghq.ad.logs"='[{"source": "mapreduce", "service": "<SERVICE_NAME>"}]'
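
If the container is defined in Docker Compose rather than in a Dockerfile, the same label could be sketched as follows. The service name and the <IMAGE> placeholder are hypothetical; the com.datadoghq.ad.logs label and its value mirror the LABEL line above.

```yaml
services:
  resourcemanager:            # hypothetical service name
    image: <IMAGE>            # placeholder: your ResourceManager image
    labels:
      # Same log integration definition as the Dockerfile LABEL instruction.
      com.datadoghq.ad.logs: '[{"source": "mapreduce", "service": "<SERVICE_NAME>"}]'
```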

Validation

Run the Agent's status subcommand and look for mapreduce under the Checks section.

Data Collected

Metrics

mapreduce.job.counter.map_counter_value
(rate)
Counter value of map tasks
Shown as task
mapreduce.job.counter.reduce_counter_value
(rate)
Counter value of reduce tasks
Shown as task
mapreduce.job.counter.total_counter_value
(rate)
Counter value of all tasks
Shown as task
mapreduce.job.elapsed_time.95percentile
(gauge)
95th percentile elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.avg
(gauge)
Average elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.count
(rate)
Number of times the elapsed time was sampled
mapreduce.job.elapsed_time.max
(gauge)
Max elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.median
(gauge)
Median elapsed time since the application started
Shown as millisecond
mapreduce.job.failed_map_attempts
(rate)
Number of failed map attempts
Shown as task
mapreduce.job.failed_reduce_attempts
(rate)
Number of failed reduce attempts
Shown as task
mapreduce.job.killed_map_attempts
(rate)
Number of killed map attempts
Shown as task
mapreduce.job.killed_reduce_attempts
(rate)
Number of killed reduce attempts
Shown as task
mapreduce.job.map.task.elapsed_time.95percentile
(gauge)
95th percentile of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.avg
(gauge)
Average of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.count
(rate)
Number of times the map tasks elapsed time was sampled
mapreduce.job.map.task.elapsed_time.max
(gauge)
Max of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.median
(gauge)
Median of all map tasks elapsed time
Shown as millisecond
mapreduce.job.maps_completed
(rate)
Number of completed maps
Shown as task
mapreduce.job.maps_pending
(rate)
Number of pending maps
Shown as task
mapreduce.job.maps_running
(rate)
Number of running maps
Shown as task
mapreduce.job.maps_total
(rate)
Total number of maps
Shown as task
mapreduce.job.new_map_attempts
(rate)
Number of new map attempts
Shown as task
mapreduce.job.new_reduce_attempts
(rate)
Number of new reduce attempts
Shown as task
mapreduce.job.reduce.task.elapsed_time.95percentile
(gauge)
95th percentile of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.avg
(gauge)
Average of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.count
(rate)
Number of times the reduce tasks elapsed time was sampled
mapreduce.job.reduce.task.elapsed_time.max
(gauge)
Max of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.median
(gauge)
Median of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduces_completed
(rate)
Number of completed reduces
Shown as task
mapreduce.job.reduces_pending
(rate)
Number of pending reduces
Shown as task
mapreduce.job.reduces_running
(rate)
Number of running reduces
Shown as task
mapreduce.job.reduces_total
(rate)
Total number of reduces
Shown as task
mapreduce.job.running_map_attempts
(rate)
Number of running map attempts
Shown as task
mapreduce.job.running_reduce_attempts
(rate)
Number of running reduce attempts
Shown as task
mapreduce.job.successful_map_attempts
(rate)
Number of successful map attempts
Shown as task
mapreduce.job.successful_reduce_attempts
(rate)
Number of successful reduce attempts
Shown as task

Events

The MapReduce check does not include any events.

Service Checks

mapreduce.resource_manager.can_connect
Returns CRITICAL if the Agent is unable to connect to the Resource Manager. Returns OK otherwise.
Statuses: ok, critical

mapreduce.application_master.can_connect
Returns CRITICAL if the Agent is unable to connect to the Application Master. Returns OK otherwise.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.

Further Reading