MapReduce

Supported OS: Linux, Mac OS, Windows

Integration version: 3.1.0

MapReduce dashboard

Overview

Collect metrics from the MapReduce service in real time to:

  • Visualize and monitor MapReduce states
  • Be notified about MapReduce failovers and events

Setup

Installation

The MapReduce check is included in the Datadog Agent package, so you don't need to install anything else on your servers.

Configuration

Host

To configure this check for an Agent running on a host:

  1. Edit the mapreduce.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to specify your server and port and to set the masters to monitor. See the sample mapreduce.d/conf.yaml for all available configuration options.

  2. Restart the Agent.
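
As a minimal sketch of step 1, a single-instance mapreduce.d/conf.yaml might look like the following. The resourcemanager_uri and cluster_name keys are the ones listed among this integration's instance parameters; the localhost address and port 8088 are assumptions for a ResourceManager running on the same host as the Agent, and <MAPREDUCE_CLUSTER_NAME> is a placeholder to replace with your own value.

```yaml
# Sketch only: adjust the address, port, and cluster name to your environment.
init_config:

instances:
    # Assumed address of the YARN ResourceManager that the check queries.
  - resourcemanager_uri: http://localhost:8088
    # Placeholder: a name that identifies this cluster in Datadog.
    cluster_name: <MAPREDUCE_CLUSTER_NAME>
```

The sample mapreduce.d/conf.yaml remains the reference for any option not shown here.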

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in datadog.yaml:

    logs_enabled: true
    
  2. Uncomment the logs configuration block in mapreduce.d/conf.yaml and edit it. Change the type, path, and service parameter values to match your environment. See the sample mapreduce.d/conf.yaml for all available configuration options.

    logs:
      - type: file
        path: <LOG_FILE_PATH>
        source: mapreduce
        service: <SERVICE_NAME>
        # To handle multi line that starts with yyyy-mm-dd use the following pattern
        # log_processing_rules:
        #   - type: multi_line
        #     pattern: \d{4}\-\d{2}\-\d{2} \d{2}:\d{2}:\d{2},\d{3}
        #     name: new_log_start_with_date
    
  3. Restart the Agent.

Containerized environment

For containerized environments, see the Autodiscovery Integration Templates documentation to learn how to apply the parameters below.

Parameter             Value
<INTEGRATION_NAME>    mapreduce
<INIT_CONFIG>         blank or {}
<INSTANCE_CONFIG>     {"resourcemanager_uri": "https://%%host%%:8088", "cluster_name": "<MAPREDUCE_CLUSTER_NAME>"}
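
The three parameters above can be supplied as Autodiscovery container labels. The following is a minimal sketch as a Docker Compose service, not a definitive setup: the service name hadoop and the image placeholder are hypothetical, and the label keys assume the same com.datadoghq.ad.* Docker label convention that this page uses for logs.

```yaml
services:
  hadoop:                  # hypothetical service name
    image: "<IMAGE>"       # placeholder: your MapReduce/YARN image
    labels:
      # Integration name
      com.datadoghq.ad.check_names: '["mapreduce"]'
      # Init config (empty)
      com.datadoghq.ad.init_configs: '[{}]'
      # Instance config; %%host%% is resolved by Autodiscovery at runtime
      com.datadoghq.ad.instances: '[{"resourcemanager_uri": "https://%%host%%:8088", "cluster_name": "<MAPREDUCE_CLUSTER_NAME>"}]'
```

Each label value is a JSON string, so it is wrapped in single quotes to keep the inner double quotes intact.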

Log collection

Log collection is disabled by default in the Datadog Agent. To enable it, see the Docker log collection documentation.

Then, set log integrations as Docker labels:

LABEL "com.datadoghq.ad.logs"='[{"source": "mapreduce", "service": "<SERVICE_NAME>"}]'

Validation

Run the Agent's status subcommand and look for mapreduce under the Checks section.

Data collected

Metrics

mapreduce.job.elapsed_time.max
(gauge)
Max elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.avg
(gauge)
Average elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.median
(gauge)
Median elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.95percentile
(gauge)
95th percentile elapsed time since the application started
Shown as millisecond
mapreduce.job.elapsed_time.count
(rate)
Number of times the elapsed time was sampled
mapreduce.job.maps_total
(rate)
Total number of maps
Shown as task
mapreduce.job.maps_completed
(rate)
Number of completed maps
Shown as task
mapreduce.job.reduces_total
(rate)
Number of reduces
Shown as task
mapreduce.job.reduces_completed
(rate)
Number of completed reduces
Shown as task
mapreduce.job.maps_pending
(rate)
Number of pending maps
Shown as task
mapreduce.job.maps_running
(rate)
Number of running maps
Shown as task
mapreduce.job.reduces_pending
(rate)
Number of pending reduces
Shown as task
mapreduce.job.reduces_running
(rate)
Number of running reduces
Shown as task
mapreduce.job.new_reduce_attempts
(rate)
Number of new reduce attempts
Shown as task
mapreduce.job.running_reduce_attempts
(rate)
Number of running reduce attempts
Shown as task
mapreduce.job.failed_reduce_attempts
(rate)
Number of failed reduce attempts
Shown as task
mapreduce.job.killed_reduce_attempts
(rate)
Number of killed reduce attempts
Shown as task
mapreduce.job.successful_reduce_attempts
(rate)
Number of successful reduce attempts
Shown as task
mapreduce.job.new_map_attempts
(rate)
Number of new map attempts
Shown as task
mapreduce.job.running_map_attempts
(rate)
Number of running map attempts
Shown as task
mapreduce.job.failed_map_attempts
(rate)
Number of failed map attempts
Shown as task
mapreduce.job.killed_map_attempts
(rate)
Number of killed map attempts
Shown as task
mapreduce.job.successful_map_attempts
(rate)
Number of successful map attempts
Shown as task
mapreduce.job.counter.reduce_counter_value
(rate)
Counter value of reduce tasks
Shown as task
mapreduce.job.counter.map_counter_value
(rate)
Counter value of map tasks
Shown as task
mapreduce.job.counter.total_counter_value
(rate)
Counter value of all tasks
Shown as task
mapreduce.job.map.task.elapsed_time.max
(gauge)
Max of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.avg
(gauge)
Average of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.median
(gauge)
Median of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.95percentile
(gauge)
95th percentile of all map tasks elapsed time
Shown as millisecond
mapreduce.job.map.task.elapsed_time.count
(rate)
Number of times the map task elapsed time was sampled
mapreduce.job.reduce.task.elapsed_time.max
(gauge)
Max of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.avg
(gauge)
Average of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.median
(gauge)
Median of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.95percentile
(gauge)
95th percentile of all reduce tasks elapsed time
Shown as millisecond
mapreduce.job.reduce.task.elapsed_time.count
(rate)
Number of times the reduce task elapsed time was sampled

Events

The MapReduce check does not include any events.

Service checks

mapreduce.resource_manager.can_connect
Returns CRITICAL if the Agent is unable to connect to the Resource Manager. Returns OK otherwise.
Statuses: ok, critical

mapreduce.application_master.can_connect
Returns CRITICAL if the Agent is unable to connect to the Application Master. Returns OK otherwise.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.

Further reading