Hbase Master

Supported OS: Linux, Mac OS, Windows

Integration v1.1.0

Overview

Get metrics from the Hbase_master service in real time to:

  • Visualize and monitor Hbase_master states
  • Be notified about Hbase_master failovers and events

Setup

The Hbase_master check is not included in the Datadog Agent package, so you need to install it.

Installation

For Agent v7.21+/v6.21+, follow the instructions below to install the Hbase_master check on your host. See Use Community Integrations to install with the Docker Agent or with earlier versions of the Agent.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-hbase_master==<INTEGRATION_VERSION>
    
  2. Configure your integration the same way as a core integration.
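As an illustration of filling in the <INTEGRATION_VERSION> placeholder, the sketch below builds the install command with a pinned version. It assumes the version shown at the top of this page (1.1.0) is the one you want; the variable names INTEGRATION_VERSION and INSTALL_CMD are hypothetical. The command is only printed, not executed.

```shell
# Assumption: 1.1.0 is the integration version listed at the top of this page.
INTEGRATION_VERSION="1.1.0"

# Build the command from the installation step, with the placeholder filled in.
INSTALL_CMD="datadog-agent integration install -t datadog-hbase_master==${INTEGRATION_VERSION}"

# Print the command for review; run it yourself once the version is confirmed.
echo "$INSTALL_CMD"
```

Printing first makes it easy to check the pinned version before anything changes on the host.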

Configuration

  1. Edit the hbase_master.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your Hbase_master metrics. See the sample hbase_master.d/conf.yaml file for all available configuration options.

    NOTE: If you use Agent v6, make sure to edit the hbase_master.d/metrics.yaml file and wrap boolean keys in quotes.

      - include:
          domain: Hadoop
          bean:
            - Hadoop:service=HBase,name=Master,sub=Server
          attribute:
            # Is Active Master
            tag.isActiveMaster:
               metric_type: gauge
               alias: hbase.master.server.tag.is_active_master
               values: {"true": 1, "false": 0, default: 0}
    
  2. Restart the Agent.
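One way to check the quoting requirement from the note above is to search metrics.yaml for `values:` maps that still use bare true or false keys. This is a rough sketch, not part of the integration: the function name find_unquoted_bool_keys is hypothetical, the pattern only handles single-line maps like the one in the example, and the file path assumes the default Linux configuration directory.

```shell
# Prints (with line numbers) every line whose `values:` map contains an
# unquoted true/false key. Exit status is grep's: 0 if found, 1 if none.
find_unquoted_bool_keys() {
  grep -nE 'values:.*[{,][[:space:]]*(true|false)[[:space:]]*:' "$1"
}

# Assumption: default Linux location of the Agent configuration directory.
METRICS_FILE="/etc/datadog-agent/conf.d/hbase_master.d/metrics.yaml"

# Only scan when the file is present on this host.
if [ -f "$METRICS_FILE" ]; then
  find_unquoted_bool_keys "$METRICS_FILE" || echo "no unquoted boolean keys found"
fi
```

A line such as `values: {true: 1, false: 0, default: 0}` is reported, while the quoted form shown in the example is not.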

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in datadog.yaml:

    logs_enabled: true
    
  2. Add this configuration block to your hbase_master.d/conf.yaml file to start collecting your Hbase_master logs:

    logs:
      - type: file
        path: /path/to/my/directory/file.log
        source: hbase
    

    Change the value of the path parameter to match your environment. See the sample hbase_master.d/conf.yaml file for all available configuration options.

  3. Restart the Agent.
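Before restarting, it can help to confirm that the file named in the path parameter exists and is readable. The sketch below is a minimal check under stated assumptions: LOG_PATH reuses the placeholder path from the configuration block, and log_is_readable is a hypothetical helper that tests readability for the current user only.

```shell
# Placeholder from the log configuration block; replace with your real path.
LOG_PATH="/path/to/my/directory/file.log"

# Returns 0 when the path is a regular file the current user can read.
log_is_readable() {
  [ -f "$1" ] && [ -r "$1" ]
}

if log_is_readable "$LOG_PATH"; then
  echo "readable: $LOG_PATH"
else
  echo "missing or unreadable: $LOG_PATH"
fi
```

On Linux hosts the Agent typically runs as the dd-agent user, so running the same test as that user (for example with `sudo -u dd-agent test -r "$LOG_PATH"`) is closer to what the Agent will see.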

Validation

Run the Agent's status subcommand and look for hbase_master under the Checks section.
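The validation step can be scripted roughly as follows. This is a sketch, assuming the datadog-agent binary is on the PATH and that the check name appears verbatim in the status output; check_listed is a hypothetical helper, and on some hosts the status subcommand needs to be run with sudo.

```shell
# Returns 0 if the check name ($2) appears anywhere in the status text ($1).
check_listed() {
  printf '%s\n' "$1" | grep -qF -- "$2"
}

# Only query the Agent when the binary is available on this host.
if command -v datadog-agent >/dev/null 2>&1; then
  if check_listed "$(datadog-agent status 2>/dev/null)" "hbase_master"; then
    echo "hbase_master check found"
  else
    echo "hbase_master check not found"
  fi
fi
```

Passing a different check name as the second argument reuses the same helper for other integrations.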

Data Collected

Metrics

hbase.master.assignmentmanager.rit_oldest_age
(gauge)
The age of the longest region in transition, in milliseconds
Shown as millisecond
hbase.master.assignmentmanager.rit_count_over_threshold
(gauge)
The number of regions that have been in transition longer than a threshold time
hbase.master.assignmentmanager.rit_count
(gauge)
The number of regions in transition
hbase.master.assignmentmanager.assign.min
(gauge)
hbase.master.assignmentmanager.assign.max
(gauge)
hbase.master.assignmentmanager.assign.mean
(gauge)
hbase.master.assignmentmanager.assign.median
(gauge)
hbase.master.assignmentmanager.assign.percentile.99
(gauge)
hbase.master.ipc.queue_size
(gauge)
Number of bytes in the call queues.
Shown as byte
hbase.master.ipc.num_calls_in_general_queue
(gauge)
Number of calls in the general call queue.
hbase.master.ipc.num_calls_in_replication_queue
(gauge)
Number of calls in the replication call queue.
hbase.master.ipc.num_calls_in_priority_queue
(gauge)
Number of calls in the priority call queue.
hbase.master.ipc.num_open_connections
(gauge)
Number of open connections.
hbase.master.ipc.num_active_handler
(gauge)
Number of active RPC handlers.
hbase.master.ipc.total_call_time.max
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.mean
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.median
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.percentile.99
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.master.server.tag.is_active_master
(gauge)
Is Active Master
hbase.master.server.num_region_servers
(gauge)
Number of RegionServers
hbase.master.server.num_dead_region_servers
(gauge)
Number of dead RegionServers

Events

The Hbase_master check does not include any events.

Service Checks

The Hbase_master check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.

HBase RegionServer Integration

Overview

Get metrics from the HBase RegionServer service in real time to:

  • Visualize and monitor HBase RegionServer states
  • Be notified about HBase RegionServer failovers and events

Setup

The HBase RegionServer check is not included in the Datadog Agent package, so you need to install it.

Installation

For Agent v7.21+/v6.21+, follow the instructions below to install the HBase RegionServer check on your host. See Use Community Integrations to install with the Docker Agent or with earlier versions of the Agent.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-hbase_regionserver==<INTEGRATION_VERSION>
    
  2. Configure your integration the same way as a core integration.

Configuration

  1. Edit the hbase_regionserver.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your HBase RegionServer metrics. See the sample hbase_regionserver.d/conf.yaml file for all available configuration options.

  2. Restart the Agent.

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in datadog.yaml:

    logs_enabled: true
    
  2. Add this configuration block to your hbase_regionserver.d/conf.yaml file to start collecting your Hbase_regionserver logs:

    logs:
      - type: file
        path: /path/to/my/directory/file.log
        source: hbase
    

    Change the value of the path parameter to match your environment. See the sample hbase_regionserver.d/conf.yaml file for all available configuration options.

  3. Restart the Agent.

Validation

Run the Agent's status subcommand and look for hbase_regionserver under the Checks section.

Data Collected

Metrics

hbase.regionserver.ipc.queue_size
(gauge)
Number of bytes in the call queues.
Shown as byte
hbase.regionserver.ipc.num_open_connections
(gauge)
Number of open connections.
hbase.regionserver.ipc.num_active_handler
(gauge)
Number of active RPC handlers.
hbase.regionserver.ipc.total_call_time.max
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.mean
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.median
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.percentile.99
(gauge)
total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.regions.num_regions
(gauge)
Number of regions in the metrics system
hbase.regionserver.replication.sink.applied_ops
(gauge)
Number of WAL entries applied on replication sink.
hbase.regionserver.replication.sink.age_of_last_applied_op
(gauge)
Replication time lag of last applied WAL entry between source and sink.
Shown as millisecond
hbase.regionserver.replication.sink.applied_batches
(gauge)
Number of WAL applying operations processed on replication sink.
hbase.regionserver.server.region_count
(gauge)
Number of regions
hbase.regionserver.server.store_count
(gauge)
Number of Stores
hbase.regionserver.server.hlog_file_count
(gauge)
Number of WAL Files
hbase.regionserver.server.hlog_file_size
(gauge)
Size of all WAL Files
Shown as byte
hbase.regionserver.server.store_file_count
(gauge)
Number of Store Files
hbase.regionserver.server.mem_store_size
(gauge)
Size of the memstore
Shown as byte
hbase.regionserver.server.store_file_size
(gauge)
Size of storefiles being served.
Shown as byte
hbase.regionserver.server.total_request_count
(gauge)
Total number of requests this RegionServer has answered.
hbase.regionserver.server.read_request_count
(gauge)
Number of read requests this region server has answered.
hbase.regionserver.server.write_request_count
(gauge)
Number of mutation requests this region server has answered.
hbase.regionserver.server.check_mutate_failed_count
(gauge)
Number of Check and Mutate calls that failed the checks.
hbase.regionserver.server.check_mutate_passed_count
(gauge)
Number of Check and Mutate calls that passed the checks.
hbase.regionserver.server.store_file_index_size
(gauge)
Size of indexes in storefiles on disk.
Shown as byte
hbase.regionserver.server.static_index_size
(gauge)
Uncompressed size of the static indexes.
Shown as byte
hbase.regionserver.server.static_bloom_size
(gauge)
Uncompressed size of the static bloom filters.
Shown as byte
hbase.regionserver.server.mutations_without_wal_count
(count)
Number of mutations that have been sent by clients with the write ahead logging turned off.
hbase.regionserver.server.mutations_without_wal_size
(gauge)
Size of data that has been sent by clients with the write ahead logging turned off.
Shown as byte
hbase.regionserver.server.percent_files_local
(gauge)
The percent of HFiles that are stored on the local hdfs data node.
Shown as percent
hbase.regionserver.server.percent_files_local_secondary_regions
(gauge)
The percent of HFiles used by secondary regions that are stored on the local hdfs data node.
Shown as percent
hbase.regionserver.server.split_queue_length
(gauge)
Length of the queue for splits.
hbase.regionserver.server.compaction_queue_length
(gauge)
Length of the queue for compactions.
hbase.regionserver.server.flush_queue_length
(gauge)
Length of the queue for region flushes
hbase.regionserver.server.block_cache_free_size
(gauge)
Size of the block cache that is not occupied.
Shown as byte
hbase.regionserver.server.block_cache_count
(gauge)
Number of blocks in the block cache.
hbase.regionserver.server.block_cache_size
(gauge)
Size of the block cache.
Shown as byte
hbase.regionserver.server.block_cache_hit_count
(gauge)
Count of the hit on the block cache.
hbase.regionserver.server.block_cache_hit_count_primary
(gauge)
Count of hit on primary replica in the block cache.
hbase.regionserver.server.block_cache_miss_count
(gauge)
Number of requests for a block that missed the block cache.
hbase.regionserver.server.block_cache_miss_count_primary
(gauge)
Number of requests for a block of primary replica that missed the block cache.
hbase.regionserver.server.block_cache_eviction_count
(gauge)
Count of the number of blocks evicted from the block cache.
hbase.regionserver.server.block_cache_eviction_count_primary
(gauge)
Count of the number of blocks evicted from primary replica in the block cache.
hbase.regionserver.server.block_cache_hit_percent
(gauge)
Percent of block cache requests that are hits
Shown as percent
hbase.regionserver.server.block_cache_express_hit_percent
(gauge)
The percent of the time that requests with the cache turned on hit the cache.
Shown as percent
hbase.regionserver.server.block_cache_failed_insertion_count
(gauge)
Number of times that a block cache insertion failed. Usually due to size restrictions.
Shown as millisecond
hbase.regionserver.server.updates_blocked_time
(gauge)
Number of milliseconds that updates have been blocked so that the memstore can be flushed.
Shown as millisecond
hbase.regionserver.server.flushed_cells_count
(gauge)
The number of cells flushed to disk
hbase.regionserver.server.compacted_cells_count
(gauge)
The number of cells processed during minor compactions
hbase.regionserver.server.major_compacted_cells_count
(gauge)
The number of cells processed during major compactions
hbase.regionserver.server.flushed_cells_size
(gauge)
The total amount of data flushed to disk, in bytes
Shown as byte
hbase.regionserver.server.compacted_cells_size
(gauge)
The total amount of data processed during minor compactions, in bytes
Shown as byte
hbase.regionserver.server.major_compacted_cells_size
(gauge)
The total amount of data processed during major compactions, in bytes
Shown as byte
hbase.regionserver.server.blocked_request_count
(gauge)
The number of requests blocked because the memstore size is larger than blockingMemStoreSize
hbase.regionserver.server.hedged_read
(gauge)
hbase.regionserver.server.hedged_read_wins
(gauge)
hbase.regionserver.server.pause_time_with_gc_num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.min
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.max
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.median
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.server.mutate.num_ops
(gauge)
hbase.regionserver.server.mutate.min
(gauge)
hbase.regionserver.server.mutate.max
(gauge)
hbase.regionserver.server.mutate.mean
(gauge)
hbase.regionserver.server.mutate.median
(gauge)
hbase.regionserver.server.mutate.percentile.99
(gauge)
hbase.regionserver.server.slow_append_count
(gauge)
The number of Appends that took over 1000ms to complete
hbase.regionserver.server.pause_warn_threshold_exceeded
(gauge)
hbase.regionserver.server.slow_delete_count
(gauge)
The number of Deletes that took over 1000ms to complete
hbase.regionserver.server.increment.num_ops
(gauge)
hbase.regionserver.server.increment.min
(gauge)
hbase.regionserver.server.increment.max
(gauge)
hbase.regionserver.server.increment.mean
(gauge)
hbase.regionserver.server.increment.median
(gauge)
hbase.regionserver.server.increment.percentile.99
(gauge)
hbase.regionserver.server.replay.num_ops
(gauge)
hbase.regionserver.server.replay.min
(gauge)
hbase.regionserver.server.replay.max
(gauge)
hbase.regionserver.server.replay.mean
(gauge)
hbase.regionserver.server.replay.median
(gauge)
hbase.regionserver.server.replay.percentile.99
(gauge)
hbase.regionserver.server.flush_time.num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.min
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.max
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.median
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_info_threshold_exceeded
(gauge)
hbase.regionserver.server.delete.num_ops
(gauge)
hbase.regionserver.server.delete.min
(gauge)
hbase.regionserver.server.delete.max
(gauge)
hbase.regionserver.server.delete.mean
(gauge)
hbase.regionserver.server.delete.median
(gauge)
hbase.regionserver.server.delete.percentile.99
(gauge)
hbase.regionserver.server.split_request_count
(gauge)
Number of splits requested
hbase.regionserver.server.split_success_count
(gauge)
Number of successfully executed splits
hbase.regionserver.server.slow_get_count
(gauge)
The number of Gets that took over 1000ms to complete
hbase.regionserver.server.get.num_ops
(gauge)
hbase.regionserver.server.get.min
(gauge)
hbase.regionserver.server.get.max
(gauge)
hbase.regionserver.server.get.mean
(gauge)
hbase.regionserver.server.get.median
(gauge)
hbase.regionserver.server.get.percentile.99
(gauge)
hbase.regionserver.server.scan_next.num_ops
(gauge)
hbase.regionserver.server.scan_next.min
(gauge)
hbase.regionserver.server.scan_next.max
(gauge)
hbase.regionserver.server.scan_next.mean
(gauge)
hbase.regionserver.server.scan_next.median
(gauge)
hbase.regionserver.server.scan_next.percentile.99
(gauge)
hbase.regionserver.server.pause_time_without_gc.num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.min
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.max
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.median
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.server.slow_put_count
(gauge)
The number of Multis that took over 1000ms to complete
hbase.regionserver.server.slow_increment_count
(gauge)
The number of Increments that took over 1000ms to complete
hbase.regionserver.server.split_time.num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.min
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.max
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.median
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.wal.append_size.num_ops
(gauge)
size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.min
(gauge)
size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.max
(gauge)
size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.mean
(gauge)
size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.median
(gauge)
size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.percentile.99
(gauge)
size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.sync_time.num_ops
(gauge)
the time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.min
(gauge)
the time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.max
(gauge)
the time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.mean
(gauge)
the time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.median
(gauge)
the time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.percentile.99
(gauge)
the time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.slow_append_count
(gauge)
Number of appends that were slow.
hbase.regionserver.wal.roll_request
(gauge)
How many times a log roll has been requested total
Shown as millisecond
hbase.regionserver.wal.append_count
(gauge)
Number of appends to the write ahead log.
hbase.regionserver.wal.low_replica_roll_request
(gauge)
How many times a log roll was requested due to too few DataNodes in the write pipeline.
Shown as millisecond
hbase.regionserver.wal.append_time.num_ops
(gauge)
time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.min
(gauge)
time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.max
(gauge)
time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.mean
(gauge)
time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.median
(gauge)
time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.percentile.99
(gauge)
time an append to the log took.
Shown as millisecond
hbase.jvm_metrics.mem_non_heap_used_in_mb
(gauge)
Non-heap memory used in MB
hbase.jvm_metrics.mem_non_heap_committed_in_mb
(gauge)
Non-heap memory committed in MB
hbase.jvm_metrics.mem_non_heap_max_in_mb
(gauge)
Non-heap memory max in MB
hbase.jvm_metrics.mem_heap_used_in_mb
(gauge)
Heap memory used in MB
hbase.jvm_metrics.mem_heap_committed_in_mb
(gauge)
Heap memory committed in MB
hbase.jvm_metrics.mem_heap_max_in_mb
(gauge)
Heap memory max in MB
hbase.jvm_metrics.mem_max_in_mb
(gauge)
Max memory size in MB
hbase.jvm_metrics.gc_count_par_new
(gauge)
GC Count for ParNew
hbase.jvm_metrics.gc_time_millis_par_new
(gauge)
GC Time for ParNew
Shown as millisecond
hbase.jvm_metrics.gc_count_concurrent_mark_sweep
(gauge)
GC Count for ConcurrentMarkSweep
hbase.jvm_metrics.gc_time_millis_concurrent_mark_sweep
(gauge)
GC Time for ConcurrentMarkSweep
Shown as millisecond
hbase.jvm_metrics.gc_count
(gauge)
Total GC count
hbase.jvm_metrics.gc_time_millis
(gauge)
Total GC time in milliseconds
Shown as millisecond
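Some of the gauges above can be cross-checked against each other. As a worked example, the sketch below derives a block cache hit percentage from a hit count and a miss count. It assumes the percentage is hits divided by hits plus misses, which this page does not state explicitly; the sample values and the hit_percent helper are hypothetical.

```shell
# Hypothetical sample values for hbase.regionserver.server.block_cache_hit_count
# and hbase.regionserver.server.block_cache_miss_count.
HIT_COUNT=900
MISS_COUNT=100

# Prints 100 * hits / (hits + misses) with two decimals,
# or 0.00 when there were no requests at all.
hit_percent() {
  awk -v h="$1" -v m="$2" 'BEGIN {
    t = h + m
    if (t == 0) printf "0.00\n"
    else printf "%.2f\n", 100 * h / t
  }'
}

HIT_PERCENT=$(hit_percent "$HIT_COUNT" "$MISS_COUNT")
echo "$HIT_PERCENT"
```

With these sample values, 900 hits out of 1000 requests gives 90.00, which would be compared against hbase.regionserver.server.block_cache_hit_percent.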

Events

The HBase RegionServer check does not include any events.

Service Checks

The HBase RegionServer check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.