Hbase Master

Agent Check

Supported OS: Linux Mac OS Windows

Overview

Get metrics from the Hbase_master service in real time to:

  • Visualize and monitor Hbase_master states
  • Be notified about Hbase_master failovers and events

Setup

Installation

To install checks with an Agent version earlier than 6.8 or with the Docker Agent, see the guide on installing community-developed integrations. If you are using Agent v6.8 or later, follow the steps below to install the Hbase_master check on your host:

  1. Install the developer toolkit.
  2. Clone the integrations-extras repository:

    git clone https://github.com/DataDog/integrations-extras.git
  3. Update your ddev configuration with the integrations-extras/ path:

    ddev config set extras ./integrations-extras
  4. To build the hbase_master package, run:

    ddev -e release build hbase_master
  5. Download and launch the Datadog Agent.

  6. Run the following command to install the integration wheel with the Agent:

    datadog-agent integration install -w <PATH_OF_HBASE_MASTER_ARTIFACT_>/<HBASE_MASTER_ARTIFACT_NAME>.whl
  7. Configure your integration like any other packaged integration.
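
As a rough illustration of step 6, the install command can be assembled from the artifact directory and the wheel file name. This is a minimal Python sketch, not part of the integration: the helper name `install_command` and the example path and wheel name are hypothetical placeholders, and only the `datadog-agent integration install -w` invocation comes from the steps above.

```python
from pathlib import PurePosixPath
from typing import List


def install_command(artifact_dir: str, wheel_name: str) -> List[str]:
    """Build the argv for installing a locally built integration wheel.

    Hypothetical helper: it only assembles the command from step 6 and
    does not execute anything.
    """
    if not wheel_name.endswith(".whl"):
        raise ValueError("expected a .whl file name")
    wheel_path = PurePosixPath(artifact_dir) / wheel_name
    return ["datadog-agent", "integration", "install", "-w", str(wheel_path)]


# Placeholder values for illustration only (not real artifact names).
cmd = install_command("/tmp/dist", "example_hbase_master.whl")
print(" ".join(cmd))
```

The returned list could then be passed to `subprocess.run` on the host where the Agent is installed.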

Configuration

  1. Edit the hbase_master.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your Hbase_master metrics. See the sample hbase_master.d/conf.yaml for all available configuration options.

  2. Restart the Agent.

Validation

Run the Agent's status subcommand and look for hbase_master under the Checks section.
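
The validation step can be scripted. The sketch below is one possible approach, assuming the `datadog-agent` binary is on the PATH and that the status output lists each check name at the start of an indented line; both are assumptions, and the function names are hypothetical.

```python
import re
import subprocess


def check_is_listed(status_text: str, check_name: str) -> bool:
    """Return True if a line of the status output starts with check_name."""
    pattern = re.compile(r"^\s*" + re.escape(check_name) + r"\b", re.MULTILINE)
    return bool(pattern.search(status_text))


def agent_status_text() -> str:
    """Run the Agent's status subcommand and return its standard output."""
    result = subprocess.run(
        ["datadog-agent", "status"],
        capture_output=True,
        text=True,
        check=False,  # inspect the output even if the Agent exits non-zero
    )
    return result.stdout


# Usage on a host where the Agent is installed:
#   print(check_is_listed(agent_status_text(), "hbase_master"))
```

Keeping the text matching separate from the subprocess call means the matching logic can be exercised without a running Agent.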

Data Collected

Metrics

hbase.master.assignmentmanager.rit_oldest_age
(gauge)
The age of the longest region in transition, in milliseconds
Shown as millisecond
hbase.master.assignmentmanager.rit_count_over_threshold
(gauge)
The number of regions that have been in transition longer than a threshold time
hbase.master.assignmentmanager.rit_count
(gauge)
The number of regions in transition
hbase.master.assignmentmanager.assign.min
(gauge)
hbase.master.assignmentmanager.assign.max
(gauge)
hbase.master.assignmentmanager.assign.mean
(gauge)
hbase.master.assignmentmanager.assign.median
(gauge)
hbase.master.assignmentmanager.assign.percentile.99
(gauge)
hbase.master.ipc.queue_size
(gauge)
Number of bytes in the call queues.
Shown as byte
hbase.master.ipc.num_calls_in_general_queue
(gauge)
Number of calls in the general call queue.
hbase.master.ipc.num_calls_in_replication_queue
(gauge)
Number of calls in the replication call queue.
hbase.master.ipc.num_calls_in_priority_queue
(gauge)
Number of calls in the priority call queue.
hbase.master.ipc.num_open_connections
(gauge)
Number of open connections.
hbase.master.ipc.num_active_handler
(gauge)
Number of active RPC handlers.
hbase.master.ipc.total_call_time.max
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.mean
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.median
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.percentile.99
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.server.tag.is_active_master
(gauge)
Is Active Master
hbase.master.server.num_region_servers
(gauge)
Number of RegionServers
hbase.master.server.num_dead_region_servers
(gauge)
Number of dead RegionServers
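
To show how a few of these gauges might be combined, here is a small Python sketch that evaluates a snapshot of master metrics. It is an illustration under stated assumptions: the function names are hypothetical, the input is a plain dictionary of already-collected values rather than a Datadog API call, and `is_active_master` is assumed to report 1 on the active master and 0 otherwise.

```python
from typing import Dict, List

Metrics = Dict[str, float]


def master_warnings(metrics: Metrics) -> List[str]:
    """Return human-readable warnings for one master's metric snapshot."""
    warnings: List[str] = []
    if metrics.get("hbase.master.server.num_dead_region_servers", 0) > 0:
        warnings.append("dead RegionServers reported")
    if metrics.get("hbase.master.assignmentmanager.rit_count_over_threshold", 0) > 0:
        warnings.append("regions in transition longer than the threshold")
    return warnings


def active_master_count(per_host: Dict[str, Metrics]) -> int:
    """Count hosts whose is_active_master gauge equals 1 (assumed encoding)."""
    key = "hbase.master.server.tag.is_active_master"
    return sum(1 for m in per_host.values() if m.get(key, 0) == 1)
```

A count other than 1 from `active_master_count` across all monitored masters would suggest a failover in progress or a missing active master.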

Events

The Hbase_master check does not include any events.

Service Checks

The Hbase_master check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.

HBase RegionServer Integration

Overview

Get metrics from the HBase RegionServer service in real time to:

  • Visualize and monitor HBase RegionServer states
  • Be notified about HBase RegionServer failovers and events

Setup

The HBase RegionServer check is not included in the Datadog Agent package.

Installation

To install checks with an Agent version earlier than 6.8 or with the Docker Agent, see the guide on installing community-developed integrations. If you are using Agent v6.8 or later, follow the steps below to install the HBase RegionServer check on your host:

  1. Install the developer toolkit.
  2. Clone the integrations-extras repository:

    git clone https://github.com/DataDog/integrations-extras.git
  3. Update your ddev configuration with the integrations-extras/ path:

    ddev config set extras ./integrations-extras
  4. To build the hbase_regionserver package, run:

    ddev -e release build hbase_regionserver
  5. Download and launch the Datadog Agent.

  6. Run the following command to install the integration wheel with the Agent:

    datadog-agent integration install -w <PATH_OF_HBASE_REGIONSERVER_ARTIFACT_>/<HBASE_REGIONSERVER_ARTIFACT_NAME>.whl
  7. Configure your integration like any other packaged integration.

Configuration

  1. Edit the hbase_regionserver.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory to start collecting your HBase RegionServer metrics. See the sample hbase_regionserver.d/conf.yaml for all available configuration options.

  2. Restart the Agent.

Validation

Run the Agent's status subcommand and look for hbase_regionserver under the Checks section.

Data Collected

Metrics

hbase.regionserver.ipc.queue_size
(gauge)
Number of bytes in the call queues.
Shown as byte
hbase.regionserver.ipc.num_open_connections
(gauge)
Number of open connections.
hbase.regionserver.ipc.num_active_handler
(gauge)
Number of active RPC handlers.
hbase.regionserver.ipc.total_call_time.max
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.mean
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.median
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.percentile.99
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.regions.num_regions
(gauge)
Number of regions in the metrics system
hbase.regionserver.replication.sink.applied_ops
(gauge)
Number of WAL entries applied on replication sink.
hbase.regionserver.replication.sink.age_of_last_applied_op
(gauge)
Replication time lag of last applied WAL entry between source and sink.
Shown as millisecond
hbase.regionserver.replication.sink.applied_batches
(gauge)
Number of WAL applying operations processed on replication sink.
hbase.regionserver.server.region_count
(gauge)
Number of regions
hbase.regionserver.server.store_count
(gauge)
Number of Stores
hbase.regionserver.server.hlog_file_count
(gauge)
Number of WAL Files
hbase.regionserver.server.hlog_file_size
(gauge)
Size of all WAL Files
Shown as byte
hbase.regionserver.server.store_file_count
(gauge)
Number of Store Files
hbase.regionserver.server.mem_store_size
(gauge)
Size of the memstore
Shown as byte
hbase.regionserver.server.store_file_size
(gauge)
Size of storefiles being served.
Shown as byte
hbase.regionserver.server.total_request_count
(gauge)
Total number of requests this RegionServer has answered.
hbase.regionserver.server.read_request_count
(gauge)
Number of read requests this region server has answered.
hbase.regionserver.server.write_request_count
(gauge)
Number of mutation requests this region server has answered.
hbase.regionserver.server.check_mutate_failed_count
(gauge)
Number of Check and Mutate calls that failed the checks.
hbase.regionserver.server.check_mutate_passed_count
(gauge)
Number of Check and Mutate calls that passed the checks.
hbase.regionserver.server.store_file_index_size
(gauge)
Size of indexes in storefiles on disk.
Shown as byte
hbase.regionserver.server.static_index_size
(gauge)
Uncompressed size of the static indexes.
Shown as byte
hbase.regionserver.server.static_bloom_size
(gauge)
Uncompressed size of the static bloom filters.
Shown as byte
hbase.regionserver.server.mutations_without_wal_count
(count)
Number of mutations that have been sent by clients with the write ahead logging turned off.
hbase.regionserver.server.mutations_without_wal_size
(gauge)
Size of data that has been sent by clients with the write ahead logging turned off.
Shown as byte
hbase.regionserver.server.percent_files_local
(gauge)
The percent of HFiles that are stored on the local hdfs data node.
Shown as percent
hbase.regionserver.server.percent_files_local_secondary_regions
(gauge)
The percent of HFiles used by secondary regions that are stored on the local hdfs data node.
Shown as percent
hbase.regionserver.server.split_queue_length
(gauge)
Length of the queue for splits.
hbase.regionserver.server.compaction_queue_length
(gauge)
Length of the queue for compactions.
hbase.regionserver.server.flush_queue_length
(gauge)
Length of the queue for region flushes
hbase.regionserver.server.block_cache_free_size
(gauge)
Size of the block cache that is not occupied.
Shown as byte
hbase.regionserver.server.block_cache_count
(gauge)
Number of blocks in the block cache.
hbase.regionserver.server.block_cache_size
(gauge)
Size of the block cache.
Shown as byte
hbase.regionserver.server.block_cache_hit_count
(gauge)
Count of hits on the block cache.
hbase.regionserver.server.block_cache_hit_count_primary
(gauge)
Count of hits on the primary replica in the block cache.
hbase.regionserver.server.block_cache_miss_count
(gauge)
Number of requests for a block that missed the block cache.
hbase.regionserver.server.block_cache_miss_count_primary
(gauge)
Number of requests for a block of primary replica that missed the block cache.
hbase.regionserver.server.block_cache_eviction_count
(gauge)
Count of the number of blocks evicted from the block cache.
hbase.regionserver.server.block_cache_eviction_count_primary
(gauge)
Count of the number of blocks evicted from primary replica in the block cache.
hbase.regionserver.server.block_cache_hit_percent
(gauge)
Percent of block cache requests that are hits
Shown as percent
hbase.regionserver.server.block_cache_express_hit_percent
(gauge)
The percent of the time that requests with the cache turned on hit the cache.
Shown as percent
hbase.regionserver.server.block_cache_failed_insertion_count
(gauge)
Number of times that a block cache insertion failed. Usually due to size restrictions.
Shown as millisecond
hbase.regionserver.server.updates_blocked_time
(gauge)
Number of milliseconds that updates have been blocked so that the memstore can be flushed.
Shown as millisecond
hbase.regionserver.server.flushed_cells_count
(gauge)
The number of cells flushed to disk
hbase.regionserver.server.compacted_cells_count
(gauge)
The number of cells processed during minor compactions
hbase.regionserver.server.major_compacted_cells_count
(gauge)
The number of cells processed during major compactions
hbase.regionserver.server.flushed_cells_size
(gauge)
The total amount of data flushed to disk, in bytes
Shown as byte
hbase.regionserver.server.compacted_cells_size
(gauge)
The total amount of data processed during minor compactions, in bytes
Shown as byte
hbase.regionserver.server.major_compacted_cells_size
(gauge)
The total amount of data processed during major compactions, in bytes
Shown as byte
hbase.regionserver.server.blocked_request_count
(gauge)
The number of requests blocked because the memstore size is larger than blockingMemStoreSize
hbase.regionserver.server.hedged_read
(gauge)
hbase.regionserver.server.hedged_read_wins
(gauge)
hbase.regionserver.server.pause_time_with_gc_num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.min
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.max
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.median
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.server.mutate.num_ops
(gauge)
hbase.regionserver.server.mutate.min
(gauge)
hbase.regionserver.server.mutate.max
(gauge)
hbase.regionserver.server.mutate.mean
(gauge)
hbase.regionserver.server.mutate.median
(gauge)
hbase.regionserver.server.mutate.percentile.99
(gauge)
hbase.regionserver.server.slow_append_count
(gauge)
The number of Appends that took over 1000ms to complete
hbase.regionserver.server.pause_warn_threshold_exceeded
(gauge)
hbase.regionserver.server.slow_delete_count
(gauge)
The number of Deletes that took over 1000ms to complete
hbase.regionserver.server.increment.num_ops
(gauge)
hbase.regionserver.server.increment.min
(gauge)
hbase.regionserver.server.increment.max
(gauge)
hbase.regionserver.server.increment.mean
(gauge)
hbase.regionserver.server.increment.median
(gauge)
hbase.regionserver.server.increment.percentile.99
(gauge)
hbase.regionserver.server.replay.num_ops
(gauge)
hbase.regionserver.server.replay.min
(gauge)
hbase.regionserver.server.replay.max
(gauge)
hbase.regionserver.server.replay.mean
(gauge)
hbase.regionserver.server.replay.median
(gauge)
hbase.regionserver.server.replay.percentile.99
(gauge)
hbase.regionserver.server.flush_time.num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.min
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.max
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.median
(gauge)
Shown as millisecond
hbase.regionserver.server.flush_time.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_info_threshold_exceeded
(gauge)
hbase.regionserver.server.delete.num_ops
(gauge)
hbase.regionserver.server.delete.min
(gauge)
hbase.regionserver.server.delete.max
(gauge)
hbase.regionserver.server.delete.mean
(gauge)
hbase.regionserver.server.delete.median
(gauge)
hbase.regionserver.server.delete.percentile.99
(gauge)
hbase.regionserver.server.split_request_count
(gauge)
Number of splits requested
hbase.regionserver.server.split_success_count
(gauge)
Number of successfully executed splits
hbase.regionserver.server.slow_get_count
(gauge)
The number of Gets that took over 1000ms to complete
hbase.regionserver.server.get.num_ops
(gauge)
hbase.regionserver.server.get.min
(gauge)
hbase.regionserver.server.get.max
(gauge)
hbase.regionserver.server.get.mean
(gauge)
hbase.regionserver.server.get.median
(gauge)
hbase.regionserver.server.get.percentile.99
(gauge)
hbase.regionserver.server.scan_next.num_ops
(gauge)
hbase.regionserver.server.scan_next.min
(gauge)
hbase.regionserver.server.scan_next.max
(gauge)
hbase.regionserver.server.scan_next.mean
(gauge)
hbase.regionserver.server.scan_next.median
(gauge)
hbase.regionserver.server.scan_next.percentile.99
(gauge)
hbase.regionserver.server.pause_time_without_gc.num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.min
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.max
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.median
(gauge)
Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.server.slow_put_count
(gauge)
The number of Multis that took over 1000ms to complete
hbase.regionserver.server.slow_increment_count
(gauge)
The number of Increments that took over 1000ms to complete
hbase.regionserver.server.split_time.num_ops
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.min
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.max
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.mean
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.median
(gauge)
Shown as millisecond
hbase.regionserver.server.split_time.percentile.99
(gauge)
Shown as millisecond
hbase.regionserver.wal.append_size.num_ops
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.min
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.max
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.mean
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.median
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.percentile.99
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.sync_time.num_ops
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.min
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.max
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.mean
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.median
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.percentile.99
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.slow_append_count
(gauge)
Number of appends that were slow.
hbase.regionserver.wal.roll_request
(gauge)
How many times a log roll has been requested total
Shown as millisecond
hbase.regionserver.wal.append_count
(gauge)
Number of appends to the write ahead log.
hbase.regionserver.wal.low_replica_roll_request
(gauge)
How many times a log roll was requested due to too few DataNodes in the write pipeline.
Shown as millisecond
hbase.regionserver.wal.append_time.num_ops
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.min
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.max
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.mean
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.median
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.percentile.99
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.jvm_metrics.mem_non_heap_used_in_mb
(gauge)
Non-heap memory used in MB
hbase.jvm_metrics.mem_non_heap_committed_in_mb
(gauge)
Non-heap memory committed in MB
hbase.jvm_metrics.mem_non_heap_max_in_mb
(gauge)
Non-heap memory max in MB
hbase.jvm_metrics.mem_heap_used_in_mb
(gauge)
Heap memory used in MB
hbase.jvm_metrics.mem_heap_committed_in_mb
(gauge)
Heap memory committed in MB
hbase.jvm_metrics.mem_heap_max_in_mb
(gauge)
Heap memory max in MB
hbase.jvm_metrics.mem_max_in_mb
(gauge)
Max memory size in MB
hbase.jvm_metrics.gc_count_par_new
(gauge)
GC Count for ParNew
hbase.jvm_metrics.gc_time_millis_par_new
(gauge)
GC Time for ParNew
Shown as millisecond
hbase.jvm_metrics.gc_count_concurrent_mark_sweep
(gauge)
GC Count for ConcurrentMarkSweep
hbase.jvm_metrics.gc_time_millis_concurrent_mark_sweep
(gauge)
GC Time for ConcurrentMarkSweep
Shown as millisecond
hbase.jvm_metrics.gc_count
(gauge)
Total GC count
hbase.jvm_metrics.gc_time_millis
(gauge)
Total GC time in milliseconds
Shown as millisecond
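
Some useful ratios can be derived from the raw gauges listed above. The following Python sketch is an illustration only: the function names are hypothetical, the inputs are values already read from `hbase.regionserver.server.block_cache_hit_count`, `hbase.regionserver.server.block_cache_miss_count`, `hbase.jvm_metrics.mem_heap_used_in_mb`, and `hbase.jvm_metrics.mem_heap_max_in_mb`, and the hit and miss counts are assumed to cover the same time window. The derived hit ratio is not guaranteed to match the `block_cache_hit_percent` gauge that the RegionServer reports itself.

```python
from typing import Optional


def percent(part: float, whole: float) -> Optional[float]:
    """Return part/whole as a percentage, or None when whole is not positive."""
    if whole <= 0:
        return None
    return 100.0 * part / whole


def block_cache_hit_percent(hit_count: float, miss_count: float) -> Optional[float]:
    """Share of block cache requests that were hits, from hit and miss counts."""
    return percent(hit_count, hit_count + miss_count)


def heap_used_percent(used_mb: float, max_mb: float) -> Optional[float]:
    """Heap usage as a percentage of the maximum heap size."""
    return percent(used_mb, max_mb)
```

Returning `None` for an empty denominator avoids reporting a misleading 0% when no requests have been served yet or the maximum heap size is not reported.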

Events

The HBase RegionServer check does not include any events.

Service Checks

The HBase RegionServer check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.