Hbase Master

Supported OS: Linux, Mac OS, Windows

Integration v1.1.0

Overview

Get metrics from the Hbase_master service in real time to:

  • Visualize and monitor Hbase_master states.
  • Be notified about Hbase_master failovers and events.

Setup

The Hbase_master check is not included in the Datadog Agent package, so you need to install it yourself.

Installation

For Agent v7.21+ / v6.21+, follow the instructions below to install the Hbase_master check on your host. To install it with the Docker Agent or with an earlier Agent version, see Use Community Integrations.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-hbase_master==<INTEGRATION_VERSION>
    
  2. Configure your integration the same way as a core integration.

Configuration

  1. To collect Hbase_master metrics, edit the hbase_master.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample hbase_master.d/conf.yaml for all available configuration options.

    Note: If you are using Agent 6, modify the hbase_master.d/metrics.yaml file and wrap boolean keys in quotes.

      - include:
          domain: Hadoop
          bean:
            - Hadoop:service=HBase,name=Master,sub=Server
          attribute:
            # Is Active Master
            tag.isActiveMaster:
               metric_type: gauge
               alias: hbase.master.server.tag.is_active_master
               values: {"true": 1, "false": 0, default: 0}
    
  2. Restart the Agent.
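
The Agent 6 note above comes down to turning bare true/false mapping keys into quoted strings. As a minimal sketch, the Python function below quotes such keys in one line of YAML text. The helper name quote_boolean_keys is hypothetical and is not part of the Agent or the integration; it is a plain regular-expression rewrite, not a YAML parser, so review the result before saving it.

```python
import re

# Matches a bare true/false used as a mapping key (followed by a colon)
# that is not already quoted and is not part of a longer word.
_BOOL_KEY = re.compile(r'(?<![\w"\'])(true|false)(?=\s*:)', re.IGNORECASE)


def quote_boolean_keys(line: str) -> str:
    """Wrap bare boolean mapping keys in double quotes (hypothetical helper)."""
    return _BOOL_KEY.sub(lambda m: '"' + m.group(1) + '"', line)
```

For example, passing the text `values: {true: 1, false: 0, default: 0}` yields the quoted form used in the metrics.yaml snippet above, while a line whose keys are already quoted is returned unchanged.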

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. To start collecting Hbase_master logs, add this configuration block to your hbase_master.d/conf.yaml file:

    logs:
      - type: file
        path: /path/to/my/directory/file.log
        source: hbase
    

    Change the path parameter value to match your environment. See the sample hbase_master.d/conf.yaml for all available configuration options.

  3. Restart the Agent.

Validation

Run the Agent's status subcommand and look for hbase_master under the Checks section.
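
The validation step can also be scripted. The sketch below is an assumption-laden illustration, not an official tool: the function names are hypothetical, it only searches the status text for the check name rather than parsing sections, and the exact way to invoke the Agent (for example, whether elevated privileges are needed) depends on your platform.

```python
import re
import subprocess


def check_is_listed(status_output: str, check_name: str) -> bool:
    """Return True if check_name appears as a whole word in the status text."""
    pattern = r"(?<![\w.-])" + re.escape(check_name) + r"(?![\w-])"
    return re.search(pattern, status_output) is not None


def agent_status() -> str:
    """Run the Agent's status subcommand and return its standard output.

    Assumes the datadog-agent binary is on PATH for the current user.
    """
    result = subprocess.run(
        ["datadog-agent", "status"],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

On a host where the Agent is installed, `check_is_listed(agent_status(), "hbase_master")` reports whether the check name shows up in the status output.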

Data Collected

Metrics

hbase.master.assignmentmanager.rit_oldest_age
(gauge)
The age of the longest region in transition, in milliseconds
Shown as millisecond
hbase.master.assignmentmanager.rit_count_over_threshold
(gauge)
The number of regions that have been in transition longer than a threshold time
hbase.master.assignmentmanager.rit_count
(gauge)
The number of regions in transition
hbase.master.assignmentmanager.assign.min
(gauge)
hbase.master.assignmentmanager.assign.max
(gauge)
hbase.master.assignmentmanager.assign.mean
(gauge)
hbase.master.assignmentmanager.assign.median
(gauge)
hbase.master.assignmentmanager.assign.percentile.99
(gauge)
hbase.master.ipc.queue_size
(gauge)
Number of bytes in the call queues.
Shown as byte
hbase.master.ipc.num_calls_in_general_queue
(gauge)
Number of calls in the general call queue.
hbase.master.ipc.num_calls_in_replication_queue
(gauge)
Number of calls in the replication call queue.
hbase.master.ipc.num_calls_in_priority_queue
(gauge)
Number of calls in the priority call queue.
hbase.master.ipc.num_open_connections
(gauge)
Number of open connections.
hbase.master.ipc.num_active_handler
(gauge)
Number of active rpc handlers.
hbase.master.ipc.total_call_time.max
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.mean
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.median
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.ipc.total_call_time.percentile.99
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.master.server.tag.is_active_master
(gauge)
Is Active Master
hbase.master.server.num_region_servers
(gauge)
Number of RegionServers
hbase.master.server.num_dead_region_servers
(gauge)
Number of dead RegionServers
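
Because hbase.master.server.tag.is_active_master is reported as 1 on the active master and 0 otherwise, a simple sanity check is to confirm that exactly one master host reports 1. The sketch below assumes you have already fetched the latest gauge value per host by some means; the function names and host names are hypothetical.

```python
from typing import List, Mapping


def active_masters(is_active_by_host: Mapping[str, float]) -> List[str]:
    """Return the hosts whose is_active_master gauge currently reports 1."""
    return sorted(host for host, value in is_active_by_host.items() if value == 1)


def has_single_active_master(is_active_by_host: Mapping[str, float]) -> bool:
    """True when exactly one host reports itself as the active master."""
    return len(active_masters(is_active_by_host)) == 1
```

For instance, `has_single_active_master({"master-a": 1, "master-b": 0})` is true, while a mapping with zero or two active hosts is flagged as unexpected.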

Events

The Hbase_master check does not include any events.

Service Checks

The Hbase_master check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.

HBase RegionServer Integration

Overview

Get metrics from the HBase RegionServer service in real time to:

  • Visualize and monitor HBase RegionServer states.
  • Be notified about HBase RegionServer failovers and events.

Setup

The HBase RegionServer check is not included in the Datadog Agent package, so you need to install it yourself.

Installation

For Agent v7.21+ / v6.21+, follow the instructions below to install the HBase RegionServer check on your host. To install it with the Docker Agent or with an earlier Agent version, see Use Community Integrations.

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-hbase_regionserver==<INTEGRATION_VERSION>
    
  2. Configure your integration the same way as a core integration.

Configuration

  1. To collect HBase RegionServer metrics, edit the hbase_regionserver.d/conf.yaml file in the conf.d/ folder at the root of your Agent's configuration directory. See the sample hbase_regionserver.d/conf.yaml for all available configuration options.

  2. Restart the Agent.

Log collection

  1. Log collection is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
    
  2. To start collecting Hbase_regionserver logs, add this configuration block to your hbase_regionserver.d/conf.yaml file:

    logs:
      - type: file
        path: /path/to/my/directory/file.log
        source: hbase
    

    Change the path parameter value to match your environment. See the sample hbase_regionserver.d/conf.yaml for all available configuration options.

  3. Restart the Agent.

Validation

Run the Agent's status subcommand and look for hbase_regionserver under the Checks section.

Data Collected

Metrics

hbase.regionserver.ipc.queue_size
(gauge)
Number of bytes in the call queues.
Shown as byte
hbase.regionserver.ipc.num_open_connections
(gauge)
Number of open connections.
hbase.regionserver.ipc.num_active_handler
(gauge)
Number of active rpc handlers.
hbase.regionserver.ipc.total_call_time.max
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.mean
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.median
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.ipc.total_call_time.percentile.99
(gauge)
Total call time, including both queued and processing time.
Shown as millisecond
hbase.regionserver.regions.num_regions
(gauge)
Number of regions in the metrics system
hbase.regionserver.replication.sink.applied_ops
(gauge)
Number of WAL entries applied on replication sink.
hbase.regionserver.replication.sink.age_of_last_applied_op
(gauge)
Replication time lag of last applied WAL entry between source and sink.
Shown as millisecond
hbase.regionserver.replication.sink.applied_batches
(gauge)
Number of WAL applying operations processed on replication sink.
hbase.regionserver.server.region_count
(gauge)
Number of regions
hbase.regionserver.server.store_count
(gauge)
Number of Stores
hbase.regionserver.server.hlog_file_count
(gauge)
Number of WAL Files
hbase.regionserver.server.hlog_file_size
(gauge)
Size of all WAL Files
Shown as byte
hbase.regionserver.server.store_file_count
(gauge)
Number of Store Files
hbase.regionserver.server.mem_store_size
(gauge)
Size of the memstore
Shown as byte
hbase.regionserver.server.store_file_size
(gauge)
Size of storefiles being served.
Shown as byte
hbase.regionserver.server.total_request_count
(gauge)
Total number of requests this RegionServer has answered.
hbase.regionserver.server.read_request_count
(gauge)
Number of read requests this region server has answered.
hbase.regionserver.server.write_request_count
(gauge)
Number of mutation requests this region server has answered.
hbase.regionserver.server.check_mutate_failed_count
(gauge)
Number of Check and Mutate calls that failed the checks.
hbase.regionserver.server.check_mutate_passed_count
(gauge)
Number of Check and Mutate calls that passed the checks.
hbase.regionserver.server.store_file_index_size
(gauge)
Size of indexes in storefiles on disk.
Shown as byte
hbase.regionserver.server.static_index_size
(gauge)
Uncompressed size of the static indexes.
Shown as byte
hbase.regionserver.server.static_bloom_size
(gauge)
Uncompressed size of the static bloom filters.
Shown as byte
hbase.regionserver.server.mutations_without_wal_count
(count)
Number of mutations that have been sent by clients with the write ahead logging turned off.
hbase.regionserver.server.mutations_without_wal_size
(gauge)
Size of data that has been sent by clients with the write ahead logging turned off.
Shown as byte
hbase.regionserver.server.percent_files_local
(gauge)
The percent of HFiles that are stored on the local hdfs data node.
Shown as percent
hbase.regionserver.server.percent_files_local_secondary_regions
(gauge)
The percent of HFiles used by secondary regions that are stored on the local hdfs data node.
Shown as percent
hbase.regionserver.server.split_queue_length
(gauge)
Length of the queue for splits.
hbase.regionserver.server.compaction_queue_length
(gauge)
Length of the queue for compactions.
hbase.regionserver.server.flush_queue_length
(gauge)
Length of the queue for region flushes
hbase.regionserver.server.block_cache_free_size
(gauge)
Size of the block cache that is not occupied.
Shown as byte
hbase.regionserver.server.block_cache_count
(gauge)
Number of blocks in the block cache.
hbase.regionserver.server.block_cache_size
(gauge)
Size of the block cache.
Shown as byte
hbase.regionserver.server.block_cache_hit_count
(gauge)
Count of hits on the block cache.
hbase.regionserver.server.block_cache_hit_count_primary
(gauge)
Count of hits on the primary replica in the block cache.
hbase.regionserver.server.block_cache_miss_count
(gauge)
Number of requests for a block that missed the block cache.
hbase.regionserver.server.block_cache_miss_count_primary
(gauge)
Number of requests for a block of primary replica that missed the block cache.
hbase.regionserver.server.block_cache_eviction_count
(gauge)
Count of the number of blocks evicted from the block cache.
hbase.regionserver.server.block_cache_eviction_count_primary
(gauge)
Count of the number of blocks evicted from primary replica in the block cache.
hbase.regionserver.server.block_cache_hit_percent
(gauge)
Percent of block cache requests that are hits
Shown as percent
hbase.regionserver.server.block_cache_express_hit_percent
(gauge)
The percent of the time that requests with the cache turned on hit the cache.
Shown as percent
hbase.regionserver.server.block_cache_failed_insertion_count
(gauge)
Number of times that a block cache insertion failed. Usually due to size restrictions.
Shown as millisecond
hbase.regionserver.server.updates_blocked_time
(gauge)
Number of milliseconds that updates have been blocked so that the memstore can be flushed.
Shown as millisecond
hbase.regionserver.server.flushed_cells_count
(gauge)
The number of cells flushed to disk
hbase.regionserver.server.compacted_cells_count
(gauge)
The number of cells processed during minor compactions
hbase.regionserver.server.major_compacted_cells_count
(gauge)
The number of cells processed during major compactions
hbase.regionserver.server.flushed_cells_size
(gauge)
The total amount of data flushed to disk, in bytes
Shown as byte
hbase.regionserver.server.compacted_cells_size
(gauge)
The total amount of data processed during minor compactions, in bytes
Shown as byte
hbase.regionserver.server.major_compacted_cells_size
(gauge)
The total amount of data processed during major compactions, in bytes
Shown as byte
hbase.regionserver.server.blocked_request_count
(gauge)
The number of requests blocked because the memstore size is larger than blockingMemStoreSize
hbase.regionserver.server.hedged_read
(gauge)
hbase.regionserver.server.hedged_read_wins
(gauge)
hbase.regionserver.server.pause_time_with_gc_num_ops
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.min
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.max
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.mean
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.median
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_with_gc.percentile.99
(gauge)

Shown as millisecond
hbase.regionserver.server.mutate.num_ops
(gauge)
hbase.regionserver.server.mutate.min
(gauge)
hbase.regionserver.server.mutate.max
(gauge)
hbase.regionserver.server.mutate.mean
(gauge)
hbase.regionserver.server.mutate.median
(gauge)
hbase.regionserver.server.mutate.percentile.99
(gauge)
hbase.regionserver.server.slow_append_count
(gauge)
The number of Appends that took over 1000ms to complete
hbase.regionserver.server.pause_warn_threshold_exceeded
(gauge)
hbase.regionserver.server.slow_delete_count
(gauge)
The number of Deletes that took over 1000ms to complete
hbase.regionserver.server.increment.num_ops
(gauge)
hbase.regionserver.server.increment.min
(gauge)
hbase.regionserver.server.increment.max
(gauge)
hbase.regionserver.server.increment.mean
(gauge)
hbase.regionserver.server.increment.median
(gauge)
hbase.regionserver.server.increment.percentile.99
(gauge)
hbase.regionserver.server.replay.num_ops
(gauge)
hbase.regionserver.server.replay.min
(gauge)
hbase.regionserver.server.replay.max
(gauge)
hbase.regionserver.server.replay.mean
(gauge)
hbase.regionserver.server.replay.median
(gauge)
hbase.regionserver.server.replay.percentile.99
(gauge)
hbase.regionserver.server.flush_time.num_ops
(gauge)

Shown as millisecond
hbase.regionserver.server.flush_time.min
(gauge)

Shown as millisecond
hbase.regionserver.server.flush_time.max
(gauge)

Shown as millisecond
hbase.regionserver.server.flush_time.mean
(gauge)

Shown as millisecond
hbase.regionserver.server.flush_time.median
(gauge)

Shown as millisecond
hbase.regionserver.server.flush_time.percentile.99
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_info_threshold_exceeded
(gauge)
hbase.regionserver.server.delete.num_ops
(gauge)
hbase.regionserver.server.delete.min
(gauge)
hbase.regionserver.server.delete.max
(gauge)
hbase.regionserver.server.delete.mean
(gauge)
hbase.regionserver.server.delete.median
(gauge)
hbase.regionserver.server.delete.percentile.99
(gauge)
hbase.regionserver.server.split_request_count
(gauge)
Number of splits requested
hbase.regionserver.server.split_success_count
(gauge)
Number of successfully executed splits
hbase.regionserver.server.slow_get_count
(gauge)
The number of Gets that took over 1000ms to complete
hbase.regionserver.server.get.num_ops
(gauge)
hbase.regionserver.server.get.min
(gauge)
hbase.regionserver.server.get.max
(gauge)
hbase.regionserver.server.get.mean
(gauge)
hbase.regionserver.server.get.median
(gauge)
hbase.regionserver.server.get.percentile.99
(gauge)
hbase.regionserver.server.scan_next.num_ops
(gauge)
hbase.regionserver.server.scan_next.min
(gauge)
hbase.regionserver.server.scan_next.max
(gauge)
hbase.regionserver.server.scan_next.mean
(gauge)
hbase.regionserver.server.scan_next.median
(gauge)
hbase.regionserver.server.scan_next.percentile.99
(gauge)
hbase.regionserver.server.pause_time_without_gc.num_ops
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.min
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.max
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.mean
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.median
(gauge)

Shown as millisecond
hbase.regionserver.server.pause_time_without_gc.percentile.99
(gauge)

Shown as millisecond
hbase.regionserver.server.slow_put_count
(gauge)
The number of Multis that took over 1000ms to complete
hbase.regionserver.server.slow_increment_count
(gauge)
The number of Increments that took over 1000ms to complete
hbase.regionserver.server.split_time.num_ops
(gauge)

Shown as millisecond
hbase.regionserver.server.split_time.min
(gauge)

Shown as millisecond
hbase.regionserver.server.split_time.max
(gauge)

Shown as millisecond
hbase.regionserver.server.split_time.mean
(gauge)

Shown as millisecond
hbase.regionserver.server.split_time.median
(gauge)

Shown as millisecond
hbase.regionserver.server.split_time.percentile.99
(gauge)

Shown as millisecond
hbase.regionserver.wal.append_size.num_ops
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.min
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.max
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.mean
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.median
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.append_size.percentile.99
(gauge)
Size (in bytes) of the data appended to the WAL.
Shown as byte
hbase.regionserver.wal.sync_time.num_ops
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.min
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.max
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.mean
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.median
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.sync_time.percentile.99
(gauge)
The time it took to sync the WAL to HDFS.
Shown as millisecond
hbase.regionserver.wal.slow_append_count
(gauge)
Number of appends that were slow.
hbase.regionserver.wal.roll_request
(gauge)
How many times a log roll has been requested total
Shown as millisecond
hbase.regionserver.wal.append_count
(gauge)
Number of appends to the write ahead log.
hbase.regionserver.wal.low_replica_roll_request
(gauge)
How many times a log roll was requested due to too few DataNodes in the write pipeline.
Shown as millisecond
hbase.regionserver.wal.append_time.num_ops
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.min
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.max
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.mean
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.median
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.regionserver.wal.append_time.percentile.99
(gauge)
Time an append to the log took.
Shown as millisecond
hbase.jvm_metrics.mem_non_heap_used_in_mb
(gauge)
Non-heap memory used in MB
hbase.jvm_metrics.mem_non_heap_committed_in_mb
(gauge)
Non-heap memory committed in MB
hbase.jvm_metrics.mem_non_heap_max_in_mb
(gauge)
Non-heap memory max in MB
hbase.jvm_metrics.mem_heap_used_in_mb
(gauge)
Heap memory used in MB
hbase.jvm_metrics.mem_heap_committed_in_mb
(gauge)
Heap memory committed in MB
hbase.jvm_metrics.mem_heap_max_in_mb
(gauge)
Heap memory max in MB
hbase.jvm_metrics.mem_max_in_mb
(gauge)
Max memory size in MB
hbase.jvm_metrics.gc_count_par_new
(gauge)
GC Count for ParNew
hbase.jvm_metrics.gc_time_millis_par_new
(gauge)
GC Time for ParNew
Shown as millisecond
hbase.jvm_metrics.gc_count_concurrent_mark_sweep
(gauge)
GC Count for ConcurrentMarkSweep
hbase.jvm_metrics.gc_time_millis_concurrent_mark_sweep
(gauge)
GC Time for ConcurrentMarkSweep
Shown as millisecond
hbase.jvm_metrics.gc_count
(gauge)
Total GC count
hbase.jvm_metrics.gc_time_millis
(gauge)
Total GC time in milliseconds
Shown as millisecond
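
Several of the RegionServer metrics above are related by simple arithmetic. As an illustration, the sketch below recomputes a block cache hit percentage from hbase.regionserver.server.block_cache_hit_count and hbase.regionserver.server.block_cache_miss_count, assuming the percentage is hits divided by hits plus misses. That formula is an assumption based on the metric descriptions, not a statement of how HBase derives block_cache_hit_percent, and the function name is hypothetical.

```python
def block_cache_hit_percent(hit_count: float, miss_count: float) -> float:
    """Percentage of block cache requests that were hits (assumed formula)."""
    total = hit_count + miss_count
    if total == 0:
        # No requests observed yet; report 0 rather than dividing by zero.
        return 0.0
    return 100.0 * hit_count / total
```

Feeding it the difference between two samples of each counter, rather than the raw lifetime totals, gives the hit percentage for that interval only.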

Events

The HBase RegionServer check does not include any events.

Service Checks

The HBase RegionServer check does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.