This check monitors MapR 6.1 and later through the Datadog Agent.
Follow the steps below to install this check and configure it for an Agent running on a host.
The MapR check is included in the Datadog Agent package, but it requires additional setup.
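- You have an available MapR user (with a username, password, UID, and GID) that has the 'consume' permission on the /var/mapr/mapr.monitoring/metricstreams stream. This can be an existing user or a newly created one.
- Allow the dd-agent user to impersonate this MapR user.
- Generate a long-lived service ticket for this user that the dd-agent user can read.
Installation steps for each node: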
Install the mapr-streams-library with the following command:
sudo -u dd-agent /opt/datadog-agent/embedded/bin/pip install --global-option=build_ext --global-option="--library-dirs=/opt/mapr/lib" --global-option="--include-dirs=/opt/mapr/include/" mapr-streams-python
If you use Python 3 with Agent v7, replace pip with pip3.
Add /opt/mapr/lib/ to /etc/ld.so.conf (or to a file in /etc/ld.so.conf.d/). The mapr-streams-library used by the Agent needs this to find the MapR shared libraries.
Reload the libraries by running:
sudo ldconfig
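Before running ldconfig, it can help to confirm that the MapR library directory is actually listed. The following is a minimal sketch, not part of the integration: the function names are hypothetical, and it assumes drop-in files in /etc/ld.so.conf.d/ end in .conf, which is the common convention but is not stated in this guide.

```python
from pathlib import Path

MAPR_LIB_DIR = "/opt/mapr/lib"  # directory named in the step above


def lists_directory(conf_text: str, directory: str = MAPR_LIB_DIR) -> bool:
    """Return True if a non-comment line of ld.so.conf text names `directory`."""
    wanted = directory.rstrip("/")
    for raw_line in conf_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if line and line.rstrip("/") == wanted:
            return True
    return False


def mapr_lib_configured(main_conf: str = "/etc/ld.so.conf",
                        conf_dir: str = "/etc/ld.so.conf.d") -> bool:
    """Check the main file and every *.conf file in the drop-in directory."""
    candidates = [Path(main_conf)]
    if Path(conf_dir).is_dir():
        # Assumption: drop-in files use the .conf suffix.
        candidates += sorted(Path(conf_dir).glob("*.conf"))
    return any(p.is_file() and lists_directory(p.read_text()) for p in candidates)


if __name__ == "__main__":
    print("MapR library directory configured:", mapr_lib_configured())
```

If this prints False, add the directory as described above and run sudo ldconfig again.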
Configure the integration by specifying the ticket location.
You can build a wheel file on a development machine by running:
sudo -u dd-agent /opt/datadog-agent/embedded/bin/pip wheel --global-option=build_ext --global-option="--library-dirs=/opt/mapr/lib" --global-option="--include-dirs=/opt/mapr/include/" mapr-streams-python
Then, on the production machine, run:
sudo -u dd-agent /opt/datadog-agent/embedded/bin/pip install <WHEEL_FILE>
If you use Python 3 with Agent v7, replace pip with pip3 in both commands.
To collect MapR performance data, edit the mapr.d/conf.yaml file in the conf.d/ folder. See the sample mapr.d/conf.yaml for the available configuration options.
Set the ticket_location parameter to the location of the ticket.
MapR uses fluentD for logs. Use the fluentD Datadog plugin to collect MapR logs. The following command downloads the plugin and installs it into the appropriate directory:
curl https://raw.githubusercontent.com/DataDog/fluent-plugin-datadog/master/lib/fluent/plugin/out_datadog.rb -o /opt/mapr/fluentd/fluentd-<VERSION>/lib/fluentd-<VERSION>-linux-x86_64/lib/app/lib/fluent/plugin/out_datadog.rb
Then update /opt/mapr/fluentd/fluentd-<VERSION>/etc/fluentd/fluentd.conf with the following section:
<match *>
  @type copy
  <store> # This section is here by default and sends the logs to Elasticsearch for Kibana.
    @include /opt/mapr/fluentd/fluentd-<VERSION>/etc/fluentd/es_config.conf
    include_tag_key true
    tag_key service_name
  </store>
  <store> # This section also forwards all logs to Datadog:
    @type datadog
    @id dd_agent
    include_tag_key true
    dd_source mapr # Sets "source: mapr" on every log to allow automatic parsing on Datadog.
    dd_tags "<KEY>:<VALUE>"
    service <SERVICE_NAME>
    api_key <API_KEY>
  </store>
</match>
See the fluent_datadog_plugin documentation for details about the available options.
Run the Agent's status subcommand and look for mapr under the Checks section.
mapr.metrics.submitted (gauge) | Number of metrics submitted every check run. |
mapr.drill.allocator_root_used (gauge) | The amount of memory used in bytes by the internal memory allocator. Shown as byte |
mapr.drill.allocator_root_peak (gauge) | The peak amount of memory used in bytes by the internal memory allocator. Shown as byte |
mapr.drill.blocked_count (gauge) | The number of threads that are blocked because they are waiting for a monitor lock. Shown as thread |
mapr.drill.count (gauge) | The number of live threads (including both daemon and non-daemon threads). Shown as thread |
mapr.drill.fd_usage (gauge) | The ratio of used to total file descriptors. |
mapr.drill.fragments_running (gauge) | The number of query fragments currently running in the drillbit. Shown as byte |
mapr.drill.heap_used (gauge) | The amount of heap memory used in bytes by the JVM. Shown as byte |
mapr.drill.non_heap_used (gauge) | The amount of non-heap memory used in bytes by the JVM. Shown as byte |
mapr.drill.queries_completed (count) | The number of completed, canceled or failed queries for which this drillbit is the foreman. Shown as byte |
mapr.drill.queries_running (gauge) | The number of running queries for which this drillbit is the foreman. Shown as byte |
mapr.drill.runnable_count (gauge) | The number of threads executing in the JVM. Shown as thread |
mapr.drill.waiting_count (gauge) | The number of threads that are waiting to be executed. This can occur when a thread must wait for another thread to perform an action before proceeding. Shown as thread |
mapr.alarms.alarm_raised (gauge) | Reports MapR alarms that have been raised. |
mapr.fs.reads (count) | The number of remote reads. Shown as read |
mapr.fs.writes (count) | The number of remote writes. Shown as write |
mapr.fs.read_bytes (count) | The amount of data read remotely in MB. Shown as mebibyte |
mapr.fs.bulk_writes (count) | The number of bulk-write operations. Bulk-write operations occur when the MapR filesystem container master aggregates multiple file writes from one or more clients into one RPC before replicating the writes. Shown as write |
mapr.fs.local_reads (count) | The number of file read operations by applications that are running on the MapR filesystem node. Shown as read |
mapr.fs.write_bytes (count) | The amount of data written remotely in MB. Shown as mebibyte |
mapr.fs.kvstore_scan (count) | The number of scan operations on key-value store files which are used by the CLDB and MapR database. Shown as operation |
mapr.fs.local_readbytes (count) | The number of bytes read by applications that are running on the MapR filesystem node. Shown as byte |
mapr.fs.local_writes (count) | The number of file write operations by applications that are running on the MapR filesystem node. Shown as operation |
mapr.fs.kvstore_delete (count) | The number of delete operations on key-value store files which are used by the CLDB and MapR database. Shown as operation |
mapr.fs.kvstore_insert (count) | The number of insert operations on key-value store files which are used by the CLDB and MapR database. Shown as operation |
mapr.fs.kvstore_lookup (count) | The number of lookup operations on key-value store files which are used by the CLDB and MapR database. Shown as operation |
mapr.fs.read_cachehits (count) | The number of cache hits for file reads. This value includes pages that the MapR filesystem populates using readahead mechanism. Shown as hit |
mapr.fs.bulk_writesbytes (count) | The number of bytes written by bulk-write operations. Bulk-write operations occur when the MapR filesystem container master aggregates multiple file writes from one or more clients into one RPC before replicating the writes. Shown as byte |
mapr.fs.read_cachemisses (count) | The number of cache misses for file read operations. Shown as miss |
mapr.fs.statstype_create (count) | The number of file create operations. Shown as operation |
mapr.fs.statstype_lookup (count) | The number of lookup operations. Shown as operation |
mapr.fs.statstype_read (count) | The number of file read operations. Shown as read |
mapr.fs.statstype_write (count) | The number of file write operations. Shown as write |
mapr.fs.local_writebytes (count) | The number of bytes written by applications that are running on the MapR filesystem node. Shown as byte |
mapr.cache.misses_dir (count) | The number of cache misses on the table LRU cache. Shown as miss |
mapr.cache.lookups_dir (count) | The number of cache lookups in the table LRU cache. The table LRU is used for storing internal B-Tree leaf pages. Shown as operation |
mapr.cache.misses_data (count) | The number of cache misses in the block cache. Shown as miss |
mapr.cache.misses_meta (count) | The number of cache misses on the meta LRU cache. Shown as miss |
mapr.cache.lookups_data (count) | The number of cache lookups in the block cache. Shown as operation |
mapr.cache.lookups_meta (count) | The number of cache lookups on the meta LRU cache. The meta LRU is used for storing internal B-Tree pages. Shown as operation |
mapr.cache.misses_inode (count) | The number of cache misses in the inode cache. Shown as miss |
mapr.cache.misses_table (count) | The number of cache misses on the table LRU cache. Shown as miss |
mapr.cache.lookups_inode (count) | The number of cache lookups in the inode cache. |
mapr.cache.lookups_table (count) | The number of cache lookups in the table LRU cache. The table LRU is used for storing internal B-Tree leaf pages. Shown as operation |
mapr.cache.misses_largefile (count) | The number of cache misses on the large file LRU cache. Shown as miss |
mapr.cache.misses_smallfile (count) | The number of cache misses on the small file LRU cache. Shown as miss |
mapr.cache.lookups_largefile (count) | The number of cache lookups in the large file LRU cache. The large file LRU is used for storing files with size greater than 64K and MapR database data pages. Shown as operation |
mapr.cache.lookups_smallfile (count) | The number of cache lookups on the small file LRU cache. This LRU is used for storing files with size less than 64K and MapR database index pages. Shown as operation |
mapr.cldb.cluster_cpu_total (gauge) | The number of physical CPUs in the cluster. Shown as cpu |
mapr.cldb.cluster_cpubusy_percent (gauge) | The aggregate percentage of busy CPUs in the cluster. Shown as percent |
mapr.cldb.cluster_disk_capacity (gauge) | The storage capacity for MapR disks in GB. Shown as gibibyte |
mapr.cldb.cluster_diskspace_used (gauge) | The amount of MapR disks used in GB. Shown as gibibyte |
mapr.cldb.cluster_memory_capacity (gauge) | The memory capacity in MB. Shown as mebibyte |
mapr.cldb.cluster_memory_used (gauge) | The amount of used memory in MB. Shown as mebibyte |
mapr.cldb.containers (gauge) | The number of containers currently in the cluster. Shown as container |
mapr.cldb.containers_created (count) | The cumulative number of containers created in the cluster. This value includes containers that have been deleted. Shown as container |
mapr.cldb.containers_unusable (gauge) | The number of containers that are no longer usable. The CLDB marks a container as unusable when the node that stores the container is offline for 1 hour or more. Shown as container |
mapr.cldb.disk_space_available (gauge) | The amount of disk space available in GB. Shown as gibibyte |
mapr.cldb.nodes_in_cluster (gauge) | The number of nodes in the cluster. Shown as node |
mapr.cldb.nodes_offline (gauge) | The number of nodes in the cluster that are offline. Shown as node |
mapr.cldb.rpc_received (count) | The number of RPCs received. Shown as operation |
mapr.cldb.rpcs_failed (count) | The number of RPCs failed. Shown as operation |
mapr.cldb.storage_pools_cluster (gauge) | The number of storage pools. |
mapr.cldb.storage_pools_offline (gauge) | The number of offline storage pools. |
mapr.cldb.volumes (gauge) | The number of volumes created, including system volumes. Shown as volume |
mapr.db.flushes (count) | The number of flushes that reorganize data from bucket files (unsorted data) to spill files (sorted data) when the bucket size exceeds a threshold. Shown as flush |
mapr.db.forceflushes (count) | The number of flushes that reorganize data from bucket files (unsorted data) to spill files (sorted data) when the in-memory bucket file cache fills up. Shown as flush |
mapr.db.get_rpcs (count) | The number of MapR Database get RPCs completed. Shown as operation |
mapr.db.get_bytes (count) | The number of bytes read by get RPCs. Shown as byte |
mapr.db.put_bytes (count) | The number of bytes written by put RPCs. Shown as byte |
mapr.db.scan_rpcs (count) | The number of MapR Database scan RPCs completed. Shown as operation |
mapr.db.scan_bytes (count) | The number of bytes read by scan RPCs. Shown as byte |
mapr.db.append_rpcs (count) | The number of MapR Database append RPCs completed. Shown as operation |
mapr.db.get_currpcs (gauge) | The number of MapR Database get RPCs in progress. Shown as operation |
mapr.db.put_currpcs (gauge) | The number of MapR Database put RPCs in progress. Shown as operation |
mapr.db.put_rpcs (count) | The number of MapR Database put RPCs completed. Shown as operation |
mapr.db.put_rpcrows (count) | The number of rows written by put RPCs. Each MapR Database put RPC can include multiple put rows. Shown as object |
mapr.db.ttlcompacts (count) | The number of compactions that result in reclamation of disk space due to removal of stale data. Shown as operation |
mapr.db.fullcompacts (count) | The number of compactions that combine multiple MapR Database data files containing sorted data (known as spills) into a single spill file. Shown as operation |
mapr.db.get_readrows (count) | The number of rows read by get RPCs. Shown as object |
mapr.db.get_resprows (count) | The number of rows returned from get RPCs. Shown as object |
mapr.db.minicompacts (count) | The number of compactions that combine multiple small data files containing sorted data (known as spills) into a single spill file. Shown as operation |
mapr.db.put_readrows (count) | The number of rows read by put RPCs. Shown as object |
mapr.db.scan_currpcs (gauge) | The number of MapR Database scan RPCs in progress. Shown as operation |
mapr.db.scan_resprows (count) | The number of rows returned from scan RPCs. Shown as object |
mapr.db.scan_readrows (count) | The number of rows read by scan RPCs. Shown as object |
mapr.db.append_rpcrows (count) | The number of rows written by append RPCs. Shown as object |
mapr.db.increment_rpcs (count) | The number of MapR Database increment RPCs completed. Shown as operation |
mapr.db.increment_bytes (count) | The number of bytes written by increment RPCs. Shown as byte |
mapr.db.valuecache_hits (count) | The number of MapR Database operations that utilized the MapR Database value cache. Shown as operation |
mapr.db.checkandput_rpcs (count) | The number of MapR Database check and put RPCs completed. Shown as operation |
mapr.db.checkandput_bytes (count) | The number of bytes written by check and put RPCs. Shown as byte |
mapr.db.increment_rpcrows (count) | The number of rows written by increment RPCs. Shown as object |
mapr.db.updateandget_rpcs (count) | The number of MapR Database update and get RPCs completed. Shown as operation |
mapr.db.updateandget_bytes (count) | The number of bytes written by update and get RPCs. Shown as byte |
mapr.db.valuecache_lookups (count) | The number of MapR Database operations that performed a lookup on the MapR Database value cache. Shown as operation |
mapr.db.checkandput_rpcrows (count) | The number of rows written by check and put RPCs. Shown as object |
mapr.db.valuecache_usedSize (gauge) | The MapR Database value cache size in MB. Shown as mebibyte |
mapr.db.append_bytes (count) | The number of bytes written by append RPCs. Shown as byte |
mapr.db.updateandget_rpcrows (count) | The number of rows written by update and get RPCs. Shown as object |
mapr.db.cdc.sent_bytes (count) | The number of bytes of CDC data sent. Shown as byte |
mapr.db.cdc.pending_bytes (gauge) | The number of bytes of CDC data remaining to be sent. Shown as byte |
mapr.db.repl.sent_bytes (count) | The number of bytes sent to replicate data. Shown as byte |
mapr.db.repl.pending_bytes (gauge) | The number of bytes of replication data remaining to be sent. Shown as byte |
mapr.db.table.latency (gauge) | The latency of RPC operations on tables, represented as a histogram. Endpoints identify histogram bucket boundaries. Shown as millisecond |
mapr.db.table.read_rows (count) | The number of rows read from tables. Shown as object |
mapr.db.table.resp_rows (count) | The number of rows returned from tables. Shown as object |
mapr.db.table.read_bytes (count) | The number of bytes read from tables. Shown as byte |
mapr.db.table.rpcs (count) | The number of RPC calls completed on tables. Shown as operation |
mapr.db.table.write_rows (count) | The number of rows written to tables. Shown as object |
mapr.db.table.write_bytes (count) | The number of bytes written to tables. Shown as byte |
mapr.db.table.value_cache_hits (count) | The number of MapR Database operations on tables that utilized the MapR Database value cache. Shown as operation |
mapr.db.table.value_cache_lookups (count) | The number of MapR Database operations on tables that performed a lookup on the MapR Database value cache. Shown as operation |
mapr.db.index.pending_bytes (gauge) | The number of bytes of secondary index data remaining to be sent. Shown as byte |
mapr.io.write_bytes (count) | The number of MB written to disk. Shown as mebibyte |
mapr.io.writes (count) | The number of MapR Filesystem disk write operations. Shown as write |
mapr.io.reads (gauge) | The number of MapR Filesystem disk read operations. Shown as read |
mapr.io.read_bytes (gauge) | The number of MB read from disk. Shown as mebibyte |
mapr.process.vm (gauge) | The amount of virtual memory in MB used by MapR processes. Shown as mebibyte |
mapr.process.rss (gauge) | The actual amount of memory in MB used by MapR processes. Shown as mebibyte |
mapr.process.data (gauge) | The amount of memory in MB used by the data segments of MapR processes. Shown as mebibyte |
mapr.process.cpu_percent (gauge) | The percentage of CPU used for MapR processes. Shown as percent |
mapr.process.mem_percent (gauge) | The percentage of total system memory (not capped by MapR processes) used for MapR processes. Shown as percent |
mapr.process.cpu_time.syst (count) | The amount of time measured in seconds that the process has been in kernel mode. Shown as second |
mapr.process.cpu_time.user (count) | The amount of time measured in seconds that the process has been in user mode. Shown as second |
mapr.process.disk_ops.read (count) | The number of read operations for MapR processes. Shown as read |
mapr.process.disk_ops.write (count) | The number of write operations for MapR processes. Shown as write |
mapr.process.disk_octets.read (count) | The number of bytes read from disk for MapR processes. Shown as byte |
mapr.process.disk_octets.write (count) | The number of bytes written to disk for MapR processes. Shown as byte |
mapr.process.page_faults.majflt (count) | The number of major MapR process faults that required loading a memory page from disk. Shown as error |
mapr.process.page_faults.minflt (count) | The number of minor MapR process faults, which did not require loading a memory page from disk. Shown as error |
mapr.process.context_switch_voluntary (count) | The number of voluntary context switches for MapR processes. Shown as process |
mapr.process.context_switch_involuntary (count) | The number of involuntary context switches for MapR processes. Shown as operation |
mapr.streams.listen_msgs (count) | The number of Streams messages read by the consumer. Shown as object |
mapr.streams.listen_rpcs (count) | The number of Streams consumer RPCs. Shown as object |
mapr.streams.listen_bytes (count) | The number of megabytes consumed by Streams messages. Shown as mebibyte |
mapr.streams.produce_msgs (count) | The number of Streams messages produced. Shown as object |
mapr.streams.produce_rpcs (count) | The number of Streams producer RPCs. Shown as object |
mapr.streams.produce_bytes (count) | The number of megabytes produced by Streams messages. Shown as mebibyte |
mapr.streams.listen_currpcs (gauge) | The number of concurrent Stream consumer RPCs. Shown as object |
mapr.rpc.bytes_recd (count) | The number of bytes received by the MapR Filesystem over RPC. Shown as byte |
mapr.rpc.bytes_sent (count) | The number of bytes sent by the MapR filesystem over RPC. Shown as byte |
mapr.rpc.calls_recd (count) | The number of RPC calls received by the MapR filesystem. Shown as message |
mapr.topology.utilization (gauge) | The aggregate percentage of CPU utilization. Shown as percent |
mapr.topology.disks_used_capacity (gauge) | The amount of disk space used in gigabytes. Shown as gibibyte |
mapr.topology.disks_total_capacity (gauge) | The disk capacity in gigabytes. Shown as gibibyte |
mapr.volume.logical_used (gauge) | The number of MBs used for logical volumes before compression is applied to the files. Shown as mebibyte |
mapr.volume.snapshot_used (gauge) | The number of MBs used for snapshots. Shown as mebibyte |
mapr.volume.total_used (gauge) | The number of MB used for volumes and snapshots. Shown as mebibyte |
mapr.volume.used (gauge) | The number of MB used for volumes after compression is applied to the files. Shown as mebibyte |
mapr.volume.quota (gauge) | The number of megabytes (MB) used for volume quota. Shown as mebibyte |
mapr.volmetrics.read_throughput (gauge) | The per volume read throughput in KB. Shown as kibibyte |
mapr.volmetrics.write_throughput (gauge) | The per volume write throughput in KB. Shown as kibibyte |
mapr.volmetrics.read_latency (gauge) | The per volume read latency in milliseconds. Shown as millisecond |
mapr.volmetrics.write_latency (gauge) | The per volume write latency in milliseconds. Shown as millisecond |
mapr.volmetrics.read_ops (count) | A count of the read operations per volume. Shown as operation |
mapr.volmetrics.write_ops (count) | A count of the write operations per volume. Shown as operation |
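The cache metrics in the list above come in lookup and miss pairs (for example mapr.cache.lookups_inode and mapr.cache.misses_inode). One way such a pair might be combined into a hit ratio is sketched below. The helper name and the sample numbers are hypothetical, and the sketch assumes both counters cover the same interval.

```python
from typing import Optional


def hit_ratio(lookups: float, misses: float) -> Optional[float]:
    """Fraction of lookups served from cache, or None when there were no lookups."""
    if lookups <= 0:
        return None
    # Clamp the result in case the two counters were sampled at slightly different times.
    return max(0.0, min(1.0, 1.0 - misses / lookups))


# Hypothetical per-interval values for mapr.cache.lookups_inode and mapr.cache.misses_inode.
sample = {"lookups": 2000, "misses": 150}
print(hit_ratio(sample["lookups"], sample["misses"]))
```

The same calculation applies to the other pairs, such as mapr.cache.lookups_data and mapr.cache.misses_data.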
The MapR check does not include any events.
mapr.can_connect
Returns CRITICAL if the Agent fails to connect and subscribe to the stream topic, OK otherwise.
Statuses: ok, critical
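The Agent is stuck in a crash loop after configuring the MapR integration.
In a few cases, the C library inside mapr-streams-python has segfaulted because of permission issues. Make sure that the dd-agent user has read permission on the ticket file, and that the dd-agent user can run maprcli commands when the MAPR_TICKETFILE_LOCATION environment variable points to the ticket.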
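The file-level part of that check can be scripted. The sketch below is only an illustration: ticket_problems is a hypothetical helper, it must be run as the dd-agent user to be meaningful, and it does not run maprcli itself.

```python
import os
from typing import List


def ticket_problems(ticket_path: str) -> List[str]:
    """Return problems found for the current user; an empty list means none were found."""
    problems = []
    if not os.path.isfile(ticket_path):
        problems.append(f"{ticket_path} does not exist or is not a regular file")
    elif not os.access(ticket_path, os.R_OK):
        problems.append(f"{ticket_path} is not readable by uid {os.getuid()}")
    # The environment variable should point at the same ticket file.
    env_value = os.environ.get("MAPR_TICKETFILE_LOCATION")
    if env_value != ticket_path:
        problems.append(
            f"MAPR_TICKETFILE_LOCATION is {env_value!r}, expected {ticket_path!r}"
        )
    return problems
```

An empty result only rules out the file permission and environment variable issues. Whether the dd-agent user can run maprcli commands still has to be confirmed separately.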
The integration seems to work correctly but does not send any metrics.
Let the Agent run for at least a few minutes: the integration pulls data from a topic, and MapR needs to push data into that topic first.
If that does not help, but running the Agent manually with sudo shows data, the problem is with permissions. Double-check everything. The dd-agent Linux user must be able to use a locally stored ticket that lets it run queries against MapR as user X (which may or may not be dd-agent itself). In addition, user X needs the consume permission on the /var/mapr/mapr.monitoring/metricstreams stream.
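You see the message confluent_kafka was not imported correctly ...
This message appears when the Agent's embedded environment could not run the command import confluent_kafka. It means that either the mapr-streams-library is not installed inside the embedded environment, or the mapr-core libraries cannot be found. The error message gives more details.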
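To see the underlying error directly, the import can be attempted by hand with the Agent's embedded Python interpreter. This is a minimal sketch, and import_status is a hypothetical helper rather than part of the Agent. A missing package and a shared library that cannot be loaded both typically surface as an ImportError, with the reason in the message.

```python
import importlib
from typing import Tuple


def import_status(module_name: str = "confluent_kafka") -> Tuple[bool, str]:
    """Try to import a module and return (succeeded, detail)."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # The exception text usually says whether the package or a shared library is missing.
        return False, f"{type(exc).__name__}: {exc}"
    return True, getattr(module, "__file__", None) or "built-in module"


if __name__ == "__main__":
    ok, detail = import_status()
    print("import confluent_kafka:", "ok" if ok else "failed", "-", detail)
```

On success, the detail is the path the module was loaded from, which shows whether it came from the embedded environment.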
Need help? Contact Datadog support.