This check monitors Ignite.
The Ignite check is included in the Datadog Agent package. No additional installation is needed on your server.
The JMX metrics exporter is enabled by default, but you may need to choose the exposed port or enable authentication, depending on your network security. The official Docker image uses port 49112 by default.
For logging, it’s strongly suggested to enable log4j to benefit from a log format with full dates.
To configure this check for an Agent running on a host:
Edit the ignite.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your Ignite performance data. See the sample ignite.d/conf.yaml for all available configuration options.
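
As a rough illustration of the file described above, a minimal ignite.d/conf.yaml might look like the sketch below. It assumes that the generic JMX check options (is_jmx, collect_default_metrics, host, port, user, password) apply to this integration, and that the Ignite node runs on the same host with JMX exposed on port 49112, as with the official Docker image. Treat the sample ignite.d/conf.yaml as the authoritative list of options.

```yaml
init_config:
  # Assumption: Ignite metrics are collected through the Agent's JMX check.
  is_jmx: true
  # Load the metric definitions shipped with the integration.
  collect_default_metrics: true

instances:
    # Assumed values: a local Ignite node exposing JMX on port 49112.
  - host: localhost
    port: 49112
    # Uncomment if JMX authentication is enabled on the Ignite node.
    # user: <JMX_USER>
    # password: <JMX_PASSWORD>
```

Restart the Agent after editing the file so the new configuration is picked up.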
This check has a limit of 350 metrics per instance. The number of returned metrics is indicated in the status page. You can specify the metrics you are interested in by editing the configuration below. To learn how to customize the metrics to collect, see the JMX Checks documentation. If you need to monitor more metrics, contact Datadog support.
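
One possible way to stay under the limit is to list only the beans and attributes you need. The sketch below assumes the include filter syntax from the JMX Checks documentation (domain, bean_regex, attribute with alias and metric_type) and assumes that Ignite MBeans live in the org.apache domain. The bean pattern, attribute name, and metric alias are placeholders, not real Ignite identifiers.

```yaml
init_config:
  is_jmx: true
  # Assumption: disabling the bundled definitions leaves only the filters below.
  collect_default_metrics: false

instances:
  - host: localhost
    port: 49112
    conf:
      - include:
          # Assumed JMX domain for Ignite MBeans.
          domain: org.apache
          # Placeholder: regular expression matching the bean names you care about.
          bean_regex: <BEAN_NAME_PATTERN>
          attribute:
            # Placeholder attribute name, mapped to a metric name of your choice.
            <ATTRIBUTE_NAME>:
              alias: ignite.<CUSTOM_METRIC_NAME>
              metric_type: gauge
```

After a change like this, the status page shows how many metrics the instance now returns.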
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. Enable it in datadog.yaml:

    logs_enabled: true

Add this configuration block to your ignite.d/conf.yaml file to start collecting your Ignite logs:

    logs:
      - type: file
        path: <IGNITE_HOME>/work/log/ignite-*.log
        source: ignite
        service: '<SERVICE_NAME>'
        log_processing_rules:
          - type: multi_line
            name: new_log_start_with_date
            pattern: \[\d{4}\-\d{2}\-\d{2}

Change the path and service parameter values to match your environment. See the sample ignite.d/conf.yaml for all available configuration options.
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
To collect metrics with the Datadog-Ignite integration, see the Autodiscovery with JMX guide.
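
For orientation, a Docker Compose service carrying Autodiscovery labels for a JMX check could look like the sketch below. It assumes the com.datadoghq.ad.check_names, com.datadoghq.ad.init_configs, and com.datadoghq.ad.instances label names and the %%host%% template variable. It also assumes the apacheignite/ignite image with JMX on port 49112, and an Agent image that includes JMX support. The Autodiscovery with JMX guide remains the reference for the exact format.

```yaml
services:
  ignite:
    # Assumed image name for the official Ignite Docker image.
    image: apacheignite/ignite
    labels:
      # Name of the check the Agent should schedule for this container.
      com.datadoghq.ad.check_names: '["ignite"]'
      # Same init_config keys as a host-based JMX check (assumed).
      com.datadoghq.ad.init_configs: '[{"is_jmx": true, "collect_default_metrics": true}]'
      # %%host%% is resolved by the Agent to the container's address.
      com.datadoghq.ad.instances: '[{"host": "%%host%%", "port": "49112"}]'
```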
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. To enable it, see Docker log collection.
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "ignite", "service": "<SERVICE_NAME>", "log_processing_rules": [{"type": "multi_line", "name": "new_log_start_with_date", "pattern": "\d{4}\-\d{2}\-\d{2}"}]} |
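
As an illustration of where <LOG_CONFIG> goes, the sketch below attaches it to a container through a Docker Compose label. It assumes the com.datadoghq.ad.logs label name and the apacheignite/ignite image. log_processing_rules is written as a list, mirroring the host YAML earlier on this page, and the backslashes are doubled on the assumption that the label value is parsed as JSON.

```yaml
services:
  ignite:
    # Assumed image name for the official Ignite Docker image.
    image: apacheignite/ignite
    labels:
      # The label value is a JSON array holding one log configuration.
      # Backslashes are doubled so the JSON decodes to \d{4}\-\d{2}\-\d{2}.
      com.datadoghq.ad.logs: >-
        [{"source": "ignite", "service": "<SERVICE_NAME>",
        "log_processing_rules": [{"type": "multi_line",
        "name": "new_log_start_with_date",
        "pattern": "\\d{4}\\-\\d{2}\\-\\d{2}"}]}]
```

Replace <SERVICE_NAME> with the service name used elsewhere in your environment.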
Run the Agent’s status subcommand and look for ignite under the Checks section.
ignite.active_baseline_nodes (gauge) | Active baseline nodes count. Shown as node |
ignite.allocation_rate (gauge) | Allocation rate (pages per second) averaged across rateTimeInterval. Shown as page |
ignite.average_cpu_load (gauge) | Average of CPU load values over all metrics kept in the history. |
ignite.busy_time_percentage (gauge) | Percentage of time this node is busy executing jobs vs. idling. Shown as percent |
ignite.cache.average_commit_time (gauge) | Average time to commit transaction. Shown as microsecond |
ignite.cache.average_get_time (gauge) | Average time to execute get. Shown as microsecond |
ignite.cache.average_put_time (gauge) | Average time to execute put. Shown as microsecond |
ignite.cache.average_remove_time (gauge) | Average time to execute remove. Shown as microsecond |
ignite.cache.average_rollback_time (gauge) | Average time to rollback transaction. Shown as microsecond |
ignite.cache.backups (gauge) | Count of backups configured for cache group. |
ignite.cache.cluster_moving_partitions (gauge) | Count of partitions for this cache group in the entire cluster with state MOVING. |
ignite.cache.cluster_owning_partitions (gauge) | Count of partitions for this cache group in the entire cluster with state OWNING. |
ignite.cache.commit_queue_size (gauge) | Transaction committed queue size. Shown as transaction |
ignite.cache.commits (rate) | Number of transaction commits. |
ignite.cache.committed_versions_size (gauge) | Transaction committed ID map size. Shown as transaction |
ignite.cache.dht_commit_queue_size (gauge) | Transaction DHT committed queue size. Shown as transaction |
ignite.cache.dht_committed_versions_size (gauge) | Transaction DHT committed ID map size. Shown as transaction |
ignite.cache.dht_prepare_queue_size (gauge) | Transaction DHT prepared queue size. Shown as transaction |
ignite.cache.dht_rolledback_versions_size (gauge) | Transaction DHT rolled back ID map size. Shown as transaction |
ignite.cache.dht_start_version_counts_size (gauge) | Transaction DHT start version counts map size. Shown as transaction |
ignite.cache.dht_thread_map_size (gauge) | Transaction DHT per-thread map size. Shown as transaction |
ignite.cache.dht_xid_map_size (gauge) | Transaction DHT per-Xid map size. Shown as transaction |
ignite.cache.entry_processor.average_invocation_time (gauge) | The mean time to execute cache invokes. Shown as microsecond |
ignite.cache.entry_processor.hit_percentage (gauge) | The percentage of invocations on keys that exist in the cache. Shown as percent |
ignite.cache.entry_processor.hits (rate) | The total number of invocations on keys that exist in the cache. |
ignite.cache.entry_processor.invocations (rate) | The total number of cache invocations. |
ignite.cache.entry_processor.maximum_invocation_time (gauge) | The maximum time to execute cache invokes so far. Shown as microsecond |
ignite.cache.entry_processor.minimum_invocation_time (gauge) | The minimum time to execute cache invokes so far. Shown as microsecond |
ignite.cache.entry_processor.miss_percentage (gauge) | The percentage of invocations on keys that don't exist in the cache. Shown as percent |
ignite.cache.entry_processor.misses (rate) | The total number of invocations on keys that don't exist in the cache. |
ignite.cache.entry_processor.puts (rate) | The total number of cache invocations that caused an update. |
ignite.cache.entry_processor.read_only_invocations (rate) | The total number of cache invocations that caused no updates. |
ignite.cache.entry_processor.removals (rate) | The total number of cache invocations that caused removals. |
ignite.cache.estimated_rebalancing_keys (gauge) | Estimated number of keys to rebalance. Shown as key |
ignite.cache.evict_queue_size (gauge) | Current size of evict queue. |
ignite.cache.evictions (rate) | Number of eviction entries. Shown as eviction |
ignite.cache.gets (rate) | The total number of gets to the cache. Shown as request |
ignite.cache.heap_entries (gauge) | Number of entries in heap memory. Shown as entry |
ignite.cache.hit_percentage (gauge) | Percentage of successful hits. Shown as percent |
ignite.cache.hits (rate) | The number of get requests that were satisfied by the cache. Shown as request |
ignite.cache.keys_to_rebalance (gauge) | Estimated number of keys to be rebalanced on current node. Shown as key |
ignite.cache.local_moving_partitions (gauge) | Count of partitions with state MOVING for this cache group located on this node. |
ignite.cache.local_owning_partitions (gauge) | Count of partitions with state OWNING for this cache group located on this node. |
ignite.cache.local_renting_entries (gauge) | Count of entries remaining to evict in RENTING partitions located on this node for this cache group. |
ignite.cache.local_renting_partitions (gauge) | Count of partitions with state RENTING for this cache group located on this node. |
ignite.cache.maximum_partition_copies (gauge) | Maximum number of partition copies for all partitions of this cache group. |
ignite.cache.minimum_partition_copies (gauge) | Minimum number of partition copies for all partitions of this cache group. |
ignite.cache.miss_percentage (gauge) | Percentage of accesses that failed to find anything. Shown as percent |
ignite.cache.misses (rate) | A miss is a get request that is not satisfied. Shown as request |
ignite.cache.offheap_allocated_size (gauge) | Memory size allocated in off-heap. Shown as byte |
ignite.cache.offheap_backup_entries (gauge) | Number of backup entries stored in off-heap memory. |
ignite.cache.offheap_entries (gauge) | Number of entries stored in off-heap memory. Shown as entry |
ignite.cache.offheap_evictions (rate) | Number of evictions from off-heap memory. Shown as eviction |
ignite.cache.offheap_gets (rate) | Number of gets from off-heap memory. |
ignite.cache.offheap_hit_percentage (gauge) | Percentage of hits on off-heap memory. Shown as percent |
ignite.cache.offheap_hits (rate) | Number of hits on off-heap memory. Shown as hit |
ignite.cache.offheap_miss_percentage (gauge) | Percentage of misses on off-heap memory. Shown as percent |
ignite.cache.offheap_misses (rate) | Number of misses on off-heap memory. Shown as miss |
ignite.cache.offheap_primary_entries (gauge) | Number of primary entries stored in off-heap memory. Shown as entry |
ignite.cache.offheap_puts (rate) | Number of puts to off-heap memory. |
ignite.cache.offheap_removals (rate) | Number of removed entries from off-heap memory. |
ignite.cache.partitions (gauge) | Count of partitions for cache group. |
ignite.cache.prepare_queue_size (gauge) | Transaction prepared queue size. Shown as transaction |
ignite.cache.puts (rate) | The total number of puts to the cache. Shown as request |
ignite.cache.rebalance_clearing_partitions (gauge) | Number of partitions that need to be cleared before the actual rebalance starts. |
ignite.cache.rebalanced_keys (gauge) | Number of already rebalanced keys. Shown as key |
ignite.cache.rebalancing_bytes_rate (gauge) | Estimated rebalancing speed in bytes. Shown as byte |
ignite.cache.rebalancing_keys_rate (gauge) | Estimated rebalancing speed in keys. Shown as operation |
ignite.cache.rebalancing_partitions (gauge) | Number of currently rebalancing partitions on current node. |
ignite.cache.removals (rate) | The total number of removals from the cache. |
ignite.cache.rollbacks (rate) | Number of transaction rollbacks. |
ignite.cache.rolledback_versions_size (gauge) | Transaction rolled back ID map size. Shown as transaction |
ignite.cache.size (gauge) | Number of non-null values in the cache as a long value. |
ignite.cache.start_version_counts_size (gauge) | Transaction start version counts map size. Shown as transaction |
ignite.cache.thread_map_size (gauge) | Transaction per-thread map size. Shown as transaction |
ignite.cache.total_partitions (gauge) | Total number of partitions on current node. |
ignite.cache.write_behind_buffer_size (gauge) | Count of cache entries that are waiting to be flushed. |
ignite.cache.write_behind_overflow (gauge) | Count of write buffer overflow events in progress at the moment. Shown as event |
ignite.cache.write_behind_overflow_total (rate) | Count of cache overflow events since write-behind cache has started. Shown as event |
ignite.cache.write_behind_retries (gauge) | Count of cache entries that are currently in retry state. |
ignite.cache.write_behind_store_batch_size (gauge) | Maximum size of batch for similar operations. |
ignite.cache.xid_map_size (gauge) | Transaction per-Xid map size. Shown as transaction |
ignite.check_point_buffer_size (gauge) | Total size in bytes for checkpoint buffer. Shown as byte |
ignite.checkpoint.last_copied_on_write_pages (gauge) | Number of pages copied to a temporary checkpoint buffer during the last checkpoint. Shown as page |
ignite.checkpoint.last_data_pages (gauge) | Total number of data pages written during the last checkpoint. Shown as page |
ignite.checkpoint.last_duration (gauge) | Duration of the last checkpoint in milliseconds. Shown as second |
ignite.checkpoint.last_fsync_duration (gauge) | Duration of the sync phase of the last checkpoint in milliseconds. Shown as millisecond |
ignite.checkpoint.last_lock_wait_duration (gauge) | Duration of the checkpoint lock wait in milliseconds. Shown as millisecond |
ignite.checkpoint.last_mark_duration (gauge) | Duration of the checkpoint mark in milliseconds. Shown as millisecond |
ignite.checkpoint.last_pages_write_duration (gauge) | Duration of the checkpoint pages write in milliseconds. Shown as millisecond |
ignite.checkpoint.last_total_pages (gauge) | Total number of pages written during the last checkpoint. Shown as page |
ignite.checkpoint.total_time (gauge) | Total checkpoint time from last restart. Shown as second |
ignite.current_cpu_load (gauge) | The system load average; or a negative value if not available. Shown as byte |
ignite.current_daemon_thread_count (gauge) | Current number of live daemon threads. Shown as thread |
ignite.current_gc_load (gauge) | Average time spent in GC since the last update. Shown as time |
ignite.current_idle_time (gauge) | Time this node has spent idling since executing its last job. Shown as second |
ignite.current_thread_count (gauge) | Current number of live threads. Shown as thread |
ignite.dirty_pages (gauge) | Number of pages in memory not yet synchronized with persistent storage. Shown as page |
ignite.discovery.average_message_processing_time (gauge) | Average message processing time. Shown as second |
ignite.discovery.max_message_processing_time (gauge) | Maximum message processing time. Shown as second |
ignite.discovery.message_worker_queue_size (gauge) | Message worker queue current size. |
ignite.discovery.nodes_failed (rate) | Nodes failed count. Shown as node |
ignite.discovery.nodes_joined (rate) | Nodes joined count. Shown as node |
ignite.discovery.nodes_left (rate) | Nodes left count. Shown as node |
ignite.discovery.pending_messages_discarded (gauge) | Pending messages discarded. Shown as message |
ignite.discovery.pending_messages_registered (gauge) | Pending messages registered. Shown as message |
ignite.discovery.total_processed_messages (rate) | Total processed messages count. Shown as message |
ignite.discovery.total_received_messages (rate) | Total received messages count. Shown as message |
ignite.eviction_rate (gauge) | Eviction rate (pages per second). Shown as page |
ignite.heap_memory_committed (gauge) | The amount of committed memory in bytes. Shown as byte |
ignite.heap_memory_initialized (gauge) | The initial size of memory in bytes; -1 if undefined. Shown as byte |
ignite.heap_memory_maximum (gauge) | The maximum amount of memory in bytes; -1 if undefined. Shown as byte |
ignite.heap_memory_total (gauge) | The total amount of memory in bytes; -1 if undefined. Shown as byte |
ignite.heap_memory_used (gauge) | Current heap size that is used for object allocation. Shown as byte |
ignite.idle_time_percentage (gauge) | Percentage of time this node is idling vs. executing jobs. Shown as percent |
ignite.initial_memory_size (gauge) | Initial memory region size defined by its data region. Shown as byte |
ignite.jobs.active.average (gauge) | Average number of active jobs concurrently executing on the node. Shown as job |
ignite.jobs.active.current (gauge) | Number of currently active jobs concurrently executing on the node. Shown as job |
ignite.jobs.active.maximum (gauge) | Maximum number of jobs that ever ran concurrently on this node. Shown as job |
ignite.jobs.cancelled.average (gauge) | Average number of cancelled jobs this node ever had running concurrently. Shown as job |
ignite.jobs.cancelled.current (gauge) | Number of cancelled jobs that are still running. Shown as job |
ignite.jobs.cancelled.maximum (gauge) | Maximum number of cancelled jobs this node ever had running concurrently. Shown as job |
ignite.jobs.cancelled.total (rate) | Total number of cancelled jobs since node startup. Shown as job |
ignite.jobs.execute_time.average (gauge) | Average time a job takes to execute on the node. Shown as second |
ignite.jobs.execute_time.current (gauge) | Longest time a current job has been executing for. Shown as second |
ignite.jobs.execute_time.maximum (gauge) | Time it took to execute the longest job on the node. Shown as second |
ignite.jobs.executed.total (rate) | Total number of jobs handled by the node. Shown as job |
ignite.jobs.execution_time.total (rate) | Total time all finished jobs took to execute on the node. Shown as second |
ignite.jobs.maximum_failover (gauge) | Maximum number of attempts to execute a failed job on another node. Shown as attempt |
ignite.jobs.rejected.average (gauge) | Average number of jobs this node rejects during collision resolution operations. Shown as job |
ignite.jobs.rejected.current (gauge) | Number of jobs rejected after the most recent collision resolution operation. Shown as job |
ignite.jobs.rejected.maximum (gauge) | Maximum number of jobs rejected at once during a single collision resolution operation. Shown as job |
ignite.jobs.rejected.total (rate) | Total number of jobs this node rejects during collision resolution operations since node startup. Shown as job |
ignite.jobs.total_failover (rate) | Total number of jobs that were failed over. Shown as job |
ignite.jobs.wait_time.average (gauge) | Average time jobs spend waiting in the queue to be executed. Shown as second |
ignite.jobs.wait_time.current (gauge) | Current wait time of oldest job. Shown as second |
ignite.jobs.wait_time.maximum (gauge) | Maximum time a job ever spent waiting in a queue to be executed. Shown as second |
ignite.jobs.waiting.average (gauge) | Average number of waiting jobs this node had queued. Shown as job |
ignite.jobs.waiting.current (gauge) | Number of queued jobs currently waiting to be executed. Shown as job |
ignite.jobs.waiting.maximum (gauge) | Maximum number of waiting jobs this node had. Shown as job |
ignite.large_entries_pages_percentage (gauge) | Percentage of pages that are fully occupied by large entries that go beyond page size. Shown as percent |
ignite.max_memory_size (gauge) | Maximum memory region size defined by its data region. Shown as byte |
ignite.maximum_thread_count (gauge) | The peak live thread count. Shown as thread |
ignite.non_heap_memory_committed (gauge) | Amount of non-heap memory in bytes that is committed for the JVM to use. Shown as byte |
ignite.non_heap_memory_initialized (gauge) | The initial size of non-heap memory in bytes; -1 if undefined. Shown as byte |
ignite.non_heap_memory_maximum (gauge) | Maximum amount of non-heap memory in bytes that can be used for memory management. -1 if undefined. Shown as byte |
ignite.non_heap_memory_total (gauge) | Total amount of non-heap memory in bytes that can be used for memory management. -1 if undefined. Shown as byte |
ignite.non_heap_memory_used (gauge) | Current non-heap memory size that is used by Java VM. Shown as byte |
ignite.offheap_size (gauge) | Offheap size in bytes. Shown as byte |
ignite.offheap_used_size (gauge) | Total used offheap size in bytes. Shown as byte |
ignite.oubound_messages_queue_size (gauge) | Outbound messages queue size. Shown as message |
ignite.pages_fill_factor (gauge) | The percentage of the used space. Shown as percent |
ignite.pages_read (rate) | Number of pages read from last restart. Shown as page |
ignite.pages_replace_age (gauge) | Average age at which pages in memory are replaced with pages from persistent storage (milliseconds). Shown as page |
ignite.pages_replace_rate (gauge) | Rate at which pages in memory are replaced with pages from persistent storage (pages per second). Shown as page |
ignite.pages_replaced (rate) | Number of pages replaced from last restart. Shown as page |
ignite.pages_written (rate) | Number of pages written from last restart. Shown as page |
ignite.physical_memory_pages (gauge) | Number of pages residing in physical RAM. Shown as page |
ignite.received_bytes (rate) | Received bytes count. Shown as byte |
ignite.received_messages (rate) | Received messages count. Shown as message |
ignite.sent_bytes (rate) | Sent bytes count. Shown as byte |
ignite.sent_messages (rate) | Sent messages count. Shown as message |
ignite.threads.active (gauge) | Approximate number of threads that are actively executing tasks. Shown as thread |
ignite.threads.completed_tasks (rate) | Approximate total number of tasks that have completed execution. Shown as task |
ignite.threads.core_pool_size (gauge) | The core number of threads. Shown as thread |
ignite.threads.largest_size (gauge) | Largest number of threads that have ever simultaneously been in the pool. Shown as thread |
ignite.threads.maximum_pool_size (gauge) | The maximum allowed number of threads. Shown as thread |
ignite.threads.pool_size (gauge) | Current number of threads in the pool. Shown as thread |
ignite.threads.queue_size (gauge) | Current size of the execution queue. Shown as thread |
ignite.threads.tasks (rate) | Approximate total number of tasks that have been scheduled for execution. Shown as task |
ignite.total_allocated_pages (gauge) | Total number of allocated pages. Shown as page |
ignite.total_allocated_size (gauge) | Total size of memory allocated in bytes. Shown as byte |
ignite.total_baseline_nodes (gauge) | Total baseline nodes count. Shown as node |
ignite.total_busy_time (gauge) | Total time this node spent executing jobs. Shown as second |
ignite.total_client_nodes (gauge) | Client nodes count. Shown as node |
ignite.total_cpus (gauge) | The number of CPUs available to the Java Virtual Machine. Shown as core |
ignite.total_executed_tasks (rate) | Total number of tasks handled by the node. Shown as task |
ignite.total_idle_time (gauge) | Total time this node spent idling (not executing any jobs). Shown as second |
ignite.total_nodes (gauge) | Total number of nodes. Shown as node |
ignite.total_server_nodes (gauge) | Server nodes count. Shown as node |
ignite.total_started_threads (rate) | The total number of threads started. Shown as thread |
ignite.transaction.committed (rate) | The number of transactions which were committed. Shown as transaction |
ignite.transaction.holding_lock (gauge) | The number of active transactions holding at least one key lock. Shown as transaction |
ignite.transaction.locked_keys (gauge) | The number of keys locked on the node. Shown as key |
ignite.transaction.owner (gauge) | The number of active transactions for which this node is the initiator. Shown as transaction |
ignite.transaction.rolledback (rate) | The number of transactions which were rolled back. Shown as transaction |
ignite.used_checkpoint_buffer_pages (gauge) | Used checkpoint buffer size in pages. Shown as page |
ignite.used_checkpoint_buffer_size (gauge) | Used checkpoint buffer size in bytes. Shown as byte |
ignite.wal.archive_segments (gauge) | Current number of WAL segments in the WAL archive. Shown as segment |
ignite.wal.buffer_poll_spin (gauge) | WAL buffer poll spins number over the last time interval. |
ignite.wal.fsync_average (gauge) | Average WAL fsync duration in microseconds over the last time interval. Shown as microsecond |
ignite.wal.last_rollover (gauge) | Time of the last WAL segment rollover. Shown as second |
ignite.wal.logging_rate (gauge) | Average number of WAL records per second written during the last time interval. Shown as record |
ignite.wal.total_size (gauge) | Total size in bytes for storage wal files. Shown as byte |
ignite.wal.writing_rate (gauge) | Average number of bytes per second written during the last time interval. Shown as byte |
The Ignite integration does not include any events.
ignite.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Ignite instance, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning
Need help? Contact Datadog support.