This check monitors Ignite.
The Ignite check is included in the Datadog Agent package. No additional installation is needed on your server.
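The JMX metrics exporter is enabled by default, but you may need to choose which port to expose or enable authentication, depending on your network security. The official Docker image uses 49112 by default.
For logging, enabling log4j is recommended so that log entries carry a full date.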
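To configure this check for an Agent running on a host:
Edit the ignite.d/conf.yaml file, in the conf.d/ folder at the root of your Agent's configuration directory, to start collecting your Ignite performance data. See the sample ignite.d/conf.yaml for all available configuration options.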
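This check has a limit of 350 metrics per instance. The number of returned metrics is shown in the status output. You can specify the metrics you are interested in by editing the configuration. To learn how to customize the metrics to collect, see the JMX Checks documentation. If you need to monitor more metrics, contact Datadog support.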
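As a rough illustration of the host setup described above, a minimal ignite.d/conf.yaml might look like the sketch below. The host, the port (taken from the official Docker image default mentioned earlier), the `org.apache` domain, and the commented-out credentials are assumptions for a local node. The `<BEAN_NAME>`, `<ATTRIBUTE_NAME>`, and `<METRIC_NAME>` placeholders are hypothetical and only show the general shape of a JMX check filter. Treat the sample ignite.d/conf.yaml shipped with the Agent as the authoritative list of options.

```yaml
init_config:
  # Marks this as a JMX-based check.
  is_jmx: true
  # Keep the integration's default metric set.
  collect_default_metrics: true
  # Optional extra filter; every placeholder below is hypothetical.
  conf:
    - include:
        domain: org.apache          # assumed JMX domain for Ignite beans
        bean: <BEAN_NAME>
        attribute:
          <ATTRIBUTE_NAME>:
            alias: ignite.custom.<METRIC_NAME>
            metric_type: gauge

instances:
  - host: localhost                 # assumed: Ignite runs on the same host
    port: 49112                     # default JMX port of the official Docker image
    # Uncomment if JMX authentication is enabled (values are placeholders).
    # user: <JMX_USER>
    # password: <JMX_PASSWORD>
```

Each extra attribute counts toward the 350-metric limit per instance, so a narrow `include` filter is usually preferable to a broad one.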
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:
    logs_enabled: true
To start collecting your Ignite logs, add this configuration block to your ignite.d/conf.yaml file:
    logs:
      - type: file
        path: <IGNITE_HOME>/work/log/ignite-*.log
        source: ignite
        service: '<SERVICE_NAME>'
        log_processing_rules:
          - type: multi_line
            name: new_log_start_with_date
            pattern: \[\d{4}\-\d{2}\-\d{2}
Change the path and service parameter values to match your environment. See the sample ignite.d/conf.yaml for all available configuration options.
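For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
To collect metrics with the Datadog-Ignite integration, see the Autodiscovery with JMX guide.
Available for Agent versions >6.0
Collecting logs is disabled by default in the Datadog Agent. To enable it, see Docker Log Collection.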
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "ignite", "service": "<SERVICE_NAME>", "log_processing_rules":{"type":"multi_line","name":"new_log_start_with_date", "pattern":"\d{4}\-\d{2}\-\d{2}"}} |
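For a Docker setup, a log configuration like the one in the table is typically attached to the Ignite container as an Autodiscovery label. The Compose file below is only a sketch under stated assumptions, not a configuration taken from this integration: the service names, image tags, and mounted paths are illustrative, and the environment variables follow the Agent's Docker log collection setup as commonly documented. It wraps the log configuration in a JSON array, writes log_processing_rules as a list, and doubles the backslashes so the pattern survives JSON decoding.

```yaml
services:
  ignite:
    image: apacheignite/ignite      # assumed image name
    labels:
      # JSON array of log configs; backslashes are doubled for JSON.
      com.datadoghq.ad.logs: '[{"source": "ignite", "service": "<SERVICE_NAME>", "log_processing_rules": [{"type": "multi_line", "name": "new_log_start_with_date", "pattern": "\\d{4}\\-\\d{2}\\-\\d{2}"}]}]'

  datadog-agent:
    image: gcr.io/datadoghq/agent:7 # assumed Agent image tag
    environment:
      - DD_API_KEY=<DATADOG_API_KEY>
      - DD_LOGS_ENABLED=true        # log collection is off by default
    volumes:
      # Lets the Agent discover containers and read their log files.
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
```

Replace `<SERVICE_NAME>` and `<DATADOG_API_KEY>` with your own values before starting the stack.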
Run the Agent's status subcommand and look for ignite under the Checks section.
ignite.active_baseline_nodes (gauge) | Active baseline nodes count. Shown as node |
ignite.allocation_rate (gauge) | Allocation rate (pages per second) averaged across rateTimeInterval. Shown as page |
ignite.average_cpu_load (gauge) | Average of CPU load values over all metrics kept in the history. |
ignite.busy_time_percentage (gauge) | Percentage of time this node is busy executing jobs vs. idling. Shown as percent |
ignite.cache.average_commit_time (gauge) | Average time to commit transaction. Shown as microsecond |
ignite.cache.average_get_time (gauge) | Average time to execute get. Shown as microsecond |
ignite.cache.average_put_time (gauge) | Average time to execute put. Shown as microsecond |
ignite.cache.average_remove_time (gauge) | Average time to execute remove. Shown as microsecond |
ignite.cache.average_rollback_time (gauge) | Average time to rollback transaction. Shown as microsecond |
ignite.cache.backups (gauge) | Count of backups configured for cache group. |
ignite.cache.cluster_moving_partitions (gauge) | Count of partitions for this cache group in the entire cluster with state MOVING. |
ignite.cache.cluster_owning_partitions (gauge) | Count of partitions for this cache group in the entire cluster with state OWNING. |
ignite.cache.commit_queue_size (gauge) | Transaction committed queue size. Shown as transaction |
ignite.cache.commits (rate) | Number of transaction commits. |
ignite.cache.committed_versions_size (gauge) | Transaction committed ID map size. Shown as transaction |
ignite.cache.dht_commit_queue_size (gauge) | Transaction DHT committed queue size. Shown as transaction |
ignite.cache.dht_committed_versions_size (gauge) | Transaction DHT committed ID map size. Shown as transaction |
ignite.cache.dht_prepare_queue_size (gauge) | Transaction DHT prepared queue size. Shown as transaction |
ignite.cache.dht_rolledback_versions_size (gauge) | Transaction DHT rolled back ID map size. Shown as transaction |
ignite.cache.dht_start_version_counts_size (gauge) | Transaction DHT start version counts map size. Shown as transaction |
ignite.cache.dht_thread_map_size (gauge) | Transaction DHT per-thread map size. Shown as transaction |
ignite.cache.dht_xid_map_size (gauge) | Transaction DHT per-Xid map size. Shown as transaction |
ignite.cache.entry_processor.average_invocation_time (gauge) | The mean time to execute cache invokes. Shown as microsecond |
ignite.cache.entry_processor.hit_percentage (gauge) | The percentage of invocations on keys, which exist in cache. Shown as percent |
ignite.cache.entry_processor.hits (rate) | The total number of invocations on keys, which exist in cache. |
ignite.cache.entry_processor.invocations (rate) | The total number of cache invocations. |
ignite.cache.entry_processor.maximum_invocation_time (gauge) | So far, the maximum time to execute cache invokes. Shown as microsecond |
ignite.cache.entry_processor.minimum_invocation_time (gauge) | So far, the minimum time to execute cache invokes. Shown as microsecond |
ignite.cache.entry_processor.miss_percentage (gauge) | The percentage of invocations on keys, which don't exist in cache. Shown as percent |
ignite.cache.entry_processor.misses (rate) | The total number of invocations on keys, which don't exist in cache. |
ignite.cache.entry_processor.puts (rate) | The total number of cache invocations that caused an update. |
ignite.cache.entry_processor.read_only_invocations (rate) | The total number of cache invocations that caused no updates. |
ignite.cache.entry_processor.removals (rate) | The total number of cache invocations that caused removals. |
ignite.cache.estimated_rebalancing_keys (gauge) | Estimated number of keys to rebalance. Shown as key |
ignite.cache.evict_queue_size (gauge) | Current size of evict queue. |
ignite.cache.evictions (rate) | Number of eviction entries. Shown as eviction |
ignite.cache.gets (rate) | The total number of gets to the cache. Shown as request |
ignite.cache.heap_entries (gauge) | Number of entries in heap memory. Shown as entry |
ignite.cache.hit_percentage (gauge) | Percentage of successful hits. Shown as percent |
ignite.cache.hits (rate) | The number of get requests that were satisfied by the cache. Shown as request |
ignite.cache.keys_to_rebalance (gauge) | Estimated number of keys to be rebalanced on current node. Shown as key |
ignite.cache.local_moving_partitions (gauge) | Count of partitions with state MOVING for this cache group located on this node. |
ignite.cache.local_owning_partitions (gauge) | Count of partitions with state OWNING for this cache group located on this node. |
ignite.cache.local_renting_entries (gauge) | Count of entries remaining to evict in RENTING partitions located on this node for this cache group. |
ignite.cache.local_renting_partitions (gauge) | Count of partitions with state RENTING for this cache group located on this node. |
ignite.cache.maximum_partition_copies (gauge) | Maximum number of partition copies for all partitions of this cache group. |
ignite.cache.minimum_partition_copies (gauge) | Minimum number of partition copies for all partitions of this cache group. |
ignite.cache.miss_percentage (gauge) | Percentage of accesses that failed to find anything. Shown as percent |
ignite.cache.misses (rate) | A miss is a get request that is not satisfied. Shown as request |
ignite.cache.offheap_allocated_size (gauge) | Memory size allocated in off-heap. Shown as byte |
ignite.cache.offheap_backup_entries (gauge) | Number of backup entries stored in off-heap memory. |
ignite.cache.offheap_entries (gauge) | Number of entries stored in off-heap memory. Shown as entry |
ignite.cache.offheap_evictions (rate) | Number of evictions from off-heap memory. Shown as eviction |
ignite.cache.offheap_gets (rate) | Number of gets from off-heap memory. |
ignite.cache.offheap_hit_percentage (gauge) | Percentage of hits on off-heap memory. Shown as percent |
ignite.cache.offheap_hits (rate) | Number of hits on off-heap memory. Shown as hit |
ignite.cache.offheap_miss_percentage (gauge) | Percentage of misses on off-heap memory. Shown as percent |
ignite.cache.offheap_misses (rate) | Number of misses on off-heap memory. Shown as miss |
ignite.cache.offheap_primary_entries (gauge) | Number of primary entries stored in off-heap memory. Shown as entry |
ignite.cache.offheap_puts (rate) | Number of puts to off-heap memory. |
ignite.cache.offheap_removals (rate) | Number of removed entries from off-heap memory. |
ignite.cache.partitions (gauge) | Count of partitions for cache group. |
ignite.cache.prepare_queue_size (gauge) | Transaction prepared queue size. Shown as transaction |
ignite.cache.puts (rate) | The total number of puts to the cache. Shown as request |
ignite.cache.rebalance_clearing_partitions (gauge) | Number of partitions that need to be cleared before the actual rebalance starts. |
ignite.cache.rebalanced_keys (gauge) | Number of already rebalanced keys. Shown as key |
ignite.cache.rebalancing_bytes_rate (gauge) | Estimated rebalancing speed in bytes. Shown as byte |
ignite.cache.rebalancing_keys_rate (gauge) | Estimated rebalancing speed in keys. Shown as operation |
ignite.cache.rebalancing_partitions (gauge) | Number of currently rebalancing partitions on current node. |
ignite.cache.removals (rate) | The total number of removals from the cache. |
ignite.cache.rollbacks (rate) | Number of transaction rollbacks. |
ignite.cache.rolledback_versions_size (gauge) | Transaction rolled back ID map size. Shown as transaction |
ignite.cache.size (gauge) | Number of non-null values in the cache as a long value. |
ignite.cache.start_version_counts_size (gauge) | Transaction start version counts map size. Shown as transaction |
ignite.cache.thread_map_size (gauge) | Transaction per-thread map size. Shown as transaction |
ignite.cache.total_partitions (gauge) | Total number of partitions on current node. |
ignite.cache.write_behind_buffer_size (gauge) | Count of cache entries that are waiting to be flushed. |
ignite.cache.write_behind_overflow (gauge) | Count of write buffer overflow events in progress at the moment. Shown as event |
ignite.cache.write_behind_overflow_total (rate) | Count of cache overflow events since write-behind cache has started. Shown as event |
ignite.cache.write_behind_retries (gauge) | Count of cache entries that are currently in retry state. |
ignite.cache.write_behind_store_batch_size (gauge) | Maximum size of batch for similar operations. |
ignite.cache.xid_map_size (gauge) | Transaction per-Xid map size. Shown as transaction |
ignite.check_point_buffer_size (gauge) | Total size in bytes for checkpoint buffer. Shown as byte |
ignite.checkpoint.last_copied_on_write_pages (gauge) | Number of pages copied to a temporary checkpoint buffer during the last checkpoint. Shown as page |
ignite.checkpoint.last_data_pages (gauge) | Total number of data pages written during the last checkpoint. Shown as page |
ignite.checkpoint.last_duration (gauge) | Duration of the last checkpoint in milliseconds. Shown as second |
ignite.checkpoint.last_fsync_duration (gauge) | Duration of the sync phase of the last checkpoint in milliseconds. Shown as millisecond |
ignite.checkpoint.last_lock_wait_duration (gauge) | Duration of the checkpoint lock wait in milliseconds. Shown as millisecond |
ignite.checkpoint.last_mark_duration (gauge) | Duration of the checkpoint mark in milliseconds. Shown as millisecond |
ignite.checkpoint.last_pages_write_duration (gauge) | Duration of the checkpoint pages write in milliseconds. Shown as millisecond |
ignite.checkpoint.last_total_pages (gauge) | Total number of pages written during the last checkpoint. Shown as page |
ignite.checkpoint.total_time (gauge) | Total checkpoint time from last restart. Shown as second |
ignite.current_cpu_load (gauge) | The system load average; or a negative value if not available. Shown as byte |
ignite.current_daemon_thread_count (gauge) | Current number of live daemon threads. Shown as thread |
ignite.current_gc_load (gauge) | Average time spent in GC since the last update. Shown as time |
ignite.current_idle_time (gauge) | Time this node spent idling since executing last job. Shown as second |
ignite.current_thread_count (gauge) | Current number of live threads. Shown as thread |
ignite.dirty_pages (gauge) | Number of pages in memory not yet synchronized with persistent storage. Shown as page |
ignite.discovery.average_message_processing_time (gauge) | Avg message processing time. Shown as second |
ignite.discovery.max_message_processing_time (gauge) | Max message processing time. Shown as second |
ignite.discovery.message_worker_queue_size (gauge) | Message worker queue current size. |
ignite.discovery.nodes_failed (rate) | Nodes failed count. Shown as node |
ignite.discovery.nodes_joined (rate) | Nodes joined count. Shown as node |
ignite.discovery.nodes_left (rate) | Nodes left count. Shown as node |
ignite.discovery.pending_messages_discarded (gauge) | Pending messages discarded. Shown as message |
ignite.discovery.pending_messages_registered (gauge) | Pending messages registered. Shown as message |
ignite.discovery.total_processed_messages (rate) | Total processed messages count. Shown as message |
ignite.discovery.total_received_messages (rate) | Total received messages count. Shown as message |
ignite.eviction_rate (gauge) | Eviction rate (pages per second). Shown as page |
ignite.heap_memory_committed (gauge) | The amount of committed memory in bytes. Shown as byte |
ignite.heap_memory_initialized (gauge) | The initial size of memory in bytes; -1 if undefined. Shown as byte |
ignite.heap_memory_maximum (gauge) | The maximum amount of memory in bytes; -1 if undefined. Shown as byte |
ignite.heap_memory_total (gauge) | The total amount of memory in bytes; -1 if undefined. Shown as byte |
ignite.heap_memory_used (gauge) | Current heap size that is used for object allocation. Shown as byte |
ignite.idle_time_percentage (gauge) | Percentage of time this node is idling vs. executing jobs. Shown as percent |
ignite.initial_memory_size (gauge) | Initial memory region size defined by its data region. Shown as byte |
ignite.jobs.active.average (gauge) | Average number of active jobs concurrently executing on the node. Shown as job |
ignite.jobs.active.current (gauge) | Number of currently active jobs concurrently executing on the node. Shown as job |
ignite.jobs.active.maximum (gauge) | Maximum number of jobs that ever ran concurrently on this node. Shown as job |
ignite.jobs.cancelled.average (gauge) | Average number of cancelled jobs this node ever had running concurrently. Shown as job |
ignite.jobs.cancelled.current (gauge) | Number of cancelled jobs that are still running. Shown as job |
ignite.jobs.cancelled.maximum (gauge) | Maximum number of cancelled jobs this node ever had running concurrently. Shown as job |
ignite.jobs.cancelled.total (rate) | Total number of cancelled jobs since node startup. Shown as job |
ignite.jobs.execute_time.average (gauge) | Average time a job takes to execute on the node. Shown as second |
ignite.jobs.execute_time.current (gauge) | Longest time a current job has been executing for. Shown as second |
ignite.jobs.execute_time.maximum (gauge) | Time it took to execute the longest job on the node. Shown as second |
ignite.jobs.executed.total (rate) | Total number of jobs handled by the node. Shown as job |
ignite.jobs.execution_time.total (rate) | Total time all finished jobs took to execute on the node. Shown as second |
ignite.jobs.maximum_failover (gauge) | Maximum number of attempts to execute a failed job on another node. Shown as attempt |
ignite.jobs.rejected.average (gauge) | Average number of jobs this node rejects during collision resolution operations. Shown as job |
ignite.jobs.rejected.current (gauge) | Number of jobs rejected after the most recent collision resolution operation. Shown as job |
ignite.jobs.rejected.maximum (gauge) | Maximum number of jobs rejected at once during a single collision resolution operation. Shown as job |
ignite.jobs.rejected.total (rate) | Total number of jobs this node rejects during collision resolution operations since node startup. Shown as job |
ignite.jobs.total_failover (rate) | Total number of jobs that were failed over. Shown as job |
ignite.jobs.wait_time.average (gauge) | Average time jobs spend waiting in the queue to be executed. Shown as second |
ignite.jobs.wait_time.current (gauge) | Current wait time of oldest job. Shown as second |
ignite.jobs.wait_time.maximum (gauge) | Maximum time a job ever spent waiting in a queue to be executed. Shown as second |
ignite.jobs.waiting.average (gauge) | Average number of waiting jobs this node had queued. Shown as job |
ignite.jobs.waiting.current (gauge) | Number of queued jobs currently waiting to be executed. Shown as job |
ignite.jobs.waiting.maximum (gauge) | Maximum number of waiting jobs this node had. Shown as job |
ignite.large_entries_pages_percentage (gauge) | Percentage of pages that are fully occupied by large entries that go beyond page size. Shown as percent |
ignite.max_memory_size (gauge) | Maximum memory region size defined by its data region. Shown as byte |
ignite.maximum_thread_count (gauge) | The peak live thread count. Shown as thread |
ignite.non_heap_memory_committed (gauge) | Amount of non-heap memory in bytes that is committed for the JVM to use. Shown as byte |
ignite.non_heap_memory_initialized (gauge) | The initial size of non-heap memory in bytes; -1 if undefined. Shown as byte |
ignite.non_heap_memory_maximum (gauge) | Maximum amount of non-heap memory in bytes that can be used for memory management. -1 if undefined. Shown as byte |
ignite.non_heap_memory_total (gauge) | Total amount of non-heap memory in bytes that can be used for memory management. -1 if undefined. Shown as byte |
ignite.non_heap_memory_used (gauge) | Current non-heap memory size that is used by Java VM. Shown as byte |
ignite.offheap_size (gauge) | Offheap size in bytes. Shown as byte |
ignite.offheap_used_size (gauge) | Total used offheap size in bytes. Shown as byte |
ignite.oubound_messages_queue_size (gauge) | Outbound messages queue size. Shown as message |
ignite.pages_fill_factor (gauge) | The percentage of the used space. Shown as percent |
ignite.pages_read (rate) | Number of pages read from last restart. Shown as page |
ignite.pages_replace_age (gauge) | Average age at which pages in memory are replaced with pages from persistent storage (milliseconds). Shown as page |
ignite.pages_replace_rate (gauge) | Rate at which pages in memory are replaced with pages from persistent storage (pages per second). Shown as page |
ignite.pages_replaced (rate) | Number of pages replaced from last restart. Shown as page |
ignite.pages_written (rate) | Number of pages written from last restart. Shown as page |
ignite.physical_memory_pages (gauge) | Number of pages residing in physical RAM. Shown as page |
ignite.received_bytes (rate) | Received bytes count. Shown as byte |
ignite.received_messages (rate) | Received messages count. Shown as message |
ignite.sent_bytes (rate) | Sent bytes count. Shown as byte |
ignite.sent_messages (rate) | Sent messages count. Shown as message |
ignite.threads.active (gauge) | Approximate number of threads that are actively executing tasks. Shown as thread |
ignite.threads.completed_tasks (rate) | Approximate total number of tasks that have completed execution. Shown as task |
ignite.threads.core_pool_size (gauge) | The core number of threads. Shown as thread |
ignite.threads.largest_size (gauge) | Largest number of threads that have ever simultaneously been in the pool. Shown as thread |
ignite.threads.maximum_pool_size (gauge) | The maximum allowed number of threads. Shown as thread |
ignite.threads.pool_size (gauge) | Current number of threads in the pool. Shown as thread |
ignite.threads.queue_size (gauge) | Current number of threads in the pool. Shown as thread |
ignite.threads.tasks (rate) | Approximate total number of tasks that have been scheduled for execution. Shown as task |
ignite.total_allocated_pages (gauge) | Total number of allocated pages. Shown as page |
ignite.total_allocated_size (gauge) | Total size of memory allocated in bytes. Shown as byte |
ignite.total_baseline_nodes (gauge) | Total baseline nodes count. Shown as node |
ignite.total_busy_time (gauge) | Total time this node spent executing jobs. Shown as second |
ignite.total_client_nodes (gauge) | Client nodes count. Shown as node |
ignite.total_cpus (gauge) | The number of CPUs available to the Java Virtual Machine. Shown as core |
ignite.total_executed_tasks (rate) | Total number of tasks handled by the node. Shown as task |
ignite.total_idle_time (gauge) | Total time this node spent idling (not executing any jobs). Shown as second |
ignite.total_nodes (gauge) | Total number of nodes. Shown as node |
ignite.total_server_nodes (gauge) | Server nodes count. Shown as node |
ignite.total_started_threads (rate) | The total number of threads started. Shown as thread |
ignite.transaction.committed (rate) | The number of transactions which were committed. Shown as transaction |
ignite.transaction.holding_lock (gauge) | The number of active transactions holding at least one key lock. Shown as transaction |
ignite.transaction.locked_keys (gauge) | The number of keys locked on the node. Shown as key |
ignite.transaction.owner (gauge) | The number of active transactions for which this node is the initiator. Shown as transaction |
ignite.transaction.rolledback (rate) | The number of transactions which were rolled back. Shown as transaction |
ignite.used_checkpoint_buffer_pages (gauge) | Used checkpoint buffer size in pages. Shown as page |
ignite.used_checkpoint_buffer_size (gauge) | Used checkpoint buffer size in bytes. Shown as byte |
ignite.wal.archive_segments (gauge) | Current number of WAL segments in the WAL archive. Shown as segment |
ignite.wal.buffer_poll_spin (gauge) | WAL buffer poll spins number over the last time interval. |
ignite.wal.fsync_average (gauge) | Average WAL fsync duration in microseconds over the last time interval. Shown as microsecond |
ignite.wal.last_rollover (gauge) | Time of the last WAL segment rollover. Shown as second |
ignite.wal.logging_rate (gauge) | Average number of WAL records per second written during the last time interval. Shown as record |
ignite.wal.total_size (gauge) | Total size in bytes for storage wal files. Shown as byte |
ignite.wal.writing_rate (gauge) | Average number of bytes per second written during the last time interval. Shown as byte |
The Ignite integration does not include any events.
ignite.can_connect
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Ignite instance, WARNING if no metrics are collected, and OK otherwise.
Statuses: ok, critical, warning
Need help? Contact Datadog support.