Hazelcast

Agent Check

Supported OS: Linux, macOS, Windows

Overview

This check monitors Hazelcast.

Setup

Installation

The Hazelcast check is included in the Datadog Agent package. No additional installation is needed on your server.

Configuration

Host

To configure this check for an Agent running on a host:

Metric collection
  1. Edit the hazelcast.d/conf.yaml file, in the conf.d/ folder at the root of your Agent’s configuration directory, to start collecting your Hazelcast performance data. See the sample hazelcast.d/conf.yaml for all available configuration options.

     This check has a limit of 350 metrics per instance. The number of returned metrics is shown in the Agent status output. You can choose which metrics to collect by editing the configuration; see the JMX Checks documentation for detailed instructions. If you need to monitor more metrics, contact Datadog support.

  2. Restart the Agent.
Log collection
  1. Hazelcast supports many different logging adapters. Here is an example of a log4j2.properties file:

    rootLogger=file
    rootLogger.level=info
    property.filepath=/path/to/log/files
    property.filename=hazelcast
    
    appender.file.type=RollingFile
    appender.file.name=RollingFile
    appender.file.fileName=${filepath}/${filename}.log
    appender.file.filePattern=${filepath}/${filename}-%d{yyyy-MM-dd}-%i.log.gz
    appender.file.layout.type=PatternLayout
    appender.file.layout.pattern = %d{yyyy-MM-dd HH:mm:ss} [%thread] %level{length=10} %c{1}:%L - %m%n
    appender.file.policies.type=Policies
    appender.file.policies.time.type=TimeBasedTriggeringPolicy
    appender.file.policies.time.interval=1
    appender.file.policies.time.modulate=true
    appender.file.policies.size.type=SizeBasedTriggeringPolicy
    appender.file.policies.size.size=50MB
    appender.file.strategy.type=DefaultRolloverStrategy
    appender.file.strategy.max=100
    
    rootLogger.appenderRefs=file
    rootLogger.appenderRef.file.ref=RollingFile
    
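    # Hazelcast-specific loggers (Log4j 2 properties syntax); uncomment to enable debug output.
    
    #logger.hazelcast.name=com.hazelcast
    #logger.hazelcast.level=debug
    
    #logger.cluster.name=com.hazelcast.cluster
    #logger.cluster.level=debug
    #logger.partition.name=com.hazelcast.partition
    #logger.partition.level=debug
    #logger.partitionservice.name=com.hazelcast.partition.InternalPartitionService
    #logger.partitionservice.level=debug
    #logger.nio.name=com.hazelcast.nio
    #logger.nio.level=debug
    #logger.hibernate.name=com.hazelcast.hibernate
    #logger.hibernate.level=debug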
  2. By default, Datadog’s integration pipeline supports the following conversion pattern:

    %d{yyyy-MM-dd HH:mm:ss} [%thread] %level{length=10} %c{1}:%L - %m%n

    Clone and edit the integration pipeline if you have a different format.

  3. Collecting logs is disabled by default in the Datadog Agent. Enable it in your datadog.yaml file:

    logs_enabled: true
  4. Add the following configuration block to your hazelcast.d/conf.yaml file. Change the path and service parameter values based on your environment. See the sample hazelcast.d/conf.yaml for all available configuration options.

    logs:
     - type: file
       path: /var/log/hazelcast.log
       source: hazelcast
       service: <SERVICE>
       log_processing_rules:
         - type: multi_line
           name: log_start_with_date
           pattern: \d{4}-\d{2}-\d{2}
  5. Restart the Agent.
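Before restarting, it can help to confirm that a multi-line pattern matches the start of each log entry and nothing else. The sketch below only approximates how a multi_line rule groups lines; it is not the Agent's implementation. It assumes entries begin with a yyyy-MM-dd date, as in the conversion pattern above. The sample log lines and the `group_entries` helper are hypothetical.

```python
import re

# Assumption: each new log entry starts with a yyyy-MM-dd date,
# matching the %d{yyyy-MM-dd HH:mm:ss} prefix of the conversion pattern.
NEW_ENTRY = re.compile(r"\d{4}-\d{2}-\d{2}")


def group_entries(lines):
    """Group raw lines into entries; a line that does not start with
    a date is treated as a continuation of the previous entry."""
    entries = []
    for line in lines:
        if NEW_ENTRY.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] += "\n" + line
    return entries


# Hypothetical sample lines, including a two-line stack trace.
sample = [
    "2020-08-12 10:15:30 [main] INFO Node:42 - Starting member",
    "2020-08-12 10:15:31 [main] ERROR Node:57 - Join failed",
    "java.lang.IllegalStateException: example",
    "\tat com.example.Demo.run(Demo.java:10)",
]

entries = group_entries(sample)
print(len(entries))  # prints 2: the stack trace stays with the ERROR entry
```

If your date format differs (for example, dots instead of hyphens), change the pattern in both this check and the log_processing_rules entry.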

Containerized

Metric collection

For containerized environments, see the Autodiscovery with JMX guide.

Log collection

Collecting logs is disabled by default in the Datadog Agent. To enable it, see Docker log collection.

Parameter       Value
<LOG_CONFIG>    {"source": "hazelcast", "service": "<SERVICE_NAME>"}
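As a rough illustration, the <LOG_CONFIG> value can be generated rather than hand-written, which avoids JSON quoting mistakes. This sketch assumes the value is applied as a JSON list in a container label named com.datadoghq.ad.logs, as described in the Docker log collection guide; confirm the label name there before relying on it. The `logs_label` helper is hypothetical.

```python
import json


def logs_label(service, source="hazelcast"):
    """Return the log configuration as a JSON list string."""
    return json.dumps([{"source": source, "service": service}])


# Replace the placeholder with your own service name.
label_value = logs_label("<SERVICE_NAME>")

# Assumed label name; see the Docker log collection guide.
print(f"com.datadoghq.ad.logs={label_value}")
```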

Validation

Run the Agent’s status subcommand and look for hazelcast under the JMXFetch section:

========
JMXFetch
========
  Initialized checks
  ==================
    hazelcast
      instance_name : hazelcast-localhost-9999
      message :
      metric_count : 46
      service_check_count : 0
      status : OK
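To automate this check, the status output can be parsed with a short script. The sketch below assumes the status text has been captured to a string and is laid out like the sample above. The `parse_check_fields` helper is hypothetical and not part of the Agent. It also compares metric_count against the 350-metric limit mentioned earlier.

```python
import re

METRIC_LIMIT = 350  # per-instance limit stated in this document


def parse_check_fields(status_text, check_name="hazelcast"):
    """Collect the 'key : value' lines listed under the check name."""
    fields = {}
    in_check = False
    for raw in status_text.splitlines():
        line = raw.strip()
        if line == check_name:
            in_check = True
            continue
        if in_check:
            match = re.match(r"(\w+)\s*:\s*(.*)", line)
            if not match:
                break  # end of this check's block
            fields[match.group(1)] = match.group(2)
    return fields


# Sample copied from the status output shown above.
sample = """\
========
JMXFetch
========
  Initialized checks
  ==================
    hazelcast
      instance_name : hazelcast-localhost-9999
      message :
      metric_count : 46
      service_check_count : 0
      status : OK
"""

fields = parse_check_fields(sample)
healthy = (
    fields.get("status") == "OK"
    and int(fields["metric_count"]) < METRIC_LIMIT
)
print(fields["instance_name"], fields["metric_count"], healthy)
```

A metric_count at or near 350 suggests that some metrics may be dropped, in which case narrowing the collected metrics in the configuration is worth considering.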

Data Collected

Metrics

hazelcast.mc.license_expiration_time
(gauge)
The number of seconds until license expiration
Shown as second
hazelcast.instance.running
(gauge)
Running state
hazelcast.instance.version
(gauge)
The Hazelcast version
hazelcast.instance.member_count
(gauge)
Size of the cluster
hazelcast.instance.partition_service.partition_count
(gauge)
Partition count
hazelcast.instance.partition_service.active_partition_count
(gauge)
Active partition count
hazelcast.instance.partition_service.is_cluster_safe
(gauge)
Cluster Safe State
hazelcast.instance.partition_service.is_local_member_safe
(gauge)
LocalMember Safe State
hazelcast.instance.managed_executor_service.queue_size
(gauge)
Work queue size
hazelcast.instance.managed_executor_service.pool_size
(gauge)
Thread count of the pool
hazelcast.instance.managed_executor_service.maximum_pool_size
(gauge)
Maximum thread count of the pool
hazelcast.instance.managed_executor_service.remaining_queue_capacity
(gauge)
Remaining capacity of the work queue
hazelcast.instance.managed_executor_service.completed_task_count
(gauge)
Completed task count
hazelcast.instance.managed_executor_service.is_shutdown
(gauge)
hazelcast.instance.managed_executor_service.is_terminated
(gauge)
hazelcast.member.accepted_socket_count
(gauge)
hazelcast.member.active_count
(gauge)
hazelcast.member.active_members
(gauge)
hazelcast.member.active_members_commit_index
(gauge)
hazelcast.member.async_operations
(gauge)
hazelcast.member.available_processors
(gauge)
hazelcast.member.backup_timeout_millis
(gauge)
hazelcast.member.backup_timeouts
(gauge)
hazelcast.member.bytes_read
(gauge)
hazelcast.member.bytes_received
(gauge)
hazelcast.member.bytes_send
(gauge)
hazelcast.member.bytes_transceived
(gauge)
hazelcast.member.bytes_written
(gauge)
hazelcast.member.call_timeout_count
(gauge)
hazelcast.member.client_count
(gauge)
hazelcast.member.closed_count
(gauge)
hazelcast.member.cluster_start_time
(gauge)
hazelcast.member.cluster_time
(gauge)
hazelcast.member.cluster_time_diff
(gauge)
hazelcast.member.cluster_up_time
(gauge)
hazelcast.member.commit_count
(gauge)
hazelcast.member.committed_heap
(gauge)
hazelcast.member.committed_native
(gauge)
hazelcast.member.committed_virtual_memory_size
(gauge)
hazelcast.member.completed_count
(gauge)
hazelcast.member.completed_migrations
(gauge)
hazelcast.member.completed_operation_batch_count
(gauge)
hazelcast.member.completed_operation_count
(gauge)
hazelcast.member.completed_packet_count
(gauge)
hazelcast.member.completed_partition_specific_runnable_count
(gauge)
hazelcast.member.completed_runnable_count
(gauge)
hazelcast.member.completed_task_count
(gauge)
hazelcast.member.completed_tasks
(gauge)
hazelcast.member.completed_total_count
(gauge)
hazelcast.member.connection_listener_count
(gauge)
hazelcast.member.connection_type
(gauge)
hazelcast.member.count
(gauge)
hazelcast.member.created_count
(gauge)
hazelcast.member.daemon_thread_count
(gauge)
hazelcast.member.delayed_execution_count
(gauge)
hazelcast.member.destroyed_count
(gauge)
hazelcast.member.destroyed_group_ids
(gauge)
hazelcast.member.elapsed_destination_commit_time
(gauge)
hazelcast.member.elapsed_migration_operation_time
(gauge)
hazelcast.member.elapsed_migration_time
(gauge)
hazelcast.member.error_count
(gauge)
hazelcast.member.event_count
(gauge)
hazelcast.member.event_queue_size
(gauge)
hazelcast.member.events_processed
(gauge)
hazelcast.member.exception_count
(gauge)
hazelcast.member.failed_backups
(gauge)
hazelcast.member.frames_transceived
(gauge)
hazelcast.member.free_heap
(gauge)
hazelcast.member.free_memory
(gauge)
hazelcast.member.free_native
(gauge)
hazelcast.member.free_physical
(gauge)
hazelcast.member.free_physical_memory_size
(gauge)
hazelcast.member.free_space
(gauge)
hazelcast.member.free_swap_space_size
(gauge)
hazelcast.member.generic_priority_queue_size
(gauge)
hazelcast.member.generic_queue_size
(gauge)
hazelcast.member.generic_thread_count
(gauge)
hazelcast.member.groups
(gauge)
hazelcast.member.heartbeat_broadcast_period_millis
(gauge)
hazelcast.member.heartbeat_packets_received
(gauge)
hazelcast.member.heartbeat_packets_sent
(gauge)
hazelcast.member.idle_time_millis
(gauge)
hazelcast.member.idle_time_ms
(gauge)
hazelcast.member.imbalance_detected_count
(gauge)
hazelcast.member.in_progress_count
(gauge)
hazelcast.member.invocation_scan_period_millis
(gauge)
hazelcast.member.invocation_timeout_millis
(gauge)
hazelcast.member.invocations.last_call_id
(gauge)
hazelcast.member.invocations.pending
(gauge)
hazelcast.member.invocations.used_percentage
(gauge)
hazelcast.member.io_thread_id
(gauge)
hazelcast.member.last_heartbeat
(gauge)
hazelcast.member.last_repartition_time
(gauge)
hazelcast.member.listener_count
(gauge)
hazelcast.member.loaded_classes_count
(gauge)
hazelcast.member.local_clock_time
(gauge)
hazelcast.member.local_partition_count
(gauge)
hazelcast.member.major_count
(gauge)
hazelcast.member.major_time
(gauge)
hazelcast.member.max_backup_count
(gauge)
hazelcast.member.max_cluster_time_diff
(gauge)
hazelcast.member.max_file_descriptor_count
(gauge)
hazelcast.member.max_heap
(gauge)
hazelcast.member.max_memory
(gauge)
hazelcast.member.max_metadata
(gauge)
hazelcast.member.max_native
(gauge)
hazelcast.member.maximum_pool_size
(gauge)
hazelcast.member.member_groups_size
(gauge)
hazelcast.member.migration_active
(gauge)
hazelcast.member.migration_completed_count
(gauge)
hazelcast.member.migration_queue_size
(gauge)
hazelcast.member.minor_count
(gauge)
hazelcast.member.minor_time
(gauge)
hazelcast.member.missing_members
(gauge)
hazelcast.member.monitor_count
(gauge)
hazelcast.member.nodes
(gauge)
hazelcast.member.normal_frames_read
(gauge)
hazelcast.member.normal_frames_written
(gauge)
hazelcast.member.normal_pending_count
(gauge)
hazelcast.member.normal_timeouts
(gauge)
hazelcast.member.open_file_descriptor_count
(gauge)
hazelcast.member.opened_count
(gauge)
hazelcast.member.operation_timeout_count
(gauge)
hazelcast.member.owner_id
(gauge)
hazelcast.member.packets_received
(gauge)
hazelcast.member.packets_send
(gauge)
hazelcast.member.park_queue_count
(gauge)
hazelcast.member.partition_thread_count
(gauge)
hazelcast.member.peak_thread_count
(gauge)
hazelcast.member.planned_migrations
(gauge)
hazelcast.member.pool_size
(gauge)
hazelcast.member.priority_frames_read
(gauge)
hazelcast.member.priority_frames_transceived
(gauge)
hazelcast.member.priority_frames_written
(gauge)
hazelcast.member.priority_pending_count
(gauge)
hazelcast.member.priority_queue_size
(gauge)
hazelcast.member.priority_write_queue_size
(gauge)
hazelcast.member.process_count
(gauge)
hazelcast.member.process_cpu_load
(gauge)
hazelcast.member.process_cpu_time
(gauge)
hazelcast.member.proxy_count
(gauge)
hazelcast.member.publication_count
(gauge)
hazelcast.member.queue_capacity
(gauge)
hazelcast.member.queue_size
(gauge)
hazelcast.member.rejected_count
(gauge)
hazelcast.member.remaining_queue_capacity
(gauge)
hazelcast.member.replica_sync_requests_counter
(gauge)
hazelcast.member.replica_sync_semaphore
(gauge)
hazelcast.member.response_queue_size
(gauge)
hazelcast.member.responses.backup_count
(gauge)
hazelcast.member.responses.error_count
(gauge)
hazelcast.member.responses.missing_count
(gauge)
hazelcast.member.responses.normal_count
(gauge)
hazelcast.member.responses.timeout_count
(gauge)
hazelcast.member.retry_count
(gauge)
hazelcast.member.rollback_count
(gauge)
hazelcast.member.running_count
(gauge)
hazelcast.member.running_generic_count
(gauge)
hazelcast.member.running_partition_count
(gauge)
hazelcast.member.scheduled
(gauge)
hazelcast.member.selector_i_o_exception_count
(gauge)
hazelcast.member.selector_rebuild_count
(gauge)
hazelcast.member.selector_recreate_count
(gauge)
hazelcast.member.size
(gauge)
hazelcast.member.start_count
(gauge)
hazelcast.member.started_migrations
(gauge)
hazelcast.member.state_version
(gauge)
hazelcast.member.sync_delivery_failure_count
(gauge)
hazelcast.member.system_cpu_load
(gauge)
hazelcast.member.system_load_average
(gauge)
hazelcast.member.task_queue_size
(gauge)
hazelcast.member.terminated_raft_node_group_ids
(gauge)
hazelcast.member.text_count
(gauge)
hazelcast.member.thread_count
(gauge)
hazelcast.member.total_completed_migrations
(gauge)
hazelcast.member.total_elapsed_destination_commit_time
(gauge)
hazelcast.member.total_elapsed_migration_operation_time
(gauge)
hazelcast.member.total_elapsed_migration_time
(gauge)
hazelcast.member.total_failure_count
(gauge)
hazelcast.member.total_loaded_classes_count
(gauge)
hazelcast.member.total_memory
(gauge)
hazelcast.member.total_parked_operation_count
(gauge)
hazelcast.member.total_physical
(gauge)
hazelcast.member.total_physical_memory_size
(gauge)
hazelcast.member.total_registrations
(gauge)
hazelcast.member.total_space
(gauge)
hazelcast.member.total_started_thread_count
(gauge)
hazelcast.member.total_swap_space_size
(gauge)
hazelcast.member.unknown_time
(gauge)
hazelcast.member.unknown_count
(gauge)
hazelcast.member.unloaded_classes_count
(gauge)
hazelcast.member.uptime
(gauge)
hazelcast.member.usable_space
(gauge)
hazelcast.member.used_heap
(gauge)
hazelcast.member.used_memory
(gauge)
hazelcast.member.used_metadata
(gauge)
hazelcast.member.used_native
(gauge)
hazelcast.member.write_queue_size
(gauge)
hazelcast.imap.local_backup_count
(gauge)
Backup count
hazelcast.imap.local_backup_entry_count
(gauge)
Backup entry count
hazelcast.imap.local_backup_entry_memory_cost
(gauge)
Backup entry memory cost
hazelcast.imap.local_creation_time
(gauge)
Creation time
hazelcast.imap.local_dirty_entry_count
(gauge)
Dirty entry count
hazelcast.imap.local_event_operation_count
(gauge)
Event count
hazelcast.imap.local_get_operation_count
(gauge)
Get operation count
hazelcast.imap.local_heap_cost
(gauge)
Heap Cost
hazelcast.imap.local_hits
(gauge)
Hits
hazelcast.imap.local_last_access_time
(gauge)
Last access time
hazelcast.imap.local_last_update_time
(gauge)
Last update time
hazelcast.imap.local_locked_entry_count
(gauge)
Locked entry count
hazelcast.imap.local_max_get_latency
(gauge)
Max get latency
hazelcast.imap.local_max_put_latency
(gauge)
Max put latency
hazelcast.imap.local_max_remove_latency
(gauge)
Max remove latency
hazelcast.imap.local_other_operation_count
(gauge)
Other (keySet, entrySet, etc.) operation count
hazelcast.imap.local_owned_entry_count
(gauge)
Owned entry count
hazelcast.imap.local_owned_entry_memory_cost
(gauge)
Owned entry memory cost
hazelcast.imap.local_put_operation_count
(gauge)
Put operation count
hazelcast.imap.local_remove_operation_count
(gauge)
Remove operation count
hazelcast.imap.local_total
(gauge)
Total operation count
hazelcast.imap.local_total_get_latency
(gauge)
Total get latency
hazelcast.imap.local_total_put_latency
(gauge)
Total put latency
hazelcast.imap.local_total_remove_latency
(gauge)
Total remove latency
hazelcast.imap.size
(gauge)
Size
hazelcast.multimap.local_backup_count
(gauge)
Backup count
hazelcast.multimap.local_backup_entry_count
(gauge)
Backup entry count
hazelcast.multimap.local_backup_entry_memory_cost
(gauge)
Backup entry memory cost
hazelcast.multimap.local_creation_time
(gauge)
Creation time
hazelcast.multimap.local_event_operation_count
(gauge)
Event count
hazelcast.multimap.local_get_operation_count
(gauge)
Get operation count
hazelcast.multimap.local_hits
(gauge)
Hits
hazelcast.multimap.local_last_access_time
(gauge)
Last access time
hazelcast.multimap.local_last_update_time
(gauge)
Last update time
hazelcast.multimap.local_locked_entry_count
(gauge)
Locked entry count
hazelcast.multimap.local_max_get_latency
(gauge)
Max get latency
hazelcast.multimap.local_max_put_latency
(gauge)
Max put latency
hazelcast.multimap.local_max_remove_latency
(gauge)
Max remove latency
hazelcast.multimap.local_other_operation_count
(gauge)
Other (keySet, entrySet, etc.) operation count
hazelcast.multimap.local_owned_entry_count
(gauge)
Owned entry count
hazelcast.multimap.local_owned_entry_memory_cost
(gauge)
Owned entry memory cost
hazelcast.multimap.local_put_operation_count
(gauge)
Put operation count
hazelcast.multimap.local_remove_operation_count
(gauge)
Remove operation count
hazelcast.multimap.local_total
(gauge)
Total operation count
hazelcast.multimap.local_total_get_latency
(gauge)
Total get latency
hazelcast.multimap.local_total_put_latency
(gauge)
Total put latency
hazelcast.multimap.local_total_remove_latency
(gauge)
Total remove latency
hazelcast.multimap.size
(gauge)
Size
hazelcast.replicatedmap.local_creation_time
(gauge)
Creation time
hazelcast.replicatedmap.local_event_operation_count
(gauge)
Event count
hazelcast.replicatedmap.local_get_operation_count
(gauge)
Get operation count
hazelcast.replicatedmap.local_hits
(gauge)
Hits
hazelcast.replicatedmap.local_last_access_time
(gauge)
Last access time
hazelcast.replicatedmap.local_last_update_time
(gauge)
Last update time
hazelcast.replicatedmap.local_max_get_latency
(gauge)
Max get latency
hazelcast.replicatedmap.local_max_put_latency
(gauge)
Max put latency
hazelcast.replicatedmap.local_max_remove_latency
(gauge)
Max remove latency
hazelcast.replicatedmap.local_other_operation_count
(gauge)
Other (keySet, entrySet, etc.) operation count
hazelcast.replicatedmap.local_owned_entry_count
(gauge)
Owned entry count
hazelcast.replicatedmap.local_put_operation_count
(gauge)
Put operation count
hazelcast.replicatedmap.local_remove_operation_count
(gauge)
Remove operation count
hazelcast.replicatedmap.local_total
(gauge)
Total operation count
hazelcast.replicatedmap.local_total_get_latency
(gauge)
Total get latency
hazelcast.replicatedmap.local_total_put_latency
(gauge)
Total put latency
hazelcast.replicatedmap.local_total_remove_latency
(gauge)
Total remove latency
hazelcast.replicatedmap.size
(gauge)
Size

Service Checks

hazelcast.can_connect:
Returns CRITICAL if the Agent is unable to connect to and collect metrics from the monitored Hazelcast instance, otherwise returns OK.

hazelcast.mc_cluster_state:
Represents the state of the Hazelcast Management Center as indicated by its health check.

Events

Hazelcast does not include any events.

Troubleshooting

Need help? Contact Datadog support.