Redis Enterprise

Supported OS: Linux, Windows, macOS

Integration version: 1.0.0

Overview

Redis is a versatile and fast data store that supports strings, hashes, lists, sets, streams, and more. It also offers programmability, extensibility, persistence, clustering, and high availability features, as well as Redis Stack for modern data models and processing engines.

The Redis Enterprise integration is intended for on-premises or private cloud installations running the enterprise version of Redis software. It is not for use with Redis Cloud installations; for those, see the separate Redis Cloud integration.

This integration provides metrics for three critical aspects of a cluster: databases, nodes, and shards. These metrics make it easier to detect problems before they become critical. For a full list, see the ‘Metrics’ section.

Setup

Installation

  1. Run the following command to install the Agent integration:

    datadog-agent integration install -t datadog-redis_enterprise==1.0.0
    
  2. Configure the integration by setting the openmetrics_endpoint to your cluster’s master node. See Integration for further information.

  3. Restart the Agent.

Configuration

Set the openmetrics_endpoint to point to your cluster. See the example.
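As a rough illustration of this step, the sketch below writes a minimal instance configuration to a scratch file. The init_config/instances layout is the usual shape of an Agent check configuration; the host name, port 8070, and the scratch file location are assumptions for illustration only and should be replaced with the values for your cluster and Agent installation.

```shell
# Hypothetical endpoint: replace the host with your cluster's master node.
# Port 8070 is an assumption; use the port your cluster exposes metrics on.
ENDPOINT="https://redis-cluster.example.com:8070/metrics"

# Scratch location for the sketch; copy the result into the Agent's
# configuration directory for this integration once it looks right.
CONF_FILE="${TMPDIR:-/tmp}/redis_enterprise_conf.yaml"

# Write a minimal configuration with a single instance.
cat > "$CONF_FILE" <<EOF
init_config:

instances:
  - openmetrics_endpoint: ${ENDPOINT}
EOF

# Show the generated file for review.
cat "$CONF_FILE"
```

After copying the file into place, restart the Agent so the new endpoint is picked up.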

Validation

  1. Confirm that the machine is reachable over the network (for example, with ping), particularly in a cloud environment. Then run wget --no-check-certificate <endpoint> or curl -k <endpoint> to confirm that the endpoint returns metrics.

  2. Check the status of the Datadog Agent.
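The first validation step can be made a little more informative by summarizing what the endpoint returns. The sketch below counts sample lines per metric family. It assumes the raw metric names start with bdb_ (database), node_, and redis_ (shard), which is inferred from the metric names listed on this page; the summarize_metrics helper and the sample payload are hypothetical.

```shell
# summarize_metrics: read OpenMetrics/Prometheus text on stdin and count
# sample lines per family prefix. Comment and blank lines are skipped.
summarize_metrics() {
  awk '
    /^#/ || NF == 0 { next }
    /^bdb_/   { bdb++ }
    /^node_/  { node++ }
    /^redis_/ { redis++ }
    END { printf "bdb=%d node=%d redis=%d\n", bdb, node, redis }
  '
}

# Hypothetical sample payload, standing in for a real endpoint response.
sample='# HELP bdb_conns Number of client connections to DB
bdb_conns{bdb="1"} 4
node_up{node="1"} 1
redis_up{redis="1"} 1
redis_used_memory{redis="1"} 1048576'

printf '%s\n' "$sample" | summarize_metrics   # prints: bdb=1 node=1 redis=2

# Against a live cluster (requires network access to the endpoint):
# curl -k -s "<endpoint>" | summarize_metrics
```

A count of zero for one family may suggest that the endpoint is reachable but not reporting that part of the cluster.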

Data Collected

The current release gathers all metrics for databases, nodes, and shards.

Metrics

rdse.bdb_avg_latency
(rate)
Average latency of operations on the DB (seconds); returned only when there is traffic
Shown as second
rdse.bdb_avg_latency_max
(gauge)
Highest value of average latency of operations on the DB (seconds); returned only when there is traffic
Shown as second
rdse.bdb_avg_read_latency
(rate)
Average latency of read operations (seconds); returned only when there is traffic
Shown as second
rdse.bdb_avg_read_latency_max
(gauge)
Highest value of average latency of read operations (seconds); returned only when there is traffic
Shown as second
rdse.bdb_avg_write_latency
(rate)
Average latency of write operations (seconds); returned only when there is traffic
Shown as second
rdse.bdb_avg_write_latency_max
(gauge)
Highest value of average latency of write operations (seconds); returned only when there is traffic
Shown as second
rdse.bdb_conns
(count)
Number of client connections to DB
Shown as connection
rdse.bdb_egress_bytes
(count)
Rate of outgoing network traffic from the DB (bytes/sec)
Shown as second
rdse.bdb_egress_bytes_max
(gauge)
Highest value of rate of outgoing network traffic from the DB (bytes/sec)
Shown as second
rdse.bdb_evicted_objects
(rate)
Rate of key evictions from DB (evictions/sec)
Shown as eviction
rdse.bdb_evicted_objects_max
(rate)
Highest value of rate of key evictions from DB (evictions/sec)
Shown as eviction
rdse.bdb_expired_objects
(rate)
Rate of keys expired in DB (expirations/sec)
Shown as eviction
rdse.bdb_expired_objects_max
(rate)
Highest value of rate of keys expired in DB (expirations/sec)
Shown as eviction
rdse.bdb_fork_cpu_system
(rate)
% cores utilization in system mode for all redis shard fork child processes of this database
Shown as cpu
rdse.bdb_fork_cpu_system_max
(rate)
Highest value of % cores utilization in system mode for all redis shard fork child processes of this database
Shown as cpu
rdse.bdb_fork_cpu_user
(rate)
% cores utilization in user mode for all redis shard fork child processes of this database
Shown as cpu
rdse.bdb_fork_cpu_user_max
(rate)
Highest value of % cores utilization in user mode for all redis shard fork child processes of this database
Shown as cpu
rdse.bdb_ingress_bytes
(count)
Rate of incoming network traffic to DB (bytes/sec)
Shown as second
rdse.bdb_ingress_bytes_max
(count)
Highest value of rate of incoming network traffic to DB (bytes/sec)
Shown as second
rdse.bdb_instantaneous_ops_per_sec
(count)
Request rate handled by all shards of DB (ops/sec)
Shown as operation
rdse.bdb_main_thread_cpu_system
(rate)
% cores utilization in system mode for all redis shard main threads of this database
Shown as cpu
rdse.bdb_main_thread_cpu_system_max
(rate)
Highest value of % cores utilization in system mode for all redis shard main threads of this database
Shown as cpu
rdse.bdb_main_thread_cpu_user
(rate)
% cores utilization in user mode for all redis shard main threads of this database
Shown as cpu
rdse.bdb_main_thread_cpu_user_max
(rate)
Highest value of % cores utilization in user mode for all redis shard main threads of this database
Shown as cpu
rdse.bdb_mem_frag_ratio
(rate)
RAM fragmentation ratio (RSS / allocated RAM)
Shown as percent
rdse.bdb_mem_size_lua
(gauge)
Redis lua scripting heap size (bytes)
Shown as byte
rdse.bdb_memory_limit
(gauge)
Configured RAM limit for the database (bytes)
Shown as byte
rdse.bdb_monitor_sessions_count
(count)
Number of clients connected in monitor mode to the DB
Shown as session
rdse.bdb_no_of_keys
(count)
Number of keys in DB
Shown as key
rdse.bdb_other_req
(rate)
Rate of other (non read/write) requests on DB (ops/sec)
Shown as request
rdse.bdb_other_req_max
(rate)
Highest value of rate of other (non read/write) requests on DB (ops/sec)
Shown as request
rdse.bdb_other_res
(rate)
Rate of other (non read/write) responses on DB (ops/sec)
Shown as response
rdse.bdb_other_res_max
(rate)
Highest value of rate of other (non read/write) responses on DB (ops/sec)
Shown as response
rdse.bdb_pubsub_channels
(count)
Count the pub/sub channels with subscribed clients
Shown as key
rdse.bdb_pubsub_channels_max
(count)
Highest value of count the pub/sub channels with subscribed clients
Shown as key
rdse.bdb_pubsub_patterns
(count)
Count the pub/sub patterns with subscribed clients
Shown as key
rdse.bdb_pubsub_patterns_max
(count)
Highest value of count the pub/sub patterns with subscribed clients
Shown as key
rdse.bdb_read_hits
(count)
Rate of read operations accessing an existing key (ops/sec)
Shown as hit
rdse.bdb_read_hits_max
(count)
Highest value of rate of read operations accessing an existing key (ops/sec)
Shown as hit
rdse.bdb_read_misses
(count)
Rate of read operations accessing a non-existing key (ops/sec)
Shown as miss
rdse.bdb_read_misses_max
(count)
Highest value of rate of read operations accessing a non-existing key (ops/sec)
Shown as miss
rdse.bdb_read_req
(rate)
Rate of read requests on DB (ops/sec)
Shown as request
rdse.bdb_read_req_max
(rate)
Highest value of rate of read requests on DB (ops/sec)
Shown as request
rdse.bdb_read_res
(rate)
Rate of read responses on DB (ops/sec)
Shown as response
rdse.bdb_read_res_max
(rate)
Highest value of rate of read responses on DB (ops/sec)
Shown as response
rdse.bdb_shard_cpu_system
(rate)
% cores utilization in system mode for all redis shard processes of this database
Shown as cpu
rdse.bdb_shard_cpu_system_max
(rate)
Highest value of % cores utilization in system mode for all redis shard processes of this database
Shown as cpu
rdse.bdb_shard_cpu_user
(rate)
% cores utilization in user mode for the redis shard process
Shown as cpu
rdse.bdb_shard_cpu_user_max
(rate)
Highest value of % cores utilization in user mode for the redis shard process
Shown as cpu
rdse.bdb_total_connections_received
(count)
Rate of new client connections to DB (connections/sec)
Shown as connection
rdse.bdb_total_connections_received_max
(count)
Highest value of rate of new client connections to DB (connections/sec)
Shown as connection
rdse.bdb_total_req
(count)
Rate of all requests on DB (ops/sec)
Shown as request
rdse.bdb_total_req_max
(count)
Highest value of rate of all requests on DB (ops/sec)
Shown as request
rdse.bdb_total_res
(count)
Rate of all responses on DB (ops/sec)
Shown as response
rdse.bdb_total_res_max
(count)
Highest value of rate of all responses on DB (ops/sec)
Shown as response
rdse.bdb_up
(gauge)
Database is up and running
Shown as service
rdse.bdb_used_memory
(count)
Memory used by db (in bigredis this includes flash) (bytes)
Shown as byte
rdse.bdb_write_hits
(count)
Rate of write operations accessing an existing key (ops/sec)
Shown as hit
rdse.bdb_write_hits_max
(count)
Highest value of rate of write operations accessing an existing key (ops/sec)
Shown as hit
rdse.bdb_write_misses
(count)
Rate of write operations accessing a non-existing key (ops/sec)
Shown as miss
rdse.bdb_write_misses_max
(count)
Highest value of rate of write operations accessing a non-existing key (ops/sec)
Shown as miss
rdse.bdb_write_req
(rate)
Rate of write requests on DB (ops/sec)
Shown as request
rdse.bdb_write_req_max
(rate)
Highest value of rate of write requests on DB (ops/sec)
Shown as request
rdse.bdb_write_res
(rate)
Rate of write responses on DB (ops/sec)
Shown as response
rdse.bdb_write_res_max
(rate)
Highest value of rate of write responses on DB (ops/sec)
Shown as response
rdse.no_of_expires
(count)
Current number of volatile keys in the database
Shown as key
rdse.node_available_flash
(count)
Available flash in node (bytes)
Shown as byte
rdse.node_available_flash_no_overbooking
(count)
Available flash in node (bytes), without taking into account overbooking
Shown as byte
rdse.node_available_memory
(count)
Amount of free memory in node (bytes) that is available for database provisioning
Shown as byte
rdse.node_available_memory_no_overbooking
(count)
Available ram in node (bytes) without taking into account overbooking
Shown as byte
rdse.node_avg_latency
(rate)
Average latency of requests handled by endpoints on node (seconds); returned only when there is traffic
Shown as second
rdse.node_bigstore_free
(count)
Sum of free space of back-end flash (used by flash DB’s [BigRedis]) on all cluster nodes (bytes); returned only when BigRedis is enabled
Shown as byte
rdse.node_bigstore_iops
(rate)
Rate of i/o operations against back-end flash for all shards which are part of a flash based DB (BigRedis) in cluster (ops/sec)
Shown as operation
rdse.node_bigstore_kv_ops
(count)
Rate of value read/write operations against back-end flash for all shards which are part of a flash based DB (BigRedis) in cluster (ops/sec)
Shown as operation
rdse.node_bigstore_throughput
(rate)
Throughput i/o operations against back-end flash for all shards which are part of a flash based DB (BigRedis) in cluster (bytes/sec)
Shown as operation
rdse.node_conns
(count)
Number of clients connected to endpoints on node
Shown as connection
rdse.node_cpu_idle
(rate)
CPU idle time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_idle_max
(gauge)
Highest value of CPU idle time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_idle_median
(gauge)
Average value of CPU idle time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_idle_min
(gauge)
Lowest value of CPU idle time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_iowait
(rate)
CPU IO wait time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_iowait_max
(gauge)
Highest value of CPU IO wait time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_iowait_median
(gauge)
Average value of CPU IO wait time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_iowait_min
(gauge)
Lowest value of CPU IO wait time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_irqs
(rate)
CPU IRQ time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_irqs_max
(gauge)
Highest value of CPU IRQ time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_irqs_median
(gauge)
Average value of CPU IRQ time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_irqs_min
(gauge)
Lowest value of CPU IRQ time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_nice
(rate)
CPU nice time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_nice_max
(gauge)
Highest value of CPU nice time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_nice_median
(gauge)
Average value of CPU nice time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_nice_min
(gauge)
Lowest value of CPU nice time portion (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_steal
(rate)
CPU time portion spent in kernel (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_steal_max
(gauge)
Highest value of CPU time portion spent in kernel (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_steal_median
(gauge)
Average value of CPU time portion spent in kernel (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_steal_min
(gauge)
Lowest value of CPU time portion spent in kernel (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_system
(rate)
CPU time portion spent by kernel-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_system_max
(gauge)
Highest value of CPU time portion spent by kernel-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_system_median
(gauge)
Average value of CPU time portion spent by kernel-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_system_min
(gauge)
Lowest value of CPU time portion spent by kernel-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_user
(rate)
CPU time portion spent by user-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_user_max
(gauge)
Highest value of CPU time portion spent by user-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_user_median
(gauge)
Average value of CPU time portion spent by user-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cpu_user_min
(gauge)
Lowest value of CPU time portion spent by user-space processes (0-1, multiply by 100 to get percent)
Shown as cpu
rdse.node_cur_aof_rewrites
(count)
Number of aof rewrites that are currently performed by shards on this node
Shown as write
rdse.node_egress_bytes
(rate)
Rate of outgoing network traffic to node (bytes/sec)
Shown as byte
rdse.node_egress_bytes_max
(gauge)
Highest value of rate of outgoing network traffic to node (bytes/sec)
Shown as byte
rdse.node_egress_bytes_median
(gauge)
Average value of rate of outgoing network traffic to node (bytes/sec)
Shown as byte
rdse.node_egress_bytes_min
(gauge)
Lowest value of rate of outgoing network traffic to node (bytes/sec)
Shown as byte
rdse.node_ephemeral_storage_avail
(count)
Disk space available to RLEC processes on configured ephemeral disk (bytes)
Shown as byte
rdse.node_ephemeral_storage_free
(count)
Free disk space on configured ephemeral disk (bytes)
Shown as byte
rdse.node_free_memory
(count)
Free memory in node (bytes)
Shown as byte
rdse.node_ingress_bytes
(rate)
Rate of incoming network traffic to node (bytes/sec)
Shown as byte
rdse.node_ingress_bytes_max
(gauge)
Highest value of rate of incoming network traffic to node (bytes/sec)
Shown as byte
rdse.node_ingress_bytes_median
(gauge)
Average value of rate of incoming network traffic to node (bytes/sec)
Shown as byte
rdse.node_ingress_bytes_min
(gauge)
Lowest value of rate of incoming network traffic to node (bytes/sec)
Shown as byte
rdse.node_persistent_storage_avail
(count)
Disk space available to RLEC processes on configured persistent disk (bytes)
Shown as byte
rdse.node_persistent_storage_free
(count)
Free disk space on configured persistent disk (bytes)
Shown as byte
rdse.node_provisional_flash
(count)
Amount of flash available for new shards on this node, taking into account overbooking, max redis servers, reserved flash and provision and migration thresholds
Shown as byte
rdse.node_provisional_flash_no_overbooking
(count)
Amount of flash available for new shards on this node, without taking into account overbooking, max redis servers, reserved flash and provision and migration thresholds
Shown as byte
rdse.node_provisional_memory
(count)
Amount of RAM that is available for provisioning to databases out of the total RAM allocated for databases
Shown as byte
rdse.node_provisional_memory_no_overbooking
(count)
Amount of RAM that is available for provisioning to databases out of the total RAM allocated for databases, without taking into account overbooking
Shown as byte
rdse.node_total_req
(rate)
Request rate handled by endpoints on node (ops/sec)
Shown as request
rdse.node_up
(rate)
Node is part of the cluster and is connected
Shown as service
rdse.redis_active_defrag_running
(rate)
Automatic memory defragmentation current aggressiveness (% cpu)
Shown as cpu
rdse.redis_allocator_active
(count)
Total used memory including external fragmentation
Shown as byte
rdse.redis_allocator_allocated
(count)
Total allocated memory
Shown as byte
rdse.redis_allocator_resident
(count)
Total resident memory (RSS)
Shown as byte
rdse.redis_aof_last_cow_size
(count)
Last AOFR, CopyOnWrite memory
Shown as byte
rdse.redis_aof_rewrite_in_progress
(count)
The number of simultaneous AOF rewrites that are in progress
Shown as check
rdse.redis_aof_rewrites
(count)
Number of AOF rewrites this process executed
Shown as write
rdse.redis_aof_delayed_fsync
(count)
Number of times an AOF fsync caused delays in the redis main thread (inducing latency); this can indicate that the disk is slow or overloaded
Shown as operation
rdse.redis_blocked_clients
(count)
Count the clients waiting on a blocking call
Shown as connection
rdse.redis_connected_clients
(count)
Number of client connections to the specific shard
Shown as connection
rdse.redis_connected_slaves
(count)
Number of connected slaves
Shown as connection
rdse.redis_db0_avg_ttl
(rate)
Average TTL of all volatile keys
Shown as key
rdse.redis_db0_expires
(count)
Total count of volatile keys
Shown as key
rdse.redis_db0_keys
(count)
Total key count
Shown as key
rdse.redis_evicted_keys
(count)
Keys evicted so far (since restart)
Shown as key
rdse.redis_expire_cycle_cpu_milliseconds
(count)
The cumulative amount of time spent on active expiry cycles
Shown as second
rdse.redis_expired_keys
(count)
Keys expired so far (since restart)
Shown as key
rdse.redis_forwarding_state
(count)
Shard forwarding state (on or off)
Shown as check
rdse.redis_keys_trimmed
(count)
The number of keys that were trimmed in the current or last resharding process
Shown as key
rdse.redis_keyspace_read_hits
(count)
Number of read operations accessing an existing keyspace
Shown as hit
rdse.redis_keyspace_read_misses
(count)
Number of read operations accessing a non-existing keyspace
Shown as miss
rdse.redis_keyspace_write_hits
(count)
Number of write operations accessing an existing keyspace
Shown as hit
rdse.redis_keyspace_write_misses
(count)
Number of write operations accessing a non-existing keyspace
Shown as miss
rdse.redis_master_link_status
(rate)
Indicates if the replica is connected to its master
Shown as check
rdse.redis_master_repl_offset
(count)
Number of bytes sent to replicas by the shard; calculate the throughput for a time period by comparing the value at different times
Shown as byte
rdse.redis_master_sync_in_progress
(rate)
The master shard is synchronizing (1 = true)
Shown as check
rdse.redis_max_process_mem
(count)
Current memory limit configured by redis_mgr according to node free memory
Shown as byte
rdse.redis_maxmemory
(count)
Current memory limit configured by redis_mgr according to db memory limits
Shown as byte
rdse.redis_mem_aof_buffer
(count)
Current size of AOF buffer
Shown as byte
rdse.redis_mem_clients_normal
(count)
Current memory used for input and output buffers of non-replica clients
Shown as byte
rdse.redis_mem_clients_slaves
(count)
Current memory used for input and output buffers of replica clients
Shown as byte
rdse.redis_mem_fragmentation_ratio
(rate)
Memory fragmentation ratio (1.3 means 30% overhead)
Shown as percent
rdse.redis_mem_not_counted_for_evict
(rate)
Portion of used_memory (in bytes) that’s not counted for eviction and OOM error
Shown as byte
rdse.redis_mem_replication_backlog
(rate)
Size of replication backlog
Shown as byte
rdse.redis_module_fork_in_progress
(rate)
A binary value that indicates if there is an active fork spawned by a module (1) or not (0)
Shown as check
rdse.redis_process_cpu_system_seconds_total
(count)
Shard Process system CPU time spent in seconds
Shown as cpu
rdse.redis_process_cpu_usage_percent
(rate)
Shard Process CPU usage percentage
Shown as cpu
rdse.redis_process_cpu_user_seconds_total
(count)
Shard user CPU time spent in seconds
Shown as cpu
rdse.redis_process_main_thread_cpu_system_seconds_total
(count)
Shard main thread system CPU time spent in seconds
Shown as cpu
rdse.redis_process_main_thread_cpu_user_seconds_total
(count)
Shard main thread user CPU time spent in seconds
Shown as second
rdse.redis_process_max_fds
(gauge)
Shard Maximum number of open file descriptors
Shown as file
rdse.redis_process_open_fds
(gauge)
Shard Number of open file descriptors
Shown as file
rdse.redis_process_resident_memory_bytes
(count)
Shard Resident memory size in bytes
Shown as byte
rdse.redis_process_start_time_seconds
(count)
Shard Start time of the process since unix epoch in seconds
Shown as second
rdse.redis_process_virtual_memory_bytes
(count)
Shard virtual memory in bytes
Shown as byte
rdse.redis_rdb_bgsave_in_progress
(rate)
Indication if bgsave is currently in progress
Shown as check
rdse.redis_rdb_last_cow_size
(rate)
Last bgsave (or SYNC fork) used CopyOnWrite memory
Shown as byte
rdse.redis_rdb_saves
(count)
Total count of bgsaves since process was restarted (including replica fullsync and persistence)
Shown as operation
rdse.redis_repl_touch_bytes
(count)
Number of bytes sent to replicas as TOUCH commands by the shard as a result of a READ command that was processed
Shown as byte
rdse.redis_total_commands_processed
(count)
Number of commands processed by the shard; calculate the number of commands for a time period by comparing the value at different times
Shown as command
rdse.redis_total_connections_received
(count)
Number of connections received by the shard; calculate the number of connections for a time period by comparing the value at different times
Shown as connection
rdse.redis_total_net_input_bytes
(count)
Number of bytes received by the shard; calculate the throughput for a time period by comparing the value at different times
Shown as request
rdse.redis_total_net_output_bytes
(count)
Number of bytes sent by the shard; calculate the throughput for a time period by comparing the value at different times
Shown as response
rdse.redis_up
(rate)
Shard is up and running
Shown as service
rdse.redis_used_memory
(count)
Memory used by shard (in bigredis this includes flash) (bytes)
Shown as byte
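
Several descriptions above imply a small calculation: CPU portions are reported on a 0-1 scale, cumulative shard counters yield a rate only when two readings are compared, and a fragmentation ratio of 1.3 means 30% overhead. The sketch below works through that arithmetic. The helper names and sample values are hypothetical, and the rate helper assumes the counter did not reset (for example, through a shard restart) between the two readings and that the interval is nonzero.

```shell
# fraction_to_percent FRACTION
# Convert a 0-1 portion (such as rdse.node_cpu_idle) to a percentage.
fraction_to_percent() {
  awk -v f="$1" 'BEGIN { printf "%.1f\n", f * 100 }'
}

# counter_rate FIRST SECOND SECONDS
# Average per-second rate between two readings of a cumulative counter
# (such as rdse.redis_total_net_input_bytes) taken SECONDS apart.
counter_rate() {
  awk -v a="$1" -v b="$2" -v dt="$3" 'BEGIN { printf "%.1f\n", (b - a) / dt }'
}

# frag_overhead_percent RATIO
# Overhead implied by a fragmentation ratio (such as
# rdse.redis_mem_fragmentation_ratio), as a percentage.
frag_overhead_percent() {
  awk -v r="$1" 'BEGIN { printf "%.1f\n", (r - 1) * 100 }'
}

fraction_to_percent 0.25            # prints 25.0
counter_rate 1000000 4000000 60     # prints 50000.0 (bytes/sec)
frag_overhead_percent 1.3           # prints 30.0
```

In practice these conversions would more likely be done in a dashboard or monitor query than in a script; the functions only make the arithmetic explicit.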

Service Checks

Redis Enterprise does not include any service checks.

Events

Redis Enterprise does not include any events.

Troubleshooting

Need help? Contact Redis Field Engineering.