Datadog-Redis Integration

Redis default dashboard

Overview

Whether you use Redis as a database, cache, or message queue, this integration helps you track problems with your Redis servers and the parts of your infrastructure that they serve. The Datadog Agent’s Redis check collects metrics on performance, memory usage, blocked clients, slave connections, disk persistence, expired and evicted keys, and more.

Setup

Installation

The Redis check is packaged with the Agent, so simply install the Agent on your Redis servers. If you need the newest version of the check, install the dd-check-redis package.

Configuration

Create a redisdb.yaml in the Datadog Agent’s conf.d directory. See the sample redisdb.yaml for all available configuration options:

init_config:

instances:
  - host: localhost
    port: 6379 # or wherever your redis listens
#   unix_socket_path: /var/run/redis/redis.sock # if your redis uses a socket instead of TCP
#   password: myredispassword                   # if your redis requires auth

Configuration Options:

  • unix_socket_path - (Optional) - Can be used instead of host and port.
  • db, password, and socket_timeout - (Optional) - Additional connection options.
  • warn_on_missing_keys - (Optional) - Display a warning in the info page if the keys we’re tracking are missing.
  • slowlog-max-len - (Optional) - Maximum number of entries to fetch from the slow query log. By default, the check reads this value from the Redis config. If that value is above 128, the check uses 128 instead, because retrieving more than 128 slowlog entries every 15 seconds can increase latency. To fetch more entries from the slow query log, set the value here. Warning: this may impact the performance of your Redis instance.
  • command_stats - (Optional) - Collect INFO COMMANDSTATS output as metrics.
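
The slowlog-max-len behavior described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this list, not the check's actual code; the function name effective_slowlog_len and the constant DEFAULT_CAP are hypothetical.

```python
from typing import Optional

DEFAULT_CAP = 128  # cap applied to values read from the Redis config (from the text above)


def effective_slowlog_len(configured: Optional[int], server_max_len: int) -> int:
    """Return how many slow log entries to fetch per check run.

    configured: slowlog-max-len from redisdb.yaml, or None when it is not set.
    server_max_len: slowlog-max-len as read from the Redis config.
    """
    if configured is not None:
        # An explicit setting in redisdb.yaml is used as-is, even above the cap.
        return configured
    # Otherwise use the server value, limited to the cap.
    return min(server_max_len, DEFAULT_CAP)


print(effective_slowlog_len(None, 1024))  # server value above the cap
print(effective_slowlog_len(500, 1024))   # explicit setting in redisdb.yaml
```

Under these assumptions, the first call is capped at 128 and the second returns the configured 500.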

Restart the Agent to begin sending Redis metrics to Datadog.

Validation

Run the Agent’s info subcommand and look for redisdb under the Checks section:

  Checks
  ======
    [...]

    redisdb
    -------
      - instance #0 [OK]
      - Collected 26 metrics, 0 events & 1 service check

    [...]
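
If you want to automate this validation, the output above can be scanned with a short script. The following Python sketch assumes the info output has exactly the layout shown in the sample (a check name, a dashed underline, then "- ..." items); other Agent versions may print something different, and the function name redisdb_status is hypothetical.

```python
import re


def redisdb_status(info_text):
    """Return (instance_statuses, metrics_collected) for the redisdb check.

    Returns None when no redisdb section is found in the text.
    """
    lines = info_text.splitlines()
    for index, line in enumerate(lines):
        if line.strip() == "redisdb":
            break
    else:
        return None

    statuses = []
    collected = 0
    # Skip the check name and its dashed underline, then read the "- ..." items.
    for line in lines[index + 2:]:
        item = line.strip()
        if not item.startswith("- "):
            break
        status = re.search(r"instance #\d+ \[(\w+)\]", item)
        if status:
            statuses.append(status.group(1))
        count = re.search(r"Collected (\d+) metrics", item)
        if count:
            collected = int(count.group(1))
    return statuses, collected


sample = """\
  Checks
  ======

    redisdb
    -------
      - instance #0 [OK]
      - Collected 26 metrics, 0 events & 1 service check
"""
print(redisdb_status(sample))  # prints (['OK'], 26)
```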

Compatibility

The Redis check is compatible with all major platforms.

Data Collected

Metrics

  • redis.aof.buffer_length - (gauge) - Size of the AOF buffer. Shown as byte.
  • redis.aof.last_rewrite_time - (gauge) - Duration of the last AOF rewrite. Shown as second.
  • redis.aof.rewrite - (gauge) - Flag indicating an AOF rewrite operation is ongoing.
  • redis.aof.size - (gauge) - AOF current file size (aof_current_size). Shown as byte.
  • redis.clients.biggest_input_buf - (gauge) - The biggest input buffer among current client connections.
  • redis.clients.blocked - (gauge) - The number of connections waiting on a blocking call. Shown as connection.
  • redis.clients.longest_output_list - (gauge) - The longest output list among current client connections.
  • redis.cpu.sys - (gauge) - System CPU consumed by the Redis server. Shown as second.
  • redis.cpu.sys_children - (gauge) - System CPU consumed by the background processes. Shown as second.
  • redis.cpu.user - (gauge) - User CPU consumed by the Redis server. Shown as second.
  • redis.cpu.user_children - (gauge) - User CPU consumed by the background processes. Shown as second.
  • redis.expires - (gauge) - The number of keys that have expired. Shown as key.
  • redis.expires.percent - (gauge) - Percentage of total keys that have been expired. Shown as percent.
  • redis.info.latency_ms - (gauge) - The latency of the Redis INFO command. Shown as millisecond.
  • redis.key.length - (gauge) - The number of elements in a given key, tagged by key, e.g. 'key:mykeyname'. Enable in the Agent's redisdb.yaml with the keys option.
  • redis.keys - (gauge) - The total number of keys. Shown as key.
  • redis.keys.evicted - (gauge) - The total number of keys evicted due to the maxmemory limit. Shown as key.
  • redis.keys.expired - (gauge) - The total number of keys expired from the db. Shown as key.
  • redis.mem.fragmentation_ratio - (gauge) - Ratio between used_memory_rss and used_memory. Shown as fraction.
  • redis.mem.lua - (gauge) - Amount of memory used by the Lua engine. Shown as byte.
  • redis.mem.maxmemory - (gauge) - The maxmemory limit configured for the Redis instance.
  • redis.mem.peak - (gauge) - The peak amount of memory used by Redis. Shown as byte.
  • redis.mem.rss - (gauge) - Amount of memory that Redis allocated as seen by the OS. Shown as byte.
  • redis.mem.used - (gauge) - Amount of memory allocated by Redis. Shown as byte.
  • redis.net.clients - (gauge) - The number of connected clients (excluding slaves). Shown as connection.
  • redis.net.commands - (gauge) - The number of commands processed by the server. Shown as command.
  • redis.net.commands.instantaneous_ops_per_sec - (gauge) - The number of commands processed per second. Shown as second.
  • redis.net.rejected - (gauge) - The number of rejected connections. Shown as connection.
  • redis.net.slaves - (gauge) - The number of connected slaves. Shown as connection.
  • redis.perf.latest_fork_usec - (gauge) - The duration of the latest fork. Shown as microsecond.
  • redis.persist - (gauge) - The number of keys persisted (redis.keys - redis.expires). Shown as key.
  • redis.persist.percent - (gauge) - Percentage of total keys that are persisted. Shown as percent.
  • redis.pubsub.channels - (gauge) - The number of active pubsub channels.
  • redis.pubsub.patterns - (gauge) - The number of active pubsub patterns.
  • redis.rdb.bgsave - (gauge) - One if a bgsave is in progress and zero otherwise.
  • redis.rdb.changes_since_last - (gauge) - The number of changes since the last background save.
  • redis.rdb.last_bgsave_time - (gauge) - Duration of the last bg_save operation. Shown as second.
  • redis.replication.backlog_histlen - (gauge) - The amount of data in the backlog sync buffer. Shown as byte.
  • redis.replication.delay - (gauge) - The replication delay in offsets. Shown as offset.
  • redis.replication.last_io_seconds_ago - (gauge) - Amount of time since the last interaction with master. Shown as second.
  • redis.replication.master_link_down_since_seconds - (gauge) - Amount of time that the master link has been down. Shown as second.
  • redis.replication.master_repl_offset - (gauge) - The replication offset reported by the master. Shown as offset.
  • redis.replication.slave_repl_offset - (gauge) - The replication offset reported by the slave. Shown as offset.
  • redis.replication.sync - (gauge) - One if a sync is in progress and zero otherwise.
  • redis.replication.sync_left_bytes - (gauge) - Amount of data left before syncing is complete. Shown as byte.
  • redis.slowlog.micros.95percentile - (gauge) - The 95th percentile of the duration of queries reported in the slow log. Shown as microsecond.
  • redis.slowlog.micros.avg - (gauge) - The average duration of queries reported in the slow log. Shown as microsecond.
  • redis.slowlog.micros.count - (rate) - The rate of queries reported in the slow log. Shown as query.
  • redis.slowlog.micros.max - (gauge) - The maximum duration of queries reported in the slow log. Shown as microsecond.
  • redis.slowlog.micros.median - (gauge) - The median duration of queries reported in the slow log. Shown as microsecond.
  • redis.stats.keyspace_hits - (gauge) - The total number of successful lookups in the db. Shown as key.
  • redis.stats.keyspace_misses - (gauge) - The total number of missed lookups in the db. Shown as key.
  • redis.command.calls - (gauge) - The number of times a Redis command has been called, tagged by 'command', e.g. 'command:append'. Enable in the Agent's redisdb.yaml with the command_stats option.
  • redis.command.usec_per_call - (gauge) - The CPU time consumed per Redis command call, tagged by 'command', e.g. 'command:append'. Enable in the Agent's redisdb.yaml with the command_stats option.
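
Several of the metrics above are defined as simple arithmetic over fields of the Redis INFO output: redis.mem.fragmentation_ratio is used_memory_rss divided by used_memory, and redis.persist is redis.keys minus redis.expires. The Python sketch below works through those definitions on a hand-written INFO excerpt. The sample numbers are invented for illustration, the helper names are hypothetical, and this is not how the Agent check itself is implemented.

```python
def parse_info(text):
    """Parse Redis INFO output ("name:value" lines) into a dict of strings."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# Section" headers
        name, _, value = line.partition(":")
        info[name] = value
    return info


def keyspace_counts(info, db="db0"):
    """Split a keyspace entry such as 'keys=200,expires=50,avg_ttl=0' into ints."""
    fields = dict(item.split("=") for item in info[db].split(","))
    return int(fields["keys"]), int(fields["expires"])


def derived_metrics(info, db="db0"):
    """Compute the ratio and percentage metrics from their documented definitions."""
    keys, expires = keyspace_counts(info, db)
    persist = keys - expires
    return {
        "redis.mem.fragmentation_ratio": int(info["used_memory_rss"]) / int(info["used_memory"]),
        "redis.persist": persist,
        "redis.persist.percent": 100.0 * persist / keys if keys else 0.0,
        "redis.expires.percent": 100.0 * expires / keys if keys else 0.0,
    }


# Invented sample values, in the "name:value" format that INFO returns.
SAMPLE_INFO = """\
# Memory
used_memory:1000000
used_memory_rss:1500000
# Keyspace
db0:keys=200,expires=50,avg_ttl=0
"""

metrics = derived_metrics(parse_info(SAMPLE_INFO))
for name, value in metrics.items():
    print(name, value)
```

With the sample values, the fragmentation ratio comes out at 1.5 and 150 of the 200 keys (75 percent) count as persisted.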

Events

The RedisDB check does not include any events at this time.

Service Checks

redis.can_connect:

Returns CRITICAL if the Agent cannot connect to Redis to collect metrics, otherwise OK.

Troubleshooting

How do I filter to look at the stats for a particular DB in a particular environment?

Prebuilt dashboards (the dashboards you see when clicking Overview) only allow you to filter on a single tag. In the Metrics Explorer, you can select which metrics you want to see and what you want to see them over. In the ‘Over:’ section, select multiple environments, then select “Save these tiles to: a new dashboard.”

Agent cannot connect

    redisdb
    -------
      - instance #0 [ERROR]: 'Error 111 connecting to localhost:6379. Connection refused.'
      - Collected 0 metrics, 0 events & 1 service check

Check that the connection info in redisdb.yaml is correct.
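
Error 111 is ECONNREFUSED on Linux, meaning nothing accepted the TCP connection at that address. To confirm this independently of the Agent, you can try a plain TCP connection from the same host. The following Python sketch uses only the standard library; the function name can_connect is hypothetical, and the host and port defaults mirror the sample redisdb.yaml above.

```python
import socket


def can_connect(host="localhost", port=6379, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (including ConnectionRefusedError
        # and timeouts) when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print(can_connect("localhost", 6379))
```

A False result points to a wrong host or port, a Redis server that is not running, or a firewall in between, rather than to a problem in the check itself. This probe only covers TCP; it does not apply when unix_socket_path is used.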

Agent cannot authenticate

    redisdb
    -------
      - instance #0 [ERROR]: 'NOAUTH Authentication required.'
      - Collected 0 metrics, 0 events & 1 service check

Configure a password in redisdb.yaml.
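
For context, NOAUTH is the error reply Redis sends when a password is required and a command arrives on a connection that has not authenticated. The Python sketch below shows, at the protocol level, how an AUTH command is framed in the Redis serialization protocol (RESP) and how that error reply can be recognized. The helper names encode_command and needs_auth are hypothetical, and the password is the placeholder from the sample redisdb.yaml.

```python
def encode_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode() if isinstance(arg, str) else arg
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def needs_auth(reply):
    """Return True if a raw server reply is the NOAUTH error."""
    # RESP error replies start with "-" followed by the error name.
    return reply.startswith(b"-NOAUTH")


print(encode_command("AUTH", "myredispassword"))
print(needs_auth(b"-NOAUTH Authentication required.\r\n"))
```

In practice the Agent handles this exchange for you once the password option is set; the sketch is only meant to show what the error message refers to.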

Further Reading

Read our series of blog posts about how to monitor your Redis servers with Datadog. We detail the key performance metrics, how to collect them, and how to use Datadog to monitor Redis.