To learn how to monitor ElastiCache performance metrics, whether you use Redis or Memcached, see this post. It details the key performance metrics, how to collect them, and how Coursera monitors ElastiCache using Datadog.
If you have not already set up the Amazon Web Services integration, do so first.
In the AWS integration tile, turn on ElastiCache under metric collection.
Add the following permissions to your Datadog IAM policy to collect Amazon ElastiCache metrics. For more information on ElastiCache policies, see the guide on the AWS website.
AWS Permission | Description |
---|---|
elasticache:DescribeCacheClusters | Lists and describes cache clusters, to add tags and additional metrics. |
elasticache:ListTagsForResource | Lists the custom tags of a cluster, to add custom tags. |
elasticache:DescribeEvents | Adds events related to snapshots and maintenance. |
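
As an illustration, the three permissions in the table could be assembled into an IAM policy document as sketched below. This is a minimal sketch, not the policy Datadog ships: the helper name `build_policy_document` is hypothetical, and the wildcard `Resource` is an assumption that you may want to narrow.

```python
import json

# The three permissions listed in the table above.
ELASTICACHE_ACTIONS = [
    "elasticache:DescribeCacheClusters",
    "elasticache:ListTagsForResource",
    "elasticache:DescribeEvents",
]


def build_policy_document(actions):
    """Return an IAM policy document that allows the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: all resources; restrict this if your setup requires it.
                "Resource": "*",
            }
        ],
    }


policy_json = json.dumps(build_policy_document(ELASTICACHE_ACTIONS), indent=2)
print(policy_json)
```

The printed JSON can be merged into the existing Datadog IAM policy rather than attached as a separate policy, depending on how your account is organized.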
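Install the Datadog - AWS ElastiCache integration.
Datadog collects metrics from CloudWatch through the native ElastiCache integration, and it can also collect additional native metrics directly from the backend technology, Redis or Memcached. Collecting directly from the backend gives you access to more of the important metrics, at a higher resolution.
Agent metrics are tied to the EC2 instance where the Agent runs, not to the actual ElastiCache instance. To connect all of the metrics together, you therefore need to use the cacheclusterid tag. Once the Agent is configured with the same tags as the ElastiCache instance, combining Redis/Memcached metrics with ElastiCache metrics is straightforward.
Because the Agent runs on a remote machine rather than on the actual ElastiCache instance, the key to setting up this integration correctly is telling the Agent where to collect the metrics from.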
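First, go to the AWS console, open the ElastiCache section, and select the Cache Clusters tab to find the cluster you want to monitor.
Click the "Nodes" link to access the endpoint URL.
Note the endpoint URL (for example, replica-001.xxxx.use1.cache.amazonaws.com) and the cacheclusterid (for example, replica-001). You need these values to configure the Agent and to create graphs and dashboards.
The Redis/Memcached integrations support tagging individual cache instances. These tags were originally designed for monitoring multiple instances on the same machine, but you can also use them to filter and group metrics. Below is an example configuration for Redis with ElastiCache using redisdb.yaml. To find where this file lives on your platform, see the Agent configuration directory.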

    init_config:

    instances:
      # Endpoint URL from the AWS console
      - host: replica-001.xxxx.use1.cache.amazonaws.com
        port: 6379
        # Cache cluster ID from the AWS console
        tags:
          - cacheclusterid:replica-001

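
When several clusters need the same treatment, the instance entries can be generated rather than written by hand. The sketch below is illustrative only: `render_redis_instance` and `render_redisdb_yaml` are hypothetical helpers, the output uses plain string formatting instead of a YAML library, and the default port 6379 is taken from the example above.

```python
def render_redis_instance(host, cluster_id, port=6379):
    """Render one entry of the `instances` list as YAML text."""
    return "\n".join([
        f"  - host: {host}",
        f"    port: {port}",
        "    tags:",
        f"      - cacheclusterid:{cluster_id}",
    ])


def render_redisdb_yaml(clusters):
    """Render a full config from (endpoint_host, cluster_id) pairs."""
    entries = [render_redis_instance(host, cid) for host, cid in clusters]
    return "init_config:\n\ninstances:\n" + "\n".join(entries) + "\n"


# Endpoint URL and cache cluster ID as noted from the AWS console.
config_text = render_redisdb_yaml([
    ("replica-001.xxxx.use1.cache.amazonaws.com", "replica-001"),
])
print(config_text)
```

Each pair produces one instance whose cacheclusterid tag matches the ElastiCache cluster, which is what lets the Agent metrics line up with the CloudWatch metrics.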
Restart the Agent (on Linux):

    sudo /etc/init.d/datadog-agent restart

After a few minutes, the ElastiCache and Redis/Memcached metrics are available in Datadog for graphing, monitoring, and so on.
For example, you can set up a graph that combines the cache hit metrics from ElastiCache with the native latency metrics from Redis by scoping both to the same cacheclusterid tag, replica-001.
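
The two series on such a graph can be expressed as metric queries scoped to the same tag. The following is a rough sketch: `metric_query` is a hypothetical helper, and it assumes the Agent's Redis check reports latency as `redis.info.latency_ms`; substitute whichever metric names appear in your account.

```python
def metric_query(metric, cluster_id, aggregator="avg"):
    """Build a query string scoped to one cacheclusterid tag value."""
    return f"{aggregator}:{metric}{{cacheclusterid:{cluster_id}}}"


cluster_id = "replica-001"

# Cache hits reported by CloudWatch through the ElastiCache integration.
hits_query = metric_query("aws.elasticache.cache_hits", cluster_id, aggregator="sum")

# Native latency reported by the Agent's Redis check (assumed metric name).
latency_query = metric_query("redis.info.latency_ms", cluster_id)

print(hits_query)
print(latency_query)
```

Because both queries filter on the same tag value, the graph shows CloudWatch and Agent data for the same cluster side by side.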
aws.elasticache.active_defrag_hits (gauge) | Redis - The number of value reallocations per minute performed by the active defragmentation process. |
aws.elasticache.bytes_read_into_memcached (count) | Memcached - The number of bytes that have been read from the network by the cache node. Shown as byte |
aws.elasticache.bytes_used_for_cache_items (gauge) | Memcached - The number of bytes used to store cache items. Shown as byte |
aws.elasticache.bytes_used_for_cache (gauge) | Redis - The total number of bytes allocated by Redis. Shown as byte |
aws.elasticache.bytes_used_for_hash (gauge) | Memcached - The number of bytes currently used by hash tables. Shown as byte |
aws.elasticache.bytes_written_out_from_memcached (count) | Memcached - The number of bytes that have been written to the network by the cache node. Shown as byte |
aws.elasticache.cache_hit_rate (gauge) | Redis - Indicates the usage efficiency of the Redis instance. Shown as percent |
aws.elasticache.cache_hits (count) | Redis - The number of successful key lookups. Shown as hit |
aws.elasticache.cache_misses (count) | Redis - The number of unsuccessful key lookups. Shown as miss |
aws.elasticache.cas_badval (count) | Memcached - The number of CAS (check and set) requests the cache has received where the Cas value did not match the Cas value stored. Shown as request |
aws.elasticache.cas_hits (count) | Memcached - The number of CAS requests the cache has received where the requested key was found and the Cas value matched. Shown as hit |
aws.elasticache.cas_misses (count) | Memcached - The number of CAS requests the cache has received where the key requested was not found. Shown as miss |
aws.elasticache.cluster_count (count) | The number of ElastiCache clusters. |
aws.elasticache.cmd_config_get (count) | Memcached - The cumulative number of config get requests. Shown as get |
aws.elasticache.cmd_config_set (count) | Memcached - The cumulative number of config set requests. Shown as set |
aws.elasticache.cmd_flush (count) | Memcached - The number of flush commands the cache has received. Shown as flush |
aws.elasticache.cmd_get (count) | Memcached - The number of get commands the cache has received. Shown as get |
aws.elasticache.cmd_set (count) | Memcached - The number of set commands the cache has received. Shown as set |
aws.elasticache.cmd_touch (count) | Memcached - The cumulative number of touch requests. Shown as request |
aws.elasticache.cpucredit_balance (gauge) | The number of earned CPU credits that an instance has accrued since it was launched or started. Shown as unit |
aws.elasticache.cpucredit_usage (gauge) | The number of CPU credits spent by the instance for CPU utilization. Shown as unit |
aws.elasticache.cpuutilization (gauge) | The percentage of CPU utilization for the server. Shown as percent |
aws.elasticache.curr_config (gauge) | Memcached - The current number of configurations stored. |
aws.elasticache.curr_connections (gauge) | Redis - The number of client connections, excluding connections from read replicas. Memcached - A count of the number of connections connected to the cache at an instant in time. Shown as connection |
aws.elasticache.curr_items (gauge) | Redis - The number of items in the cache. This is derived from the Redis keyspace statistic, summing all of the keys in the entire keyspace. Memcached - A count of the number of items currently stored in the cache. Shown as item |
aws.elasticache.database_memory_usage_percentage (gauge) | Redis - The percentage of the memory available for the cluster that is in use. Shown as percent |
aws.elasticache.db_0average_ttl (gauge) | Redis - Exposes avg_ttl of DB0 from the keyspace statistic of the Redis INFO command. Shown as millisecond |
aws.elasticache.decr_hits (count) | Memcached - The number of decrement requests the cache has received where the requested key was found. Shown as hit |
aws.elasticache.decr_misses (count) | Memcached - The number of decrement requests the cache has received where the requested key was not found. Shown as miss |
aws.elasticache.delete_hits (count) | Memcached - The number of delete requests the cache has received where the requested key was found. Shown as hit |
aws.elasticache.delete_misses (count) | Memcached - The number of delete requests the cache has received where the requested key was not found. Shown as miss |
aws.elasticache.engine_cpuutilization (gauge) | The percentage of CPU utilization for the Redis process. Shown as percent |
aws.elasticache.eval_based_cmds (count) | Redis - The total number of commands for eval-based commands. Shown as command |
aws.elasticache.eval_based_cmds_latency (gauge) | Redis - The latency of eval-based commands. Shown as microsecond |
aws.elasticache.evicted_unfetched (count) | Memcached - The number of valid items evicted from the least recently used cache (LRU) which were never touched after being set. Shown as item |
aws.elasticache.evictions (count) | Redis - The number of keys that have been evicted due to the maxmemory limit. Memcached - The number of non-expired items the cache evicted to allow space for new writes. Shown as eviction |
aws.elasticache.expired_unfetched (count) | Memcached - The number of expired items reclaimed from the LRU which were never touched after being set. Shown as item |
aws.elasticache.freeable_memory (gauge) | The amount of free memory available on the host. Shown as byte |
aws.elasticache.geo_spatial_based_cmds (count) | Redis - The total number of geo spatial based commands. Shown as command |
aws.elasticache.get_hits (count) | Memcached - The number of get requests the cache has received where the key requested was found. Shown as hit |
aws.elasticache.get_misses (count) | Memcached - The number of get requests the cache has received where the key requested was not found. Shown as miss |
aws.elasticache.get_type_cmds_latency (gauge) | Redis - The latency of read commands. Shown as microsecond |
aws.elasticache.get_type_cmds (count) | Redis - The total number of get types of commands. This is derived from the Redis commandstats statistic by summing all of the get types of commands (get, mget, hget, etc.). Shown as command |
aws.elasticache.hash_based_cmds_latency (gauge) | Redis - The latency of hash-based commands. Shown as microsecond |
aws.elasticache.hash_based_cmds (count) | Redis - The total number of commands that are hash-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more hashes. Shown as command |
aws.elasticache.hyper_log_log_based_cmds (count) | Redis - The total number of HyperLogLog based commands. This is derived from the Redis commandstats statistic by summing all of the pf type of commands (pfadd, pfcount, pfmerge). Shown as command |
aws.elasticache.incr_hits (count) | Memcached - The number of increment requests the cache has received where the key requested was found. Shown as hit |
aws.elasticache.incr_misses (count) | Memcached - The number of increment requests the cache has received where the key requested was not found. Shown as miss |
aws.elasticache.is_master (gauge) | Redis - Returns 1 if the node is master, 0 otherwise. |
aws.elasticache.key_based_cmds_latency (gauge) | Redis - The latency of key-based commands. Shown as microsecond |
aws.elasticache.key_based_cmds (count) | Redis - The total number of commands that are key-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more keys. Shown as command |
aws.elasticache.list_based_cmds (count) | Redis - The total number of commands that are list-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more lists. Shown as command |
aws.elasticache.master_link_health_status (gauge) | Redis - A value of 0 indicates that data in the ElastiCache primary node is not in sync with Redis on EC2. A value of 1 indicates that the data is in sync. |
aws.elasticache.memory_fragmentation_ratio (gauge) | Redis - Indicates the efficiency in the allocation of memory of the Redis engine. |
aws.elasticache.network_bytes_in (count) | The number of bytes the host has read from the network. Shown as byte |
aws.elasticache.network_bytes_out (count) | The number of bytes the host has written to the network. Shown as byte |
aws.elasticache.network_packets_in (count) | The number of packets received on all network interfaces by the instance. Shown as packet |
aws.elasticache.network_packets_out (count) | The number of packets sent out on all network interfaces by the instance. Shown as packet |
aws.elasticache.new_connections (count) | Redis - The total number of connections that have been accepted by the server during this period. Memcached - The number of new connections the cache has received. This is derived from the memcached total_connections statistic by recording the change in total_connections across a period of time. This is always at least 1, due to a connection reserved for ElastiCache. Shown as connection |
aws.elasticache.new_items (count) | Memcached - The number of new items the cache has stored. This is derived from the memcached total_items statistic by recording the change in total_items across a period of time. Shown as item |
aws.elasticache.node_count (count) | The number of ElastiCache nodes. Shown as node |
aws.elasticache.reclaimed (count) | Redis - The total number of key expiration events. Memcached - The number of expired items the cache evicted to allow space for new writes. |
aws.elasticache.replication_bytes (gauge) | Redis - For primaries with attached replicas, ReplicationBytes reports the number of bytes that the primary is sending to all of its replicas. This metric is representative of the write load on the replication group. For replicas and standalone primaries, ReplicationBytes is always 0. Shown as byte |
aws.elasticache.replication_lag (gauge) | Redis - This metric is only applicable for a cache node running as a read replica. It represents how far behind, in seconds, the replica is in applying changes from the primary cache cluster. Shown as second |
aws.elasticache.save_in_progress (gauge) | Redis - This binary metric returns 1 whenever a background save (forked or forkless) is in progress, and 0 otherwise. A background save process is typically used during snapshots and syncs. These operations can cause degraded performance. Using the SaveInProgress metric, you can diagnose whether or not degraded performance was caused by a background save process. |
aws.elasticache.set_based_cmds_latency (gauge) | Redis - The latency of set-based commands. Shown as microsecond |
aws.elasticache.set_based_cmds (count) | Redis - The total number of commands that are set-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more sets. Shown as command |
aws.elasticache.set_type_cmds_latency (gauge) | Redis - The latency of write commands. Shown as microsecond |
aws.elasticache.set_type_cmds (count) | Redis - The total number of set types of commands. This is derived from the Redis commandstats statistic by summing all of the set types of commands (set, hset, etc.). Shown as command |
aws.elasticache.slabs_moved (count) | Memcached - The total number of slab pages that have been moved. Shown as page |
aws.elasticache.sorted_set_based_cmds_latency (gauge) | Redis - The latency of sorted set-based commands. Shown as microsecond |
aws.elasticache.sorted_set_based_cmds (count) | Redis - The total number of commands that are sorted set-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more sorted sets. Shown as command |
aws.elasticache.stream_based_cmds (count) | Redis - The total number of commands that are stream-based. Shown as command |
aws.elasticache.stream_based_cmds_latency (gauge) | Redis - The latency of stream-based commands. Shown as microsecond |
aws.elasticache.string_based_cmds_latency (gauge) | Redis - The latency of string-based commands. Shown as microsecond |
aws.elasticache.string_based_cmds (count) | Redis - The total number of commands that are string-based. This is derived from the Redis commandstats statistic by summing all of the commands that act upon one or more strings. Shown as command |
aws.elasticache.swap_usage (gauge) | The amount of swap used on the host. Shown as byte |
aws.elasticache.touch_hits (count) | Memcached - The number of keys that have been touched and were given a new expiration time. Shown as hit |
aws.elasticache.touch_misses (count) | Memcached - The number of items that have been touched, but were not found. Shown as miss |
aws.elasticache.unused_memory (gauge) | Memcached - The amount of unused memory the cache can use to store items. This is derived from the memcached statistics limit_maxbytes and bytes by subtracting bytes from limit_maxbytes. Shown as byte |
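
The hit and miss counters in the table can be combined into a hit ratio over any interval. This is a minimal sketch with a hypothetical helper, `hit_ratio`; it assumes you have already fetched the two counts for the same time window (for example, aws.elasticache.cache_hits and aws.elasticache.cache_misses for Redis, or aws.elasticache.get_hits and aws.elasticache.get_misses for Memcached).

```python
def hit_ratio(hits, misses):
    """Return hits / (hits + misses), or None when there were no lookups."""
    total = hits + misses
    if total == 0:
        # No lookups in the window, so the ratio is undefined.
        return None
    return hits / total


# Illustrative counts only, not measured values.
print(hit_ratio(3, 1))
print(hit_ratio(0, 0))
```

Returning None for an empty window avoids a division by zero and keeps an idle cache from being reported as a 0% hit ratio.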
Each metric retrieved from AWS is assigned the same tags that appear in the AWS console, such as host name and security groups.
The AWS ElastiCache integration includes events for clusters, cache security groups, and cache parameter groups.
The AWS ElastiCache integration does not include any service checks.
Need help? Contact the Datadog support team.