Amazon RDS

RDS Dashboard

Overview

Amazon Relational Database Service (RDS) is a web service used to setup, operate, and scale a relational database in the cloud. Enable this integration to see all your RDS metrics in Datadog.

Note: Ensure the environment variable DD_SITE is set to your region outside of the code, , or set the variable in the code as follows:

DD_SITE = os.getenv("DD_SITE", default="")

There are three options for monitoring RDS instances: Standard, Enhanced, and Native. Review the full list of metrics before choosing a configuration as each metric is labeled with its corresponding configuration. In addition, review the information below to learn more about each configuration’s requirements and preset dashboard:

The standard integration requires enabling RDS under the Metric Collection tab of the AWS integration page. This allows you to receive metrics about your instance as often as your CloudWatch integration allows. All RDS Engine types are supported.

The preset dashboard for this integration includes the following metric information: connection, replication lag, read operations and latency, computer, RAM, write operations and latency, and disk metrics.

The enhanced integration requires additional configuration and is available for MySQL, Aurora, MariaDB, SQL Server, Oracle, and PostgreSQL engines. Additional metrics are available but an AWS Lambda is required to submit the metrics to Datadog. The higher granularity and additional required services may result in additional AWS charges.

The preset dashboard for this integration includes the following metric information: loads, uptime, CPU Utilization, tasks, memory, SWAP, network receive, network transmit, CPU used per process, memory used per process, disk ops, filesystem used (pct), tasks running, and system CPU utilization.

The native database integration is optional and available for MySQL, Aurora, MariaDB, SQL Server, and PostgreSQL engine types. To get the metrics from RDS and the ones from the native integration to match up, use the dbinstanceidentifier tag on the native integration based on the identifier you assign to the RDS instance. The RDS instances are automatically assigned the tag.

There are 3 preset dashboards available for this configuration: MySQL, Aurora, and PostgreSQL. Each dashboard includes the following metric information: query volume, disk I/O, connections, replication, and AWS resource metrics.

Note: These dashboard display metrics both from AWS CloudWatch and from the individual database engine itself. Enable one of the integrations, MySQL, Aurora, or PostgreSQL, for all integration metrics.

Setup

Installation

For the standard RDS integration, set up the Amazon Web Services integration first.

Enable enhanced monitoring for your RDS instance during instance creation or afterwards by choosing Modify under Instance Actions. It is recommended to choose 15 for Monitoring Granularity.

The following instructions use KMS and the Lambda Management Console to create an encrypted version of your Datadog API key which can only be used with the RDS Enhanced Monitoring Lambda function. If you already have an encrypted API key from another Lambda such as the Log Forwarder, see the Lambda function’s README for other options.

Create your KMS key

  1. Open the KMS home at https://console.aws.amazon.com/kms/home.
  2. Go into Customer managed keys.
  3. Choose Create Key.
  4. Enter an Alias for the key, such as lambda-datadog-key. Note: An alias cannot begin with aws. Aliases that begin with aws are reserved by Amazon Web Services to represent AWS-managed CMKs in your account.
  5. Add the appropriate administrators to determine who can administer this key.
  6. No roles need to be added.
  7. Save your KMS key.

Create your Lambda function

  1. From the Lambda Management Console, create a new Lambda Function. Your Lambda function must be in the same region as the KMS key you created.
  2. Choose Serverless Application Repository, search for and select Datadog-RDS-Enhanced
  3. Give the application a unique name.
  4. Paste the Id of the key created in the previous section in the KMSKeyId parameter and deploy.
  5. Once the application is deployed, open up the newly created Lambda Function (click on the function under “Resource”).
  6. Click on the Configuration tab and go to the Environment variables section. For the environment variable kmsEncryptedKeys, add in your Datadog API key in the full JSON format in the value field like follows: {"api_key":"<YOUR_API_KEY>"}.
  7. Open the Encryption configuration section and select Enable helpers for encryption in transit.
  8. In the KMS key to encrypt at rest section, select Use a customer master key and enter the same KMS key created earlier.
  9. Press the Encrypt button next to the JSON blob you just entered and in the popup, choose the same KMS key created earlier as well.
  10. Press Save.
  11. Create a new trigger using the RDSOSMetrics CloudWatch log group as the source.
  12. Give the filter a name and optional filter pattern and press Save.

When clicking on test button for your lambda function you might get this error:

{
    "stackTrace": [
        [
            "/var/task/lambda_function.py",
            109,
            "lambda_handler",
            "event = json.loads(gzip.GzipFile(fileobj=StringIO(event['awslogs']['data'].decode('base64'))).read())"
        ]
    ],
    "errorType": "KeyError",
    "errorMessage": "'awslogs'"
}

This can be ignored. The Test button doesn’t work with this setup.

  1. Navigate to the AWS Console and open the RDS section to find the instance you want to monitor.
RDS console
  1. Make note of the endpoint URL, for example: mysqlrds.blah.us-east1.rds.amazonaws.com:3306, which is used to configure the Agent. Also make a note of the DB Instance identifier, for example: mysqlrds, which is used to create graphs and dashboards.

Configuration

  1. In the AWS integration page, ensure that RDS is enabled under the Metric Collection tab.

  2. Add the following permissions to your Datadog IAM policy in order to collect Amazon RDS metrics. For more information, see the RDS policies on the AWS website.

    AWS PermissionDescription
    rds:DescribeDBInstancesDescribe RDS instances to add tags.
    rds:ListTagsForResourceAdd custom tags on RDS instances.
    rds:DescribeEventsAdd events related to RDS databases.
  3. Install the Datadog - Amazon RDS integration.

  1. In the AWS integration page, ensure that RDS is enabled under the Metric Collection tab.

  2. Add the following permissions to your Datadog IAM policy in order to collect Amazon RDS metrics. For more information, see the RDS policies on the AWS website.

    AWS PermissionDescription
    rds:DescribeDBInstancesDescribe RDS instances to add tags.
    rds:ListTagsForResourceAdd custom tags on RDS instances.
    rds:DescribeEventsAdd events related to RDS databases.
  3. Install the Datadog - Amazon RDS integration.

Configure an Agent and connect to your RDS instance by editing the appropriate yaml file in your conf.d directory and then restart your Agent:

For RDS Aurora, edit the YAML file of the database flavor you are using.

If you are using MySQL or MariaDB, then edit mysql.yaml:

init_config:

instances:
    # The endpoint URL from the AWS console
    - server: 'mysqlrds.blah.us-east-1.rds.amazonaws.com'
      user: '<USERNAME>'
      pass: '<PASSWORD>'
      port: 3306
      tags:
          - 'dbinstanceidentifier:<INSTANCE_NAME>'

If you are using PostgreSQL, then edit postgres.yaml:

init_config:

instances:
    - host: 'postgresqlrds.blah.us-east-1.rds.amazonaws.com'
      port: 5432
      username: '<USERNAME>'
      password: '<PASSWORD>'
      dbname: '<DB_NAME>'
      tags:
          - 'dbinstanceidentifier:<DB_INSTANCE_NAME>'

If you are using Microsoft SQL Server, then edit sqlserver.yaml:

init_config:

instances:
    - host: 'sqlserverrds.blah.us-east-1.rds.amazonaws.com,1433'
      username: '<USERNAME>'
      password: '<PASSWORD>'
      tags:
          - 'dbinstanceidentifier:<DB_INSTANCE_NAME>'

Validation

Run the Agent’s status subcommand and look for something like this under the Checks section:

Checks
======

[...]

  mysql
  -----
      - instance #0 [OK]
      - Collected 8 metrics & 0 events

Usage

After a few minutes, RDS metrics and metrics from MySQL, Aurora, MariaDB, SQL Server, Oracle, or PostgreSQL are accessible in Datadog from the metrics explorer, dashboards, and alerts. Here’s an example of an Aurora dashboard displaying several metrics from both RDS and the MySQL integration. Metrics from both integrations on the instance quicktestrds are unified using the dbinstanceidentifier tag.

rds aurora dash

Log collection

Enable logging

It is possible to forward MySQL, MariaDB, and Postgres logs to Amazon CloudWatch. Follow instructions in Monitor Amazon Aurora MySQL, Amazon RDS for MySQL and MariaDB logs with Amazon CloudWatch to start sending your RDS logs to CloudWatch.

Send logs to Datadog

  1. If you haven’t already, set up the Datadog log collection AWS Lambda function.
  2. Once the Lambda function is installed, manually add a trigger on the CloudWatch log group that contains your RDS logs. Select the corresponding CloudWatch log group, add a filter name (optional), and add the trigger.

Once done, go in your Datadog Log section to explore your logs.

Data Collected

In addition to the metrics collected from the database engines, you also receive the following RDS metrics.

Metrics

aws.rds.active_transactions
(gauge)
The average rate of current transactions executing on a DB instance. Only available for Aurora MySQL DBs.
Shown as transaction
aws.rds.aurora_binlog_replica_lag
(gauge)
The amount of time a replica DB cluster running on Aurora with MySQL compatibility lags behind the source DB cluster. Only available for Aurora MySQL DBs.
Shown as second
aws.rds.aurora_replica_lag
(gauge)
The average lag when replicating updates from the primary instance. Only available for Aurora DBs.
Shown as millisecond
aws.rds.aurora_replica_lag_maximum
(gauge)
The maximum amount of lag between the primary instance and each Aurora instance in the DB cluster. Only available for Aurora DBs.
Shown as millisecond
aws.rds.aurora_replica_lag_minimum
(gauge)
The minimum amount of lag between the primary instance and each Aurora instance in the DB cluster. Only available for Aurora DBs.
Shown as millisecond
aws.rds.backup_retention_period_storage_used
(gauge)
The amount of backup storage used for storing continuous backups at the current time. Only available for Aurora DBs.
Shown as byte
aws.rds.bin_log_disk_usage
(gauge)
Amount of disk space occupied by binary logs on the master. Only available for non-Aurora DBs.
Shown as byte
aws.rds.blocked_transactions
(count)
The average rate of transactions in the database that are blocked. Only available for Aurora MySQL DBs.
Shown as transaction
aws.rds.buffer_cache_hit_ratio
(gauge)
The percentage of requests that are served by the Buffer cache. Only available for Aurora DBs.
Shown as percent
aws.rds.burst_balance
(gauge)
The percent of General Purpose SSD (gp2) burst-bucket I/O credits available. Only available for non-Aurora DBs.
Shown as percent
aws.rds.commit_latency
(gauge)
The amount of latency for committed transactions. Only available for Aurora DBs.
Shown as millisecond
aws.rds.commit_throughput
(rate)
The average rate of committed transactions. Only available for Aurora DBs.
Shown as transaction
aws.rds.cpucredit_balance
(gauge)
[T2 instances] Number of CPU credits that an instance has accumulated. Available for Aurora DBs.
aws.rds.cpucredit_usage
(gauge)
[T2 instances] Number of CPU credits consumed. Available for Aurora DBs.
aws.rds.cpusurplus_credit_balance
(gauge)
The number of surplus credits that have been spent by an unlimited instance when its CPUCreditBalance value is zero.
aws.rds.cpusurplus_credits_charged
(gauge)
The number of spent surplus credits that are not paid down by earned CPU credits, and which thus incur an additional charge.
aws.rds.cpuutilization
(gauge)
Percentage of CPU utilization. Recommended metric for standard monitoring. Available for Aurora DBs.
Shown as percent
aws.rds.cpuutilization.guest
(gauge)
The percentage of CPU in use by guest programs. (Enhanced)
Shown as percent
aws.rds.cpuutilization.idle
(gauge)
The percentage of CPU that is idle. (Enhanced)
Shown as percent
aws.rds.cpuutilization.irq
(gauge)
The percentage of CPU in use by software interrupts. (Enhanced)
Shown as percent
aws.rds.cpuutilization.kern
(gauge)
The percentage of CPU in use by the kernel. (Enhanced, SQL Server Only)
Shown as percent
aws.rds.cpuutilization.nice
(gauge)
The percentage of CPU in use by programs running at lowest priority. (Enhanced)
Shown as percent
aws.rds.cpuutilization.steal
(gauge)
The percentage of CPU in use by other virtual machines. (Enhanced)
Shown as percent
aws.rds.cpuutilization.system
(gauge)
The percentage of CPU in use by the kernel. (Enhanced)
Shown as percent
aws.rds.cpuutilization.total
(gauge)
The total percentage of the CPU in use. This value excludes the nice value. Recommended metric for enhanced monitoring. (Enhanced)
Shown as percent
aws.rds.cpuutilization.user
(gauge)
The percentage of CPU in use by user programs. (Enhanced)
Shown as percent
aws.rds.cpuutilization.wait
(gauge)
The percentage of CPU unused while waiting for I/O access. (Enhanced)
Shown as percent
aws.rds.database_connections
(gauge)
Number of database connections in use. Available for Aurora DBs.
Shown as connection
aws.rds.dbload
(gauge)
The number of active sessions for the DB engine (Performance Insights must be enabled).
Shown as session
aws.rds.dbload_cpu
(gauge)
The number of active sessions where the wait event type is CPU (Performance Insights must be enabled).
Shown as session
aws.rds.dbload_non_cpu
(gauge)
The number of active sessions where the wait event type is not CPU (Performance Insights must be enabled).
Shown as session
aws.rds.dbload_relative_to_num_vcpus
(gauge)
The ratio of the DB load to the number of virtual CPUs for the database.
Shown as percent
aws.rds.ddllatency
(gauge)
The amount of latency for DDL requests (create/alter/drop). Only available for Aurora MySQL DBs.
Shown as millisecond
aws.rds.ddlthroughput
(rate)
The average rate of DDL requests per second. Only available for Aurora MySQL DBs.
Shown as request
aws.rds.deadlocks
(count)
The average number of deadlocks in the database per second. Only available for Aurora DBs.
Shown as lock
aws.rds.delete_latency
(gauge)
The average latency for delete queries. Only available for Aurora MySQL DBs.
Shown as millisecond
aws.rds.delete_throughput
(rate)
The average rate of delete queries. Only available for Aurora MySQL DBs.
Shown as query
aws.rds.disk_queue_depth
(gauge)
Number of outstanding IOs (read/write requests) waiting to access the disk. Available for Aurora DBs.
Shown as request
aws.rds.diskio.avgQueueLen
(gauge)
The number of requests waiting in the I/O device's queue. This metric is not available for Amazon Aurora. (Enhanced)
Shown as request
aws.rds.diskio.avgReqSz
(gauge)
The average request size. This metric is not available for Amazon Aurora. (Enhanced)
Shown as kibibyte
aws.rds.diskio.await
(gauge)
The number of milliseconds required to respond to requests including queue time and service time. This metric is not available for Amazon Aurora. (Enhanced)
Shown as millisecond
aws.rds.diskio.readIOsPS
(rate)
The rate of read operations. (Enhanced)
Shown as operation
aws.rds.diskio.readKb
(gauge)
The total amount of data read. This metric is not available for Amazon Aurora. (Enhanced)
Shown as kibibyte
aws.rds.diskio.readKbPS
(rate)
The rate that data is read. This metric is not available for Amazon Aurora. (Enhanced)
Shown as kibibyte
aws.rds.diskio.rrqmPS
(rate)
The rate of merged read requests queue. This metric is not available for Amazon Aurora. (Enhanced)
Shown as request
aws.rds.diskio.tps
(rate)
The rate of I/O transactions. This metric is not available for Amazon Aurora. (Enhanced)
Shown as transaction
aws.rds.diskio.util
(gauge)
The percentage of CPU time during which requests were issued. The percentage of CPU time during which requests were issued. (Enhanced)
Shown as percent
aws.rds.diskio.writeIOsPS
(rate)
The rate of write operations. (Enhanced)
Shown as operation
aws.rds.diskio.writeKb
(gauge)
The total amount of data written. This metric is not available for Amazon Aurora. (Enhanced)
Shown as kibibyte
aws.rds.diskio.writeKbPS
(rate)
The rate that data is written. This metric is not available for Amazon Aurora. (Enhanced)
Shown as kibibyte
aws.rds.diskio.wrqmPS
(rate)
The rate of merged write requests queue. This metric is not available for Amazon Aurora. (Enhanced)
Shown as request
aws.rds.dmllatency
(gauge)
The average latency for inserts and updates and deletes. Only available for Aurora MySQL DBs.
Shown as millisecond
aws.rds.dmlthroughput
(rate)
The average rate of inserts and updates and deletes. Only available for Aurora MySQL DBs.
Shown as operation
aws.rds.engine_uptime
(gauge)
The amount of time that the DB instance has been active. Only available for Aurora DBs.
Shown as second
aws.rds.failed_sqlserver_agent_jobs_count
(count)
The number of failed SQL Server Agent jobs during the last minute.
Shown as minute
aws.rds.filesystem.maxFiles
(gauge)
The maximum number of files that can be created for the file system. (Enhanced)
Shown as file
aws.rds.filesystem.total
(gauge)
The total amount of disk space available for the file system. (Enhanced)
Shown as kibibyte
aws.rds.filesystem.used
(gauge)
The amount of disk space used by files in the file system. (Enhanced)
Shown as kibibyte
aws.rds.filesystem.usedFilePercent
(gauge)
The percentage of available files in use. (Enhanced)
Shown as percent
aws.rds.filesystem.usedFiles
(gauge)
The number of files in the file system. (Enhanced)
Shown as file
aws.rds.filesystem.usedPercent
(gauge)
The percentage of the file-system disk space in use. (Enhanced)
Shown as percent
aws.rds.free_local_storage
(gauge)
The amount of local storage that is free on an instance. Only available for Aurora DBs.
Shown as byte
aws.rds.free_storage_space
(gauge)
Amount of available storage space.
Shown as byte
aws.rds.freeable_memory
(gauge)
Amount of available random access memory. Available for Aurora DBs.
Shown as byte
aws.rds.insert_latency
(gauge)
The amount of latency for insert queries. Only available for Aurora MySQL DBs.
Shown as millisecond
aws.rds.insert_throughput
(rate)
The average rate of insert queries. Only available for Aurora MySQL DBs.
Shown as query
aws.rds.load.1
(gauge)
The number of processes requesting CPU time over the last minute. (Enhanced)
Shown as process
aws.rds.load.15
(gauge)
The number of processes requesting CPU time over the last 15 minutes. (Enhanced)
Shown as process
aws.rds.load.5
(gauge)
The number of processes requesting CPU time over the last 5 minutes. (Enhanced)
Shown as process
aws.rds.login_failures
(count)
The average number of failed login attempts per second. Only available for Aurora MySQL DBs.
Shown as operation
aws.rds.maximum_used_transaction_ids
(count)
The maximum transaction ID that has been used. Only available for Aurora PostgreSQL DBs.
aws.rds.memory.active
(gauge)
The amount of assigned memory. (Enhanced)
Shown as kibibyte
aws.rds.memory.buffers
(gauge)
The amount of memory used for buffering I/O requests prior to writing to the storage device. (Enhanced)
Shown as kibibyte
aws.rds.memory.cached
(gauge)
The amount of memory used for caching file system-based I/O. (Enhanced)
Shown as kibibyte
aws.rds.memory.commitLimitKb
(gauge)
The maximum possible value for the commitTotKb metric. This value is the sum of the current pagefile size plus the physical memory available for pageable contents–excluding RAM that is assigned to non-pageable areas. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.commitPeakKb
(gauge)
The largest value of the commitTotKb metric since the operating system was last started. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.commitTotKb
(gauge)
The amount of pagefile-backed virtual address space in use, that is, the current commit charge. This value is composed of main memory (RAM) and disk (pagefiles). (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.dirty
(gauge)
The amount of memory pages in RAM that have been modified but not written to their related data block in storage. (Enhanced)
Shown as kibibyte
aws.rds.memory.free
(gauge)
The amount of unassigned memory. (Enhanced)
Shown as kibibyte
aws.rds.memory.hugePagesFree
(gauge)
The number of free huge pages. (Enhanced)
Shown as page
aws.rds.memory.hugePagesRsvd
(gauge)
The number of committed huge pages. (Enhanced)
Shown as page
aws.rds.memory.hugePagesSize
(gauge)
The size for each huge pages unit. (Enhanced)
Shown as kibibyte
aws.rds.memory.hugePagesSurp
(gauge)
The number of available surplus huge pages over the total. (Enhanced)
Shown as page
aws.rds.memory.hugePagesTotal
(gauge)
The total number of huge pages for the system. (Enhanced)
Shown as page
aws.rds.memory.inactive
(gauge)
The amount of inactive memory (Enhanced)
Shown as kibibyte
aws.rds.memory.kernNonpagedKb
(gauge)
The amount of memory in the non-paged kernel pool. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.kernPagedKb
(gauge)
The amount of memory in the paged kernel pool. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.kernTotKb
(gauge)
The sum of the memory in the paged and non-paged kernel pools. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.mapped
(gauge)
The total amount of file-system contents that is memory mapped inside a process address space. (Enhanced)
Shown as kibibyte
aws.rds.memory.pageSize
(gauge)
The size of a page. (Enhanced, SQL Server Only)
Shown as byte
aws.rds.memory.pageTables
(gauge)
The amount of memory used by page tables. (Enhanced)
Shown as kibibyte
aws.rds.memory.physAvailKb
(gauge)
The amount of available physical memory. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.physTotKb
(gauge)
The amount of physical memory. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.slab
(gauge)
The amount of reusable kernel data structures. (Enhanced)
Shown as kibibyte
aws.rds.memory.sqlServerTotKb
(gauge)
The amount of memory committed to Microsoft SQL Server. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.sysCacheKb
(gauge)
The amount of system cache memory. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.memory.total
(gauge)
The total amount of memory. (Enhanced)
Shown as kibibyte
aws.rds.memory.writeback
(gauge)
The amount of dirty pages in RAM that are still being written to the backing storage. (Enhanced)
Shown as kibibyte
aws.rds.network.rdBytesPS
(gauge)
The number of bytes received per second. (Enhanced, SQL Server Only)
Shown as byte
aws.rds.network.rx
(gauge)
The number of packets received. (Enhanced)
Shown as packet
aws.rds.network.tx
(gauge)
The number of packets uploaded. (Enhanced)
Shown as packet
aws.rds.network.wrBytesPS
(gauge)
The number of bytes sent per second. (Enhanced, SQL Server Only)
Shown as byte
aws.rds.network_receive_throughput
(rate)
Incoming (Receive) network traffic on the DB instance. Available for Aurora DBs.
Shown as byte
aws.rds.network_throughput
(rate)
The rate of network throughput sent and received from clients by each instance in the DB cluster. Only available for Aurora DBs.
Shown as byte
aws.rds.network_transmit_throughput
(rate)
Outgoing (Transmit) network traffic on the DB instance. Available for Aurora DBs.
Shown as byte
aws.rds.oldest_replication_slot_lag
(gauge)
The lagging size of the replica lagging the most in terms of WAL data received. Only available for Aurora PostgreSQL DBs.
Shown as byte
aws.rds.process.cpuUsedPc
(gauge)
The percentage of CPU used by the process. (Enhanced)
Shown as percent
aws.rds.process.memUsedPc
(gauge)
The percentage of total memory used by the process. (Enhanced, SQL Server Only)
Shown as percent
aws.rds.process.memoryUsedPc
(gauge)
The percentage of memory used by the process. (Enhanced)
Shown as percent
aws.rds.process.parentID
(gauge)
The process identifier for the parent proces of the process. (Enhanced)
aws.rds.process.pid
(gauge)
The identifier of the process. This value is not present for processes that are owned by Amazon RDS. (Enhanced, SQL Server Only)
aws.rds.process.ppid
(gauge)
The process identifier for the parent of this process. This value is only present for child processes. (Enhanced, SQL Server Only)
aws.rds.process.rss
(gauge)
The amount of RAM allocated to the process. (Enhanced)
Shown as kibibyte
aws.rds.process.tgid
(gauge)
The thread group identifier which is a number representing the process ID to which a thread belongs. This identifier is used to group threads from the same process. (Enhanced)
aws.rds.process.tid
(gauge)
The thread identifier. This value is only present for threads. The owning process can be identified by using the pid value. (Enhanced, SQL Server Only)
aws.rds.process.virtKb
(gauge)
The amount of virtual address space the process is using. Use of virtual address space does not necessarily imply corresponding use of either disk or main memory pages. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.process.vss
(gauge)
The amount of virtual memory allocated to the process. (Enhanced)
Shown as kibibyte
aws.rds.process.workingSetKb
(gauge)
The amount of memory in the private working set plus the amount of memory that is in use by the process and can be shared with other processes. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.process.workingSetPrivKb
(gauge)
The amount of memory that is in use by a process, but can't be shared with other processes. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.process.workingSetShareableKb
(gauge)
The amount of memory that is in use by a process and can be shared with other processes. (Enhanced, SQL Server Only)
Shown as kibibyte
aws.rds.queries
(rate)
The average rate of queries. Only available for Aurora MySQL DBs.
Shown as query
aws.rds.rdsto_aurora_postgre_sqlreplica_lag
(gauge)
The amount of lag in seconds when replicating updates from the primary RDS PostgreSQL instance to other nodes in the cluster. Only available for Aurora PostgreSQL DBs.
Shown as second
aws.rds.read_iops
(rate)
Average number of disk read I/O operations.
Shown as operation
aws.rds.read_latency
(gauge)
Average amount of time taken per disk read I/O operation.
Shown as second
aws.rds.read_throughput
(rate)
Average number of bytes read from disk.
Shown as byte
aws.rds.replica_lag
(gauge)
Amount of time a Read Replica DB Instance lags behind the source DB Instance.
Shown as second
aws.rds.replication_slot_disk_usage
(gauge)
The disk space used by replication slot files. Available for Aurora PostgreSQL DBs.
Shown as byte
aws.rds.result_set_cache_hit_ratio
(gauge)
The percentage of requests that are served by the Resultset cache. Only available for Aurora MySQL DBs.
Shown as percent
aws.rds.select_latency
(gauge)
The average latency for select queries. Only available for Aurora MySQL DBs.
Shown as millisecond
aws.rds.select_throughput
(rate)
The average rate of select queries. Only available for Aurora MySQL DBs.
Shown as query
aws.rds.snapshot_storage_used
(gauge)
The amount of backup storage used for storing manual snapshots beyond the backup retention period. Only available for Aurora DBs.
Shown as byte
aws.rds.swap.cached
(gauge)
The amount of swap memory used as cache memory. (Enhanced)
Shown as kibibyte
aws.rds.swap.free
(gauge)
The total amount of swap memory free. (Enhanced)
Shown as kibibyte
aws.rds.swap.in
(gauge)
The amount of memory swapped in from disk. (Enhanced)
Shown as kibibyte
aws.rds.swap.out
(gauge)
The amount of memory swapped out from disk. (Enhanced)
Shown as kibibyte
aws.rds.swap.total
(gauge)
The total amount of swap memory available. (Enhanced)
Shown as kibibyte
aws.rds.swap_usage
(gauge)
Amount of swap space used on the DB Instance. Available for Aurora DBs.
Shown as byte
aws.rds.tasks.blocked
(gauge)
The number of tasks that are blocked. (Enhanced)
Shown as task
aws.rds.tasks.running
(gauge)
The number of tasks that are running. (Enhanced)
Shown as task
aws.rds.tasks.sleeping
(gauge)
The number of tasks that are sleeping. (Enhanced)
Shown as task
aws.rds.tasks.stopped
(gauge)
The number of tasks that are stopped. (Enhanced)
Shown as task
aws.rds.tasks.total
(gauge)
The total number of tasks. (Enhanced)
Shown as task
aws.rds.tasks.zombie
(gauge)
The number of child tasks that are inactive with an active parent task. (Enhanced)
Shown as task
aws.rds.total_backup_storage_billed
(gauge)
The sum of BackupRetentionPeriodStorageUsed and SnapshotStorageUsed minus an amount of free backup storage which equals the size of the cluster volume for one day. Only available for Aurora DBs.
Shown as byte
aws.rds.total_storage_space
(gauge)
Total amount of storage available on an instance.
Shown as byte
aws.rds.transaction_logs_disk_usage
(gauge)
Amount of disk space occupied by transaction logs. Only available for Aurora PostgreSQL DBs.
Shown as byte
aws.rds.transaction_logs_generation
(gauge)
The size of transaction logs generated per second.
Shown as byte
aws.rds.update_latency
(gauge)
The average latency for update queries. Only available for Aurora MySQL DBs.
Shown as millisecond
aws.rds.update_throughput
(rate)
The average rate of update queries. Only available for Aurora MySQL DBs.
Shown as query
aws.rds.uptime
(gauge)
RDS instance uptime. (Enhanced)
Shown as second
aws.rds.virtual_cpus
(gauge)
The number of virtual CPUs for the DB instance. (Enhanced)
Shown as cpu
aws.rds.volume_bytes_used
(gauge)
The amount of storage in bytes used by your Aurora database. Only available for Aurora DBs.
Shown as byte
aws.rds.volume_read_iops
(count)
The number of billed read I/O operations from a cluster volume, reported at 5-minute intervals. Only available for Aurora DBs.
Shown as operation
aws.rds.volume_write_iops
(count)
The average number of write disk I/O operations to the cluster volume reported at 5-minute intervals. Only available for Aurora DBs.
Shown as operation
aws.rds.write_iops
(rate)
Average number of disk write I/O operations per second.
Shown as operation
aws.rds.write_latency
(gauge)
Average amount of time taken per disk write I/O operation.
Shown as second
aws.rds.write_throughput
(rate)
Average number of bytes written.
Shown as byte

Each of the metrics retrieved from AWS is assigned the same tags that appear in the AWS console, including but not limited to host name, security-groups, and more.

Events

The Amazon RDS integration includes events related to DB instances, security groups, snapshots, and parameter groups. See example events below:

Amazon RDS Events

Service Checks

aws.rds.read_replica_status Monitors the read replication status. This check returns one of the following statuses:

  • OK - replicating or connecting
  • CRITICAL - error or terminated
  • WARNING - stopped
  • UNKNOWN - other

Out-of-the-box monitoring

The Amazon RDS integration provides ready-to-use monitoring capabilities to monitor and optimize performance.

Troubleshooting

Need help? Contact Datadog support.

Further Reading