Amazon Elasticsearch

Overview

Amazon Elasticsearch Service is a managed service that makes it easy to deploy, operate, and scale Elasticsearch in the AWS Cloud.

Enable this integration to see custom tags and metrics for your ES clusters in Datadog.

Setup

Installation

If you haven’t already, set up the Amazon Web Services integration first.

Metric collection

  1. In the AWS integration tile, ensure that ES is checked under metric collection.

  2. Add the following permissions to your Datadog IAM policy to collect Amazon ES metrics:

    • es:ListTags: Adds custom ES domain tags to ES metrics.
    • es:ListDomainNames: Lists all Amazon ES domains owned by the current user in the active region.
    • es:DescribeElasticsearchDomains: Collects the domain ID, domain service endpoint, and domain ARN for all domains as tags.

    For more information on ES policies, review the documentation on the AWS website.

  3. Install the Datadog - AWS ES integration.
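The three permissions listed in step 2 can be expressed as an IAM policy document. The following is a minimal sketch that builds such a document and prints it as JSON. The function name `build_es_policy` is hypothetical, and the wildcard `Resource` is an assumption; narrow it if your account scopes access to specific domains.

```python
import json

# The three actions named in step 2 of the metric collection setup.
ES_ACTIONS = [
    "es:ListTags",
    "es:ListDomainNames",
    "es:DescribeElasticsearchDomains",
]


def build_es_policy(actions=ES_ACTIONS):
    """Return an IAM policy document (as a dict) allowing the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                # Assumption: not scoped to individual ES domains.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    # Print the policy so it can be reviewed before being merged into
    # the existing Datadog IAM policy.
    print(json.dumps(build_es_policy(), indent=2))
```

The printed statement is meant to be merged into the policy already attached to your Datadog role, not to replace it.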

Log collection

Enable logging

Configure Amazon Elasticsearch to send logs to either an S3 bucket or CloudWatch.

Note: If you log to an S3 bucket, make sure that amazon_elasticsearch is set as the target prefix.

Send logs to Datadog

  1. If you haven’t already, set up the Datadog log collection AWS Lambda function.
  2. Once the Lambda function is installed, manually add a trigger in the AWS console on the S3 bucket or CloudWatch log group that contains your Amazon Elasticsearch logs.
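For the S3 case, the trigger added in the console corresponds to a bucket notification configuration. The sketch below only builds that configuration as a dictionary, under some assumptions: the helper name `build_s3_trigger_config` and the example ARN are hypothetical, and the `s3:ObjectCreated:*` event type is an assumption about which events should invoke the function. The prefix filter uses the amazon_elasticsearch target prefix from the note above.

```python
def build_s3_trigger_config(lambda_arn, prefix="amazon_elasticsearch"):
    """Build an S3 notification configuration that invokes a Lambda
    function for new objects whose keys start with the given prefix."""
    return {
        "LambdaFunctionConfigurations": [
            {
                "LambdaFunctionArn": lambda_arn,
                # Assumption: trigger on every object-created event.
                "Events": ["s3:ObjectCreated:*"],
                "Filter": {
                    "Key": {
                        "FilterRules": [
                            {"Name": "prefix", "Value": prefix}
                        ]
                    }
                },
            }
        ]
    }


if __name__ == "__main__":
    # Hypothetical ARN, for illustration only.
    example_arn = "arn:aws:lambda:us-east-1:123456789012:function:example"
    print(build_s3_trigger_config(example_arn))
```

A dictionary of this shape is what the boto3 S3 client accepts as the `NotificationConfiguration` argument of `put_bucket_notification_configuration`. That call replaces the bucket's existing notification configuration, so review any triggers already in place before applying it.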

Data Collected

Metrics

aws.es.2xx
(count)
The number of requests to a domain with the HTTP response code 2xx.
Shown as request
aws.es.2xx.average
(gauge)
The average number of requests to a domain with the HTTP response code 2xx.
Shown as request
aws.es.3xx
(count)
The number of requests to a domain with the HTTP response code 3xx.
Shown as request
aws.es.3xx.average
(gauge)
The average number of requests to a domain with the HTTP response code 3xx.
Shown as request
aws.es.4xx
(count)
The number of requests to a domain with the HTTP response code 4xx.
Shown as request
aws.es.4xx.average
(gauge)
The average number of requests to a domain with the HTTP response code 4xx.
Shown as request
aws.es.5xx
(count)
The number of requests to a domain with the HTTP response code 5xx.
Shown as request
aws.es.5xx.average
(gauge)
The average number of requests to a domain with the HTTP response code 5xx.
Shown as request
aws.es.alerting_degraded
(gauge)
Indicates whether the ES alerting service is degraded. A value of 0 means 'No'. A value of 1 means 'Yes'.
aws.es.automated_snapshot_failure
(gauge)
The number of failed automated snapshots for the cluster.
Shown as error
aws.es.automated_snapshot_failure.maximum
(gauge)
The maximum number of failed automated snapshots for the cluster.
Shown as error
aws.es.automated_snapshot_failure.minimum
(gauge)
The minimum number of failed automated snapshots for the cluster.
Shown as error
aws.es.cluster_index_writes_blocked
(gauge)
Indicates whether your cluster is accepting or blocking incoming write requests. A value of 0 means that the cluster is accepting requests. A value of 1 means that it is blocking requests.
aws.es.cluster_statusgreen
(gauge)
Indicates whether all index shards are allocated to nodes in the cluster.
aws.es.cluster_statusgreen.maximum
(gauge)
The maximum value of the indicator for whether all index shards are allocated to nodes in the cluster.
aws.es.cluster_statusgreen.minimum
(gauge)
The minimum value of the indicator for whether all index shards are allocated to nodes in the cluster.
aws.es.cluster_statusred
(gauge)
Indicates whether both primary and replica shards of at least one index are not allocated to nodes in a cluster.
aws.es.cluster_statusred.maximum
(gauge)
Indicates maximum of whether both primary and replica shards of at least one index are not allocated to nodes in a cluster.
aws.es.cluster_statusred.minimum
(gauge)
Indicates minimum of whether both primary and replica shards of at least one index are not allocated to nodes in a cluster.
aws.es.cluster_statusyellow
(gauge)
Indicates whether replica shards are not allocated to nodes in a cluster.
aws.es.cluster_statusyellow.maximum
(gauge)
Indicates the maximum of whether replica shards are not allocated to nodes in a cluster.
aws.es.cluster_statusyellow.minimum
(gauge)
Indicates the minimum of whether replica shards are not allocated to nodes in a cluster.
aws.es.cluster_used_space
(gauge)
The total used space, in GiB, for the cluster.
Shown as gibibyte
aws.es.cluster_used_space.maximum
(gauge)
The maximum used space, in GiB, for the cluster.
Shown as gibibyte
aws.es.cluster_used_space.minimum
(gauge)
The minimum used space, in GiB, for the cluster.
Shown as gibibyte
aws.es.cpucredit_balance
(gauge)
The remaining CPU credits available for data nodes in the cluster.
aws.es.cpuutilization
(gauge)
The average percentage of CPU resources used across all nodes in the cluster.
Shown as percent
aws.es.cpuutilization.maximum
(gauge)
The maximum percentage of CPU resources used by any node in the cluster.
Shown as percent
aws.es.cpuutilization.minimum
(gauge)
The minimum percentage of CPU resources used by any node in the cluster.
Shown as percent
aws.es.deleted_documents
(gauge)
The total number of documents marked for deletion across all indices in the cluster.
Shown as document
aws.es.deleted_documents.maximum
(gauge)
The maximum number of documents marked for deletion across all indices in the cluster.
Shown as document
aws.es.deleted_documents.minimum
(gauge)
The minimum number of documents marked for deletion across all indices in the cluster.
Shown as document
aws.es.disk_queue_depth
(gauge)
The average number of pending input and output (I/O) requests for an EBS volume across all nodes in the cluster.
Shown as request
aws.es.disk_queue_depth.maximum
(gauge)
The maximum number for any node in the cluster of pending input and output (I/O) requests for an EBS volume.
Shown as request
aws.es.disk_queue_depth.minimum
(gauge)
The minimum number for any node in the cluster of pending input and output (I/O) requests for an EBS volume.
Shown as request
aws.es.elasticsearch_requests
(count)
The number of requests made to the Elasticsearch cluster.
Shown as request
aws.es.elasticsearch_requests.average
(gauge)
The average number of requests made to the Elasticsearch cluster.
Shown as request
aws.es.free_storage_space
(gauge)
The average free space, in megabytes, across all the data nodes in a cluster.
Shown as mebibyte
aws.es.free_storage_space.maximum
(gauge)
The free space, in megabytes, for the single data node with the most available free space in a cluster.
Shown as mebibyte
aws.es.free_storage_space.minimum
(gauge)
The free space, in megabytes, for the single data node with the least available free space in a cluster.
Shown as mebibyte
aws.es.free_storage_space.sum
(gauge)
The free space, in megabytes, for all data nodes in the cluster.
Shown as mebibyte
aws.es.indexing_latency
(gauge)
The average time, in milliseconds, that it takes a shard to complete an indexing operation.
Shown as millisecond
aws.es.indexing_rate
(count)
The number of indexing operations per minute.
Shown as operation
aws.es.invalid_host_header_requests
(count)
The number of HTTP requests made to the Elasticsearch cluster that included an invalid (or missing) host header.
Shown as request
aws.es.invalid_host_header_requests.average
(gauge)
The average number of HTTP requests made to the Elasticsearch cluster that included an invalid (or missing) host header.
Shown as request
aws.es.jvmgcold_collection_count
(count)
The number of times that 'old generation' garbage collection has run. In a cluster with sufficient resources, this number should remain small and grow infrequently.
Shown as garbage collection
aws.es.jvmgcold_collection_time
(gauge)
The amount of time, in milliseconds, that the cluster has spent performing 'old generation' garbage collection.
Shown as millisecond
aws.es.jvmgcyoung_collection_count
(count)
The number of times that 'young generation' garbage collection has run. A large, ever-growing number of runs is a normal part of cluster operations.
Shown as garbage collection
aws.es.jvmgcyoung_collection_time
(gauge)
The amount of time, in milliseconds, that the cluster has spent performing 'young generation' garbage collection.
Shown as millisecond
aws.es.jvmmemory_pressure
(gauge)
The average percentage of the Java heap used for all data nodes in the cluster.
Shown as percent
aws.es.jvmmemory_pressure.maximum
(gauge)
The maximum percentage of the Java heap used by any data node in the cluster.
Shown as percent
aws.es.jvmmemory_pressure.minimum
(gauge)
The minimum percentage of the Java heap used by any data node in the cluster.
Shown as percent
aws.es.kibana_healthy_nodes
(gauge)
A health check for Kibana. A value of 1 indicates normal behavior. A value of 0 indicates that Kibana is inaccessible.
aws.es.master_cpucredit_balance
(gauge)
The remaining CPU credits available for dedicated master nodes in the cluster.
aws.es.master_cpuutilization
(gauge)
The maximum percentage of CPU resources used by the dedicated master nodes.
Shown as percent
aws.es.master_free_storage_space
(gauge)
This metric is not relevant and can be ignored. The service does not use master nodes as data nodes.
Shown as mebibyte
aws.es.master_jvmmemory_pressure
(gauge)
The maximum percentage of the Java heap used for all dedicated master nodes in the cluster.
Shown as percent
aws.es.master_reachable_from_node
(gauge)
A health check for MasterNotDiscovered exceptions. A value of 1 indicates normal behavior. A value of 0 indicates that /_cluster/health/ is failing.
aws.es.master_sys_memory_utilization
(gauge)
The percentage of the instance's memory that is in use.
Shown as percent
aws.es.nodes
(gauge)
The number of nodes in the Amazon ES cluster.
Shown as node
aws.es.nodes.maximum
(gauge)
The maximum number of nodes in the Amazon ES cluster.
Shown as node
aws.es.nodes.minimum
(gauge)
The minimum number of nodes in the Amazon ES cluster.
Shown as node
aws.es.read_iops
(gauge)
The number of input and output (I/O) operations per second for read operations on EBS volumes.
Shown as operation
aws.es.read_iops.maximum
(gauge)
The maximum number for any node of input and output (I/O) operations per second for read operations on EBS volumes.
Shown as operation
aws.es.read_iops.minimum
(gauge)
The minimum number for any node of input and output (I/O) operations per second for read operations on EBS volumes.
Shown as operation
aws.es.read_latency
(gauge)
The latency, in seconds, for read operations on EBS volumes.
Shown as second
aws.es.read_latency.maximum
(gauge)
The maximum latency for any node, in seconds, for read operations on EBS volumes.
Shown as second
aws.es.read_latency.minimum
(gauge)
The minimum latency for any node, in seconds, for read operations on EBS volumes.
Shown as second
aws.es.read_throughput
(gauge)
The throughput, in bytes per second, for read operations on EBS volumes.
Shown as byte
aws.es.read_throughput.maximum
(gauge)
The maximum throughput for any node, in bytes per second, for read operations on EBS volumes.
Shown as byte
aws.es.read_throughput.minimum
(gauge)
The minimum throughput for any node, in bytes per second, for read operations on EBS volumes.
Shown as byte
aws.es.search_latency
(gauge)
The average time, in milliseconds, that it takes a shard to complete a search operation.
Shown as millisecond
aws.es.search_rate
(count)
The total number of search requests per minute for all shards on a node.
Shown as request
aws.es.searchable_documents
(gauge)
The total number of searchable documents across all indices in the cluster.
Shown as document
aws.es.searchable_documents.maximum
(gauge)
The maximum number of searchable documents across all indices in the cluster.
Shown as document
aws.es.searchable_documents.minimum
(gauge)
The minimum number of searchable documents across all indices in the cluster.
Shown as document
aws.es.sys_memory_utilization
(gauge)
The percentage of the instance's memory that is in use.
Shown as percent
aws.es.sys_memory_utilization.maximum
(gauge)
The maximum percentage of the instance's memory that is in use.
Shown as percent
aws.es.sys_memory_utilization.minimum
(gauge)
The minimum percentage of the instance's memory that is in use.
Shown as percent
aws.es.threadpool_bulk_queue
(count)
The number of queued tasks in the bulk thread pool.
Shown as task
aws.es.threadpool_bulk_rejected
(count)
The number of rejected tasks in the bulk thread pool.
Shown as task
aws.es.threadpool_bulk_threads
(gauge)
The size of the bulk thread pool.
aws.es.threadpool_forcemerge_queue
(count)
The number of queued tasks in the force merge thread pool.
Shown as task
aws.es.threadpool_forcemerge_rejected
(count)
The number of rejected tasks in the force merge thread pool.
Shown as task
aws.es.threadpool_forcemerge_threads
(gauge)
The size of the force merge thread pool.
aws.es.threadpool_index_queue
(count)
The number of queued tasks in the index thread pool.
Shown as task
aws.es.threadpool_index_rejected
(count)
The number of rejected tasks in the index thread pool.
Shown as task
aws.es.threadpool_index_threads
(gauge)
The size of the index thread pool.
aws.es.threadpool_merge_queue
(count)
The number of queued tasks in the merge thread pool.
Shown as task
aws.es.threadpool_merge_rejected
(count)
The number of rejected tasks in the merge thread pool.
Shown as task
aws.es.threadpool_merge_threads
(gauge)
The size of the merge thread pool.
aws.es.threadpool_search_queue
(count)
The number of queued tasks in the search thread pool.
Shown as task
aws.es.threadpool_search_rejected
(count)
The number of rejected tasks in the search thread pool.
Shown as task
aws.es.threadpool_search_threads
(gauge)
The size of the search thread pool.
aws.es.threadpool_write_queue
(count)
The number of queued tasks in the write thread pool.
Shown as task
aws.es.threadpool_write_rejected
(count)
The number of rejected tasks in the write thread pool.
Shown as task
aws.es.threadpool_write_threads
(gauge)
The size of the write thread pool.
aws.es.write_iops
(gauge)
The number of input and output (I/O) operations per second for write operations on EBS volumes.
Shown as operation
aws.es.write_iops.maximum
(gauge)
The maximum number for any node of input and output (I/O) operations per second for write operations on EBS volumes.
Shown as operation
aws.es.write_iops.minimum
(gauge)
The minimum number for any node of input and output (I/O) operations per second for write operations on EBS volumes.
Shown as operation
aws.es.write_latency
(gauge)
The latency, in seconds, for write operations on EBS volumes.
Shown as second
aws.es.write_latency.maximum
(gauge)
The maximum latency for any node, in seconds, for write operations on EBS volumes.
Shown as second
aws.es.write_latency.minimum
(gauge)
The minimum latency for any node, in seconds, for write operations on EBS volumes.
Shown as second
aws.es.write_throughput
(gauge)
The throughput, in bytes per second, for write operations on EBS volumes.
Shown as byte
aws.es.write_throughput.maximum
(gauge)
The maximum throughput for any node, in bytes per second, for write operations on EBS volumes.
Shown as byte
aws.es.write_throughput.minimum
(gauge)
The minimum throughput for any node, in bytes per second, for write operations on EBS volumes.
Shown as byte

Each metric retrieved from AWS is assigned the same tags that appear in the AWS console, including but not limited to host name and security groups.
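The three cluster status gauges (aws.es.cluster_statusgreen, aws.es.cluster_statusyellow, and aws.es.cluster_statusred) are easier to read as a single status. The sketch below shows one way to combine them. It assumes that each gauge reports 1 when its condition holds and 0 otherwise, and the function name `cluster_status` is hypothetical.

```python
def cluster_status(green, yellow, red):
    """Collapse the three status gauges into one label.

    Assumption: each gauge is 1 when its condition holds and 0 otherwise.
    The most severe condition is checked first, so a cluster that reports
    both yellow and red is labeled "red".
    """
    if red >= 1:
        return "red"
    if yellow >= 1:
        return "yellow"
    if green >= 1:
        return "green"
    # No gauge is set, for example when no data points were returned.
    return "unknown"


if __name__ == "__main__":
    print(cluster_status(green=0, yellow=1, red=0))
```

Checking red before yellow keeps the result conservative when more than one gauge is set in the same period.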

Events

The AWS ES integration does not include any events.

Service Checks

The AWS ES integration does not include any service checks.

Troubleshooting

Need help? Contact Datadog support.