Get metrics from CoreDNS in real time to visualize and monitor DNS failures and cache hits/misses.
Follow the instructions below to install and configure this check for an Agent running on a host. For containerized environments, see the Autodiscovery Integration Templates for guidance on applying these instructions.
The CoreDNS check is included in the Datadog Agent package, so you don’t need to install anything else on your servers.
For containerized environments, see the Autodiscovery Integration Templates for guidance on applying the parameters below.
Parameter | Value |
---|---|
<INTEGRATION_NAME> | coredns |
<INIT_CONFIG> | blank or {} |
<INSTANCE_CONFIG> | {"prometheus_url":"http://%%host%%:9153/metrics", "tags":["dns-pod:%%host%%"]} |
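As a rough illustration of how the three parameters above could map onto Kubernetes pod annotations, the Python sketch below builds an Autodiscovery annotation set for one container. The container name coredns and the helper build_check_annotations are assumptions made for this example. The annotation keys follow the ad.datadoghq.com/<CONTAINER_IDENTIFIER>.* pattern used by Autodiscovery Integration Templates, so confirm them against that documentation for your Agent version.

```python
import json


def build_check_annotations(container_name: str) -> dict:
    """Return pod annotations for the CoreDNS check (hypothetical helper)."""
    prefix = f"ad.datadoghq.com/{container_name}"
    # Instance config copied from the <INSTANCE_CONFIG> row of the table.
    instance = {
        "prometheus_url": "http://%%host%%:9153/metrics",
        "tags": ["dns-pod:%%host%%"],
    }
    # Each annotation value is a JSON-encoded list, one entry per check.
    return {
        f"{prefix}.check_names": json.dumps(["coredns"]),
        f"{prefix}.init_configs": json.dumps([{}]),
        f"{prefix}.instances": json.dumps([instance]),
    }


if __name__ == "__main__":
    for key, value in build_check_annotations("coredns").items():
        print(f"{key}: {value}")
```

The %%host%% template variable is left untouched here because the Agent resolves it at runtime to the IP of the discovered container.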
Note: The dns-pod tag keeps track of the target DNS pod IP. The other tags relate to the dd-agent that polls the information through service discovery.
Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes log collection documentation.
Parameter | Value |
---|---|
<LOG_CONFIG> | {"source": "coredns", "service": "<SERVICE_NAME>"} |
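A minimal sketch of how the <LOG_CONFIG> value might be wrapped into a pod annotation, assuming the ad.datadoghq.com/<CONTAINER_IDENTIFIER>.logs key described in the Kubernetes log collection documentation. The helper name build_logs_annotation and the example service name are hypothetical.

```python
import json


def build_logs_annotation(container_name: str, service_name: str) -> dict:
    """Return the log-collection annotation for a pod (hypothetical helper)."""
    # Log config copied from the <LOG_CONFIG> row, with the service filled in.
    log_config = {"source": "coredns", "service": service_name}
    # The annotation value is a JSON-encoded list of log configurations.
    return {f"ad.datadoghq.com/{container_name}.logs": json.dumps([log_config])}


if __name__ == "__main__":
    # "cluster-dns" is a placeholder for <SERVICE_NAME>.
    print(build_logs_annotation("coredns", "cluster-dns"))
```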
Run the Agent’s status subcommand and look for coredns under the Checks section.
coredns.acl.allowed_requests (count) | Counter of DNS requests being allowed. Shown as request |
coredns.acl.blocked_requests (count) | Counter of DNS requests being blocked. Shown as request |
coredns.autopath.success_count (count) | Counter of requests that did autopath. Shown as request |
coredns.build_info (gauge) | A metric with a constant '1' value labeled by version, revision, and goversion from which CoreDNS was built. |
coredns.response_code_count (count) | number of responses per zone and rcode |
coredns.proxy_request_count (count) | query count per upstream. Shown as request |
coredns.cache_drops_count (count) | Counter of responses excluded from the cache due to request/response question name mismatch. Shown as response |
coredns.cache_hits_count (count) | Counter of cache hits by cache type. Shown as hit |
coredns.cache_misses_count (count) | Counter of cache misses. Shown as miss |
coredns.cache_prefetch_count (count) | The number of times the cache has prefetched a cached item. |
coredns.cache_stale_count (count) | Counter of requests served from stale cache entries. Shown as request |
coredns.dnssec.cache_size (gauge) | Total elements in the cache, type is signature. |
coredns.dnssec.cache_hits (count) | Counter of cache hits. Shown as hit |
coredns.dnssec.cache_misses (count) | Counter of cache misses. Shown as miss |
coredns.request_count (count) | total query count. Shown as request |
coredns.request_type_count (count) | counter of queries per zone and type |
coredns.request_duration.seconds.sum (gauge) | Duration to process each query. Shown as second |
coredns.request_duration.seconds.count (gauge) | Duration to process each query. Shown as second |
coredns.proxy_request_duration.seconds.sum (gauge) | Duration per upstream interaction. Shown as second |
coredns.proxy_request_duration.seconds.count (gauge) | Duration per upstream interaction. Shown as second |
coredns.forward_request_duration.seconds.sum (gauge) | Duration per upstream interaction. Shown as second |
coredns.forward_request_duration.seconds.count (gauge) | Duration per upstream interaction. Shown as second |
coredns.forward_request_count (count) | Query count per upstream. Shown as request |
coredns.forward_response_rcode_count (count) | Count of RCODEs per upstream. Shown as response |
coredns.forward_healthcheck_failure_count (count) | Number of failed health checks per upstream. Shown as entry |
coredns.forward_healthcheck_broken_count (count) | Counter of when all upstreams are unhealthy. Shown as entry |
coredns.forward_max_concurrent_rejects (count) | Counter of the number of queries rejected because the concurrent queries were at maximum. Shown as query |
coredns.forward_sockets_open (gauge) | Number of sockets open per upstream. Shown as connection |
coredns.grpc.request_count (count) | Query count per upstream. |
coredns.grpc.response_rcode_count (count) | Count of RCODEs per upstream. Requests are sprayed randomly across upstreams (this always uses the random policy). |
coredns.health_request_duration.count (gauge) | Count for the histogram of the time (in seconds) each request took. |
coredns.health_request_duration.sum (gauge) | Sum for the histogram of the time (in seconds) each request took. |
coredns.hosts.entries_count (gauge) | The combined number of entries in hosts and Corefile. |
coredns.hosts.reload_timestamp (gauge) | The timestamp of the last reload of hosts file. Shown as second |
coredns.reload.failed_count (count) | Counts the number of failed reload attempts. |
coredns.request_size.bytes.sum (gauge) | Size of the request in bytes. Shown as byte |
coredns.request_size.bytes.count (gauge) | Size of the request in bytes. Shown as byte |
coredns.response_size.bytes.sum (gauge) | Size of the response in bytes. Shown as byte |
coredns.response_size.bytes.count (gauge) | Size of the response in bytes. Shown as byte |
coredns.cache_size.count (gauge) | Number of entries in the cache. Shown as entry |
coredns.panic_count.count (count) | Number of panics. Shown as entry |
coredns.go.gc_duration_seconds.count (gauge) | Count of the GC invocation durations. Shown as second |
coredns.go.gc_duration_seconds.sum (gauge) | Sum of the GC invocation durations. Shown as second |
coredns.go.gc_duration_seconds.quantile (gauge) | Quantiles of the GC invocation durations. Shown as second |
coredns.go.goroutines (gauge) | Number of goroutines that currently exist. Shown as thread |
coredns.go.info (gauge) | Information about the Go environment. |
coredns.go.memstats.alloc_bytes (gauge) | Number of bytes allocated and still in use. Shown as byte |
coredns.go.memstats.alloc_bytes_total (count) | Total number of bytes allocated even if freed. Shown as byte |
coredns.go.memstats.buck_hash_sys_bytes (gauge) | Number of bytes used by the profiling bucket hash table. Shown as byte |
coredns.go.memstats.frees_total (count) | Total number of frees. |
coredns.go.memstats.gc_cpu_fraction (gauge) | CPU taken up by GC. Shown as percent |
coredns.go.memstats.gc_sys_bytes (gauge) | Number of bytes used for garbage collection system metadata. Shown as byte |
coredns.go.memstats.heap_alloc_bytes (gauge) | Bytes allocated to the heap. Shown as byte |
coredns.go.memstats.heap_idle_bytes (gauge) | Number of idle bytes in the heap. Shown as byte |
coredns.go.memstats.heap_inuse_bytes (gauge) | Number of bytes in the heap. Shown as byte |
coredns.go.memstats.heap_objects (gauge) | Number of objects in the heap. Shown as object |
coredns.go.memstats.heap_released_bytes (gauge) | Number of bytes released to the system in the last GC. Shown as byte |
coredns.go.memstats.heap_sys_bytes (gauge) | Number of bytes used by the heap. Shown as byte |
coredns.go.memstats.last_gc_time_seconds (gauge) | Time of the last GC, in seconds since the Unix epoch. Shown as second |
coredns.go.memstats.lookups_total (count) | Number of lookups. Shown as operation |
coredns.go.memstats.mallocs_total (count) | Number of mallocs. Shown as operation |
coredns.go.memstats.mcache_inuse_bytes (gauge) | Number of bytes in use by mcache structures. Shown as byte |
coredns.go.memstats.mcache_sys_bytes (gauge) | Number of bytes used for mcache structures obtained from system. Shown as byte |
coredns.go.memstats.mspan_inuse_bytes (gauge) | Number of bytes in use by mspan structures. Shown as byte |
coredns.go.memstats.mspan_sys_bytes (gauge) | Number of bytes used for mspan structures obtained from system. Shown as byte |
coredns.go.memstats.next_gc_bytes (gauge) | Number of heap bytes when next garbage collection will take place. Shown as byte |
coredns.go.memstats.other_sys_bytes (gauge) | Number of bytes used for other system allocations. Shown as byte |
coredns.go.memstats.stack_inuse_bytes (gauge) | Number of bytes in use by the stack allocator. Shown as byte |
coredns.go.memstats.stack_sys_bytes (gauge) | Number of bytes obtained from system for stack allocator. Shown as byte |
coredns.go.memstats.sys_bytes (gauge) | Number of bytes obtained from system. Shown as byte |
coredns.go.threads (gauge) | Number of OS threads created. Shown as thread |
coredns.plugin_enabled (gauge) | A metric that indicates whether a plugin is enabled on per server and zone basis. |
coredns.process.cpu_seconds_total (count) | Total user and system CPU time spent in seconds. Shown as second |
coredns.process.max_fds (gauge) | Maximum number of open file descriptors. Shown as file |
coredns.process.open_fds (gauge) | Number of open file descriptors. Shown as file |
coredns.process.resident_memory_bytes (gauge) | Resident memory size in bytes. Shown as byte |
coredns.process.start_time_seconds (gauge) | Start time of the process since unix epoch in seconds. Shown as second |
coredns.process.virtual_memory_bytes (gauge) | Virtual memory size in bytes. Shown as byte |
coredns.template.matches_count (count) | The total number of matched requests by regex. |
coredns.template.failures_count (count) | The number of times the Go templating failed. Shown as error |
coredns.template.rr_failures_count (count) | The number of times the templated resource record was invalid and could not be parsed. Shown as error |
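Several of the metrics above are most useful in combination. The Python sketch below shows two common derivations: a cache hit ratio from coredns.cache_hits_count and coredns.cache_misses_count, and an average request duration from a .sum and .count pair such as coredns.request_duration.seconds. The function names and the sample numbers are made up for illustration, and the inputs are assumed to be increases over the same time window.

```python
from typing import Optional


def cache_hit_ratio(hits: float, misses: float) -> Optional[float]:
    """Fraction of cache lookups that were hits, or None if there were none."""
    total = hits + misses
    if total == 0:
        return None
    return hits / total


def average_from_sum_and_count(total: float, count: float) -> Optional[float]:
    """Mean observation (for example seconds per query), or None if count is 0."""
    if count == 0:
        return None
    return total / count


if __name__ == "__main__":
    # Hypothetical per-window values, not real measurements.
    print(cache_hit_ratio(hits=900, misses=100))
    print(average_from_sum_and_count(total=2.5, count=1000))
```

Returning None for an empty window avoids a division by zero and keeps "no traffic" distinguishable from "zero hits".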
The CoreDNS check does not include any events.
coredns.prometheus.health: Returns CRITICAL if the Agent cannot reach the metrics endpoints.
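To make the service check's behavior concrete, here is a small Python sketch that probes a metrics endpoint and maps the outcome to a status. It is not the Agent's implementation. The function names are hypothetical, and the numeric values follow the usual Datadog service check convention (0 for OK, 2 for CRITICAL).

```python
import urllib.request
from typing import Callable

OK, CRITICAL = 0, 2  # conventional Datadog service check status values


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Fetch the raw metrics payload from the endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def endpoint_health(url: str, fetch: Callable[[str], str] = fetch_metrics) -> int:
    """Return OK if the endpoint can be reached, CRITICAL otherwise."""
    try:
        fetch(url)
    except OSError:
        # URLError, HTTPError, timeouts, and refused connections all derive
        # from OSError, so any of them counts as "cannot reach".
        return CRITICAL
    return OK
```

Passing the fetch function as a parameter lets the logic be exercised without a live CoreDNS instance, for example endpoint_health(url, fetch=lambda u: "") for the reachable case.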
Need help? Contact Datadog support.
See the main documentation for more details about how to test and develop Agent-based integrations.