Kyverno

Supported OS: Linux, Windows, macOS

Integration version: 2.2.0

Overview

This check monitors Kyverno through the Datadog Agent.

Setup

Follow the instructions below to install and configure this check for an Agent running in your Kubernetes environment. For more information about configuration in containerized environments, see the Autodiscovery Integration Templates.

Installation

Starting from Agent release 7.56.0, the Kyverno check is included in the Datadog Agent package. No additional installation is needed in your environment.

This check uses OpenMetrics, which requires Python 3, to collect metrics from the endpoint that Kyverno exposes.

Configuration

Kyverno consists of multiple controllers: the Admission, Background, Cleanup, and Reports controllers. Each controller can be monitored, and each exposes Prometheus-formatted metrics at /metrics on port 8000. For the Agent to start collecting metrics, annotate the Kyverno controller pods. For more information about annotations, see the Autodiscovery Integration Templates. You can find additional configuration options in the sample kyverno.d/conf.yaml.

Note: The listed metrics can only be collected if they are available. Some metrics are generated only when certain actions are performed. For example, the kyverno.controller.drop.count metric is exposed only after an object is dropped by a controller.

The only parameter required for configuring the Kyverno check is:

openmetrics_endpoint: This parameter should be set to the location where the Prometheus-formatted metrics are exposed. The default port is 8000. In containerized environments, %%host%% should be used for host autodetection.

apiVersion: v1
kind: Pod
# (...)
metadata:
  name: '<POD_NAME>'
  annotations:
    ad.datadoghq.com/<CONTAINER_NAME>.checks: |
      {
        "kyverno": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8000/metrics"
            }
          ]
        }
      }      
    # (...)
spec:
  containers:
    - name: <CONTAINER_NAME> # e.g. 'kyverno' in the Admission controller
# (...)

To collect metrics from every Kyverno controller, apply the pod annotations above to each Kyverno controller pod. The following example shows pod annotations for the Reports controller:

# Pod manifest from a basic Helm chart deployment
apiVersion: v1
kind: Pod
# (...)
metadata:
  name: 'controller'
  annotations:
    ad.datadoghq.com/controller.checks: |
      {
        "kyverno": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8000/metrics"
            }
          ]
        }
      }      
    # (...)
spec:
  containers:
    - name: controller
# (...)
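
When Kyverno is installed with its Helm chart, the same annotations can usually be set through the chart values instead of on individual pods. The fragment below is a minimal sketch, not a verified configuration: it assumes the chart exposes a podAnnotations key under each controller section (reportsController is shown) and that the container is named controller, as in the manifest above. Check both against the values.yaml of your chart version before relying on it.

```yaml
# Hypothetical values.yaml fragment for the Kyverno Helm chart.
# Assumption: reportsController.podAnnotations exists in your chart version.
reportsController:
  podAnnotations:
    # The annotation key must use the container name ('controller' here).
    ad.datadoghq.com/controller.checks: |
      {
        "kyverno": {
          "init_config": {},
          "instances": [
            {
              "openmetrics_endpoint": "http://%%host%%:8000/metrics"
            }
          ]
        }
      }
```

If the assumption holds, the other controllers would take the same block under their own sections, each with the container name used by that controller's pod.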

Log collection

Available for Agent versions >6.0

Kyverno logs can be collected from the different Kyverno pods through Kubernetes. Collecting logs is disabled by default in the Datadog Agent. To enable it, see Kubernetes Log Collection.

See the Autodiscovery Integration Templates for guidance on applying the parameters below.

Parameter	Value
`<LOG_CONFIG>`	`{"source": "kyverno", "service": "<SERVICE_NAME>"}`
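
As an illustration, the log configuration above can be applied as a pod annotation alongside the checks annotation. This sketch follows the general Autodiscovery annotation convention; <POD_NAME>, <CONTAINER_NAME>, and <SERVICE_NAME> are placeholders to replace with your own values.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: '<POD_NAME>'
  annotations:
    # The value is a JSON list that wraps the <LOG_CONFIG> object from the table.
    ad.datadoghq.com/<CONTAINER_NAME>.logs: '[{"source": "kyverno", "service": "<SERVICE_NAME>"}]'
spec:
  containers:
    - name: <CONTAINER_NAME> # must match the name used in the annotation key
```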

Validation

Run the Agent’s status subcommand and look for kyverno under the Checks section.

Data Collected

Metrics

kyverno.admission.requests.count (count)	Number of admission requests which were triggered as a part of Kyverno
kyverno.admission.review.duration.seconds.bucket (count)	The bucket aggregation shows the number of observations within each distribution bucket in the admissions review latency histogram
kyverno.admission.review.duration.seconds.count (count)	The count aggregation shows the number of observations in the admissions review latency histogram
kyverno.admission.review.duration.seconds.sum (count)	The sum aggregation of the admissions review latency represents the total number of seconds it has taken for individual admission reviews corresponding to incoming resource requests that trigger policies and rules. Shown as second.
kyverno.cleanup.controller.deletedobjects.count (count)	Number of objects deleted by the cleanup controller
kyverno.cleanup.controller.errors.count (count)	Number of errors encountered by the cleanup controller while trying to delete objects
kyverno.client.queries.count (count)	Number of queries per second (QPS) from Kyverno
kyverno.controller.clientset.k8s.request.count (count)	The total number of Kubernetes requests executed during application reconciliation
kyverno.controller.drop.count (count)	Number of times a controller dropped an element. Dropping usually indicates an unrecoverable error: the controller retried processing the item multiple times and, after failing each time, dropped it.
kyverno.controller.reconcile.count (count)	Number of reconciliations performed by various Kyverno controllers
kyverno.controller.requeue.count (count)	Number of times a controller re-queues elements to be processed. Re-queuing usually indicates that an error occurred and that the controller enqueued the same item to retry processing it a bit later.
kyverno.go.gc.duration.seconds.count (count)	The number of observations in the GC duration summary in the Kyverno instance
kyverno.go.gc.duration.seconds.quantile (gauge)	The number of observations in the GC duration summary in the Kyverno instance per quantile
kyverno.go.gc.duration.seconds.sum (count)	The sum of the pause duration of garbage collection cycles in the Kyverno instance. Shown as second.
kyverno.go.goroutines (gauge)	The number of goroutines that currently exist in the Kyverno instance
kyverno.go.info (gauge)	Metric containing the Go version as a tag
kyverno.go.memstats.alloc_bytes (gauge)	The number of bytes allocated and still in use in the Kyverno instance. Shown as byte.
kyverno.go.memstats.alloc_bytes.count (count)	The monotonic count of bytes allocated and still in use in the Kyverno instance. Shown as byte.
kyverno.go.memstats.buck_hash.sys_bytes (gauge)	The number of bytes used by the profiling bucket hash table in the Kyverno instance. Shown as byte.
kyverno.go.memstats.frees.count (count)	The total number of frees in the Kyverno instance
kyverno.go.memstats.gc.cpu_fraction (gauge)	The fraction of this program's available CPU time used by the GC since the program started in the Kyverno instance. Shown as fraction.
kyverno.go.memstats.gc.sys_bytes (gauge)	The number of bytes used for garbage collection system metadata in the Kyverno instance. Shown as byte.
kyverno.go.memstats.heap.alloc_bytes (gauge)	The number of heap bytes allocated and still in use in the Kyverno instance. Shown as byte.
kyverno.go.memstats.heap.idle_bytes (gauge)	The number of heap bytes waiting to be used in the Kyverno instance. Shown as byte.
kyverno.go.memstats.heap.inuse_bytes (gauge)	The number of heap bytes that are in use in the Kyverno instance. Shown as byte.
kyverno.go.memstats.heap.objects (gauge)	The number of allocated objects in the Kyverno instance. Shown as object.
kyverno.go.memstats.heap.released_bytes (gauge)	The number of heap bytes released to the OS in the Kyverno instance. Shown as byte.
kyverno.go.memstats.heap.sys_bytes (gauge)	The number of heap bytes obtained from the system in the Kyverno instance. Shown as byte.
kyverno.go.memstats.lookups.count (count)	The number of pointer lookups
kyverno.go.memstats.mallocs.count (count)	The number of mallocs
kyverno.go.memstats.mcache.inuse_bytes (gauge)	The number of bytes in use by mcache structures in the Kyverno instance. Shown as byte.
kyverno.go.memstats.mcache.sys_bytes (gauge)	The number of bytes used for mcache structures obtained from the system in the Kyverno instance. Shown as byte.
kyverno.go.memstats.mspan.inuse_bytes (gauge)	The number of bytes in use by mspan structures in the Kyverno instance. Shown as byte.
kyverno.go.memstats.mspan.sys_bytes (gauge)	The number of bytes used for mspan structures obtained from the system in the Kyverno instance. Shown as byte.
kyverno.go.memstats.next.gc_bytes (gauge)	The number of heap bytes when the next garbage collection takes place in the Kyverno instance. Shown as byte.
kyverno.go.memstats.other.sys_bytes (gauge)	The number of bytes used for other system allocations in the Kyverno instance. Shown as byte.
kyverno.go.memstats.stack.inuse_bytes (gauge)	The number of bytes in use by the stack allocator in the Kyverno instance. Shown as byte.
kyverno.go.memstats.stack.sys_bytes (gauge)	The number of bytes obtained from the system for the stack allocator in the Kyverno instance. Shown as byte.
kyverno.go.memstats.sys_bytes (gauge)	The number of bytes obtained from the system in the Kyverno instance. Shown as byte.
kyverno.go.threads (gauge)	The number of OS threads created in the Kyverno instance. Shown as thread.
kyverno.http.requests.count (count)	Number of HTTP requests which were triggered as a part of Kyverno
kyverno.http.requests.duration.seconds.bucket (count)	The bucket aggregation shows the number of observations within each distribution bucket in the HTTP requests latency histogram
kyverno.http.requests.duration.seconds.count (count)	The count aggregation shows the number of observations in the HTTP requests latency histogram
kyverno.http.requests.duration.seconds.sum (count)	The sum aggregation of the HTTP requests represents the total number of seconds spent on HTTP requests. Shown as second.
kyverno.policy.changes.count (count)	Metric used to track the history of all Kyverno policy-related changes such as policy creations, updates, and deletions
kyverno.policy.execution.duration.seconds.bucket (count)	The bucket aggregation shows the number of observations within each distribution bucket in the policy rule execution latency histogram
kyverno.policy.execution.duration.seconds.count (count)	The count aggregation shows the number of observations in the policy rule execution latency histogram
kyverno.policy.execution.duration.seconds.sum (count)	The sum aggregation of the policy rule execution represents the total number of seconds spent on the execution and processing of the individual rules whenever they evaluate incoming resource requests or execute background scans. Shown as second.
kyverno.policy.results.count (count)	Metric used to track the results associated with the rules executing as part of incoming resource requests and background scans. This metric can be further aggregated to track policy-level results.
kyverno.policy.rule.info (gauge)	Number of policies and rules present in the cluster. Includes both active and inactive policies and rules.
kyverno.process.cpu.seconds.count (count)	The total user and system CPU time spent in seconds in the Kyverno instance. Shown as second.
kyverno.process.max_fds (gauge)	The maximum number of open file descriptors in the Kyverno instance
kyverno.process.open_fds (gauge)	The number of open file descriptors in the Kyverno instance
kyverno.process.resident_memory.bytes (gauge)	The resident memory size in bytes in the Kyverno instance. Shown as byte.
kyverno.process.start_time.seconds (gauge)	The start time of the process since the Unix epoch in seconds in the Kyverno instance. Shown as second.
kyverno.process.virtual_memory.bytes (gauge)	The virtual memory size in bytes in the Kyverno instance. Shown as byte.
kyverno.process.virtual_memory.max_bytes (gauge)	The maximum amount of virtual memory available in bytes in the Kyverno instance. Shown as byte.
kyverno.ttl.controller.deletedobjects.count (count)	Number of objects deleted by the cleanup TTL controller
kyverno.ttl.controller.errors.count (count)	Number of errors encountered by the cleanup TTL controller while trying to delete objects

Events

The Kyverno integration does not include any events.

Service Checks

kyverno.openmetrics.health
Returns CRITICAL if the Agent is unable to connect to the Kyverno OpenMetrics endpoint, otherwise returns OK.
Statuses: ok, critical

Troubleshooting

Need help? Contact Datadog support.