---
title: Google Kubernetes Engine
description: >-
  A powerful cluster manager and orchestration system for running your
  containerized applications.
---

# Google Kubernetes Engine

{% callout %}
# Important note for users on the following Datadog sites: us2.ddog-gov.com

{% alert level="info" %}
To find out if this integration is available in your organization, see your [Datadog Integrations](https://app.datadoghq.com/integrations) page or ask your organization administrator.

To initiate an exception request to enable this integration for your organization, email [support@ddog-gov.com](mailto:support@ddog-gov.com).
{% /alert %}

{% /callout %}

## Overview{% #overview %}

Google Kubernetes Engine (GKE) is a powerful cluster manager and orchestration system for running your Docker containers.

Get metrics from Google Kubernetes Engine to:

- Visualize the performance of your GKE containers and GKE control plane.
- Correlate the performance of your GKE containers with your applications.

This integration comes with two separate preset dashboards:

- The standard GKE dashboard presents the GKE and GKE control plane metrics collected from the Google integration.
- The enhanced GKE dashboard presents metrics from Datadog's Agent-based Kubernetes integration alongside GKE control plane metrics collected from the Google integration.

The standard dashboard provides observability into GKE with minimal configuration. The enhanced dashboard requires additional configuration steps but provides more real-time Kubernetes metrics, and is often a better starting point when cloning and customizing a dashboard to monitor production workloads.

Unlike in self-hosted Kubernetes clusters, the GKE control plane is managed by Google and is not accessible to a Datadog Agent running in the cluster. Observability into the GKE control plane therefore requires the Google integration, even if you primarily use the Datadog Agent to monitor your clusters.

## Setup{% #setup %}

### Metric collection{% #metric-collection %}

#### Installation{% #installation %}

1. If you haven't already, set up the [Google Cloud Platform integration](https://docs.datadoghq.com/integrations/google-cloud-platform.md) first. There are no other installation steps for the standard metrics and preset dashboard.

1. To populate the enhanced dashboard and enable APM tracing, logging, profiling, security, and other Datadog services, [install the Datadog Agent into your GKE cluster](https://docs.datadoghq.com/integrations/google-kubernetes-engine.md).

1. To populate the control plane metrics, you must [enable GKE control plane metrics](https://cloud.google.com/kubernetes-engine/docs/how-to/configure-metrics#enable-control-plane-metrics). Control plane metrics give you visibility into the operation of the Kubernetes control plane, which is managed by Google in GKE.
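As a rough illustration of the last step, the sketch below assembles a `gcloud container clusters update` command that turns on control plane metrics. It assumes the `--monitoring` flag accepts the component names `SYSTEM`, `API_SERVER`, `SCHEDULER`, and `CONTROLLER_MANAGER`, and that `--location` is available in your gcloud version; check both against the linked Google documentation. The function name and the cluster and location values are hypothetical placeholders.

```python
import shlex

# Control plane components assumed to be valid --monitoring values.
# SYSTEM is always included so default system metrics stay enabled.
CONTROL_PLANE_COMPONENTS = ("API_SERVER", "SCHEDULER", "CONTROLLER_MANAGER")


def control_plane_metrics_command(cluster, location, components=None):
    """Build the gcloud argv that enables control plane metrics on a cluster."""
    selected = list(components) if components is not None else list(CONTROL_PLANE_COMPONENTS)
    unknown = [c for c in selected if c not in CONTROL_PLANE_COMPONENTS]
    if unknown:
        raise ValueError(f"unsupported components: {unknown}")
    monitoring = ",".join(["SYSTEM", *selected])
    return [
        "gcloud", "container", "clusters", "update", cluster,
        f"--location={location}",
        f"--monitoring={monitoring}",
    ]


if __name__ == "__main__":
    # "my-cluster" and "us-central1" are placeholder values.
    print(shlex.join(control_plane_metrics_command("my-cluster", "us-central1")))
```

Building the command as a list instead of a single string keeps it safe to pass to `subprocess.run` without shell quoting issues, and makes it easy to review before running it against a real cluster.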

### Log collection{% #log-collection %}

Google Kubernetes Engine logs are collected with Google Cloud Logging and sent to a Dataflow job through a Cloud Pub/Sub topic. If you haven't already, [set up logging with the Datadog Dataflow template](https://docs.datadoghq.com/integrations/google-cloud-platform.md?tab=datadogussite#log-collection).

Once this is done, export your Google Kubernetes Engine logs from Google Cloud Logging to the Pub/Sub topic:

1. Go to the [GCP Logs Explorer page](https://console.cloud.google.com/logs/viewer) and filter for Kubernetes and GKE logs.
1. Click **Create Sink** and name the sink accordingly.
1. Choose "Cloud Pub/Sub" as the destination and select the Pub/Sub topic that was created for that purpose. **Note**: The Pub/Sub topic can be located in a different project.
1. Click **Create** and wait for the confirmation message to show up.
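The same sink can also be described outside the console. The sketch below builds a Cloud Logging filter for Kubernetes resources, the Pub/Sub destination string, and a matching `gcloud logging sinks create` command. The resource types listed are an assumption about which logs you want to export, and the sink, project, and topic names are hypothetical placeholders; adjust them to match the topic created for the Datadog Dataflow template.

```python
import shlex

# Assumed set of monitored resource types carrying Kubernetes and GKE logs.
GKE_RESOURCE_TYPES = ("k8s_cluster", "k8s_node", "k8s_pod", "k8s_container")


def gke_log_filter(resource_types=GKE_RESOURCE_TYPES):
    """Return a Logging query matching any of the given resource types."""
    quoted = " OR ".join(f'"{t}"' for t in resource_types)
    return f"resource.type=({quoted})"


def pubsub_destination(project_id, topic_id):
    """Return the sink destination for a Pub/Sub topic, which may live in another project."""
    return f"pubsub.googleapis.com/projects/{project_id}/topics/{topic_id}"


def create_sink_command(sink_name, project_id, topic_id):
    """Build the gcloud argv that creates a sink exporting GKE logs to Pub/Sub."""
    return [
        "gcloud", "logging", "sinks", "create", sink_name,
        pubsub_destination(project_id, topic_id),
        f"--log-filter={gke_log_filter()}",
    ]


if __name__ == "__main__":
    # All three names below are placeholders.
    cmd = create_sink_command("gke-to-datadog", "my-project", "export-logs-to-datadog")
    print(shlex.join(cmd))
```

When a sink is created from the command line, its writer identity typically still needs permission to publish to the topic, so verify the topic's IAM settings after creating it.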

## Data Collected{% #data-collected %}

### Metrics{% #metrics %}

| Metric | Description |
| --- | --- |
| **gcp.gke.container.accelerator.duty\_cycle**(gauge)                                                                    | Percent of time over the past sample period during which the accelerator was actively processing.*Shown as percent*                                                     |
| **gcp.gke.container.accelerator.memory\_total**(gauge)                                                                  | Total accelerator memory.*Shown as byte*                                                                                                                                |
| **gcp.gke.container.accelerator.memory\_used**(gauge)                                                                   | Total accelerator memory allocated.*Shown as byte*                                                                                                                      |
| **gcp.gke.container.accelerator.request**(gauge)                                                                        | Number of accelerator devices requested by the container.*Shown as device*                                                                                              |
| **gcp.gke.container.cpu.core\_usage\_time**(count)                                                                      | Cumulative CPU usage on all cores used by the container.*Shown as second*                                                                                               |
| **gcp.gke.container.cpu.limit\_cores**(gauge)                                                                           | CPU cores limit of the container.*Shown as core*                                                                                                                        |
| **gcp.gke.container.cpu.limit\_utilization**(gauge)                                                                     | Fraction of the CPU limit that is currently in use on the instance.*Shown as fraction*                                                                                  |
| **gcp.gke.container.cpu.request\_cores**(gauge)                                                                         | Number of CPU cores requested by the container.*Shown as core*                                                                                                          |
| **gcp.gke.container.cpu.request\_utilization**(gauge)                                                                   | Fraction of the requested CPU that is currently in use on the instance.*Shown as fraction*                                                                              |
| **gcp.gke.container.ephemeral\_storage.limit\_bytes**(gauge)                                                            | Local ephemeral storage limit.*Shown as byte*                                                                                                                           |
| **gcp.gke.container.ephemeral\_storage.request\_bytes**(gauge)                                                          | Local ephemeral storage request.*Shown as byte*                                                                                                                         |
| **gcp.gke.container.ephemeral\_storage.used\_bytes**(gauge)                                                             | Local ephemeral storage usage.*Shown as byte*                                                                                                                           |
| **gcp.gke.container.memory.limit\_bytes**(gauge)                                                                        | Memory limit of the container.*Shown as byte*                                                                                                                           |
| **gcp.gke.container.memory.limit\_utlization**(gauge)                                                                   | Fraction of the memory limit that is currently in use on the instance.*Shown as fraction*                                                                               |
| **gcp.gke.container.memory.page\_fault\_count**(count)                                                                  | Number of page faults, broken down by type.*Shown as fault*                                                                                                             |
| **gcp.gke.container.memory.request\_bytes**(gauge)                                                                      | Memory request of the container.*Shown as byte*                                                                                                                         |
| **gcp.gke.container.memory.request\_utilization**(gauge)                                                                | Fraction of the requested memory that is currently in use on the instance.*Shown as fraction*                                                                           |
| **gcp.gke.container.memory.used\_bytes**(gauge)                                                                         | Memory usage of the container.*Shown as byte*                                                                                                                           |
| **gcp.gke.container.restart\_count**(count)                                                                             | Number of times the container has restarted.*Shown as occurrence*                                                                                                       |
| **gcp.gke.container.uptime**(gauge)                                                                                     | Time in seconds that the container has been running.*Shown as second*                                                                                                   |
| **gcp.gke.node.cpu.allocatable\_cores**(gauge)                                                                          | Number of allocatable CPU cores on the node.*Shown as core*                                                                                                             |
| **gcp.gke.node.cpu.allocatable\_utilization**(gauge)                                                                    | Fraction of the allocatable CPU that is currently in use on the instance.*Shown as fraction*                                                                            |
| **gcp.gke.node.cpu.core\_usage\_time**(count)                                                                           | Cumulative CPU usage on all cores used on the node.*Shown as second*                                                                                                    |
| **gcp.gke.node.cpu.total\_cores**(gauge)                                                                                | Total number of CPU cores on the node.*Shown as core*                                                                                                                   |
| **gcp.gke.node.ephemeral\_storage.allocatable\_bytes**(gauge)                                                           | Local ephemeral storage bytes allocatable on the node.*Shown as byte*                                                                                                   |
| **gcp.gke.node.ephemeral\_storage.inodes\_free**(gauge)                                                                 | Free number of inodes on local ephemeral storage.                                                                                                                       |
| **gcp.gke.node.ephemeral\_storage.inodes\_total**(gauge)                                                                | Total number of inodes on local ephemeral storage.                                                                                                                      |
| **gcp.gke.node.ephemeral\_storage.total\_bytes**(gauge)                                                                 | Total ephemeral storage bytes on the node.*Shown as byte*                                                                                                               |
| **gcp.gke.node.ephemeral\_storage.used\_bytes**(gauge)                                                                  | Local ephemeral storage bytes used by the node.*Shown as byte*                                                                                                          |
| **gcp.gke.node.memory.allocatable\_bytes**(gauge)                                                                       | Number of bytes of memory allocatable on the node.*Shown as byte*                                                                                                       |
| **gcp.gke.node.memory.allocatable\_utilization**(gauge)                                                                 | Fraction of the allocatable memory that is currently in use on the instance.*Shown as fraction*                                                                         |
| **gcp.gke.node.memory.total\_bytes**(gauge)                                                                             | Total number of bytes of memory on the node.*Shown as byte*                                                                                                             |
| **gcp.gke.node.memory.used\_bytes**(gauge)                                                                              | Cumulative memory bytes used by the node.*Shown as byte*                                                                                                                |
| **gcp.gke.node.network.received\_bytes\_count**(count)                                                                  | Cumulative number of bytes received by the node over the network.*Shown as byte*                                                                                        |
| **gcp.gke.node.network.sent\_bytes\_count**(count)                                                                      | Cumulative number of bytes transmitted by the node over the network.*Shown as byte*                                                                                     |
| **gcp.gke.node.pid\_limit**(gauge)                                                                                      | Maximum PID of the OS on the node.                                                                                                                                      |
| **gcp.gke.node.pid\_used**(gauge)                                                                                       | Number of running processes in the OS on the node.                                                                                                                      |
| **gcp.gke.node\_daemon.cpu.core\_usage\_time**(count)                                                                   | Cumulative CPU usage on all cores used by the node level system daemon.*Shown as second*                                                                                |
| **gcp.gke.node\_daemon.memory.used\_bytes**(gauge)                                                                      | Memory usage by the system daemon.*Shown as byte*                                                                                                                       |
| **gcp.gke.pod.network.received\_bytes\_count**(count)                                                                   | Cumulative number of bytes received by the pod over the network.*Shown as byte*                                                                                         |
| **gcp.gke.pod.network.sent\_bytes\_count**(count)                                                                       | Cumulative number of bytes transmitted by the pod over the network.*Shown as byte*                                                                                      |
| **gcp.gke.pod.volume.total\_bytes**(gauge)                                                                              | Total number of disk bytes available to the pod.*Shown as byte*                                                                                                         |
| **gcp.gke.pod.volume.used\_bytes**(gauge)                                                                               | Number of disk bytes used by the pod.*Shown as byte*                                                                                                                    |
| **gcp.gke.pod.volume.utilization**(gauge)                                                                               | Fraction of the volume that is currently being used by the instance.*Shown as fraction*                                                                                 |
| **gcp.gke.control\_plane.apiserver.admission\_controller\_admission\_duration\_seconds**(gauge)                         | Admission controller latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).*Shown as second* |
| **gcp.gke.control\_plane.apiserver.admission\_step\_admission\_duration\_seconds**(gauge)                               | Admission sub-step latency histogram in seconds, broken out for each operation and API resource and step type (validate or admit).*Shown as second*                     |
| **gcp.gke.control\_plane.apiserver.admission\_webhook\_admission\_duration\_seconds**(gauge)                            | Admission webhook latency histogram in seconds, identified by name and broken out for each operation and API resource and type (validate or admit).*Shown as second*    |
| **gcp.gke.control\_plane.apiserver.current\_inflight\_requests**(gauge)                                                 | Maximum number of in-flight requests currently used against this apiserver's limit, per request kind.*Shown as request*                                                 |
| **gcp.gke.control\_plane.apiserver.request\_duration\_seconds**(gauge)                                                  | Response latency distribution in seconds for each verb, dry run value, group, version, resource, subresource, scope and component.*Shown as second*                     |
| **gcp.gke.control\_plane.apiserver.request\_total**(gauge)                                                              | Counter of apiserver requests broken out for each verb, dry run value, group, version, resource, scope, component, and HTTP response code.*Shown as request*            |
| **gcp.gke.control\_plane.apiserver.response\_sizes**(gauge)                                                             | Response size distribution in bytes for each group, version, verb, resource, subresource, scope and component.*Shown as byte*                                           |
| **gcp.gke.control\_plane.apiserver.storage\_objects**(gauge)                                                            | Number of stored objects at the time of last check split by kind.*Shown as object*                                                                                      |
| **gcp.gke.control\_plane.controller\_manager.node\_collector\_evictions\_number**(count)                                | Number of Node evictions that happened since current instance of NodeController started.*Shown as event*                                                                |
| **gcp.gke.control\_plane.scheduler.pending\_pods**(gauge)                                                               | Number of pending pods, by the queue type.*Shown as event*                                                                                                              |
| **gcp.gke.control\_plane.scheduler.pod\_scheduling\_duration\_seconds**(gauge)                                          | End-to-end latency for a pod being scheduled.*Shown as second*                                                                                                          |
| **gcp.gke.control\_plane.scheduler.preemption\_attempts\_total**(count)                                                 | Total number of preemption attempts in the cluster so far.*Shown as attempt*                                                                                            |
| **gcp.gke.control\_plane.scheduler.preemption\_victims**(gauge)                                                         | Number of selected preemption victims.*Shown as event*                                                                                                                  |
| **gcp.gke.control\_plane.scheduler.scheduling\_attempt\_duration\_seconds**(gauge)                                      | Scheduling attempt latency in seconds.*Shown as second*                                                                                                                 |
| **gcp.gke.control\_plane.scheduler.schedule\_attempts\_total**(gauge)                                                   | Number of attempts to schedule pods.*Shown as attempt*                                                                                                                  |
| **gcp.gke.control\_plane.apiserver.aggregator\_unavailable\_apiservice**(gauge)                                         | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.apiserver.audit\_event\_total**(gauge)                                                         | (Deprecated) Accumulated number of audit events generated and sent to the audit backend.*Shown as event*                                                                |
| **gcp.gke.control\_plane.apiserver.audit\_level\_total**(gauge)                                                         | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.apiserver.audit\_requests\_rejected\_total**(gauge)                                            | (Deprecated)*Shown as request*                                                                                                                                          |
| **gcp.gke.control\_plane.apiserver.client\_certificate\_expiration\_seconds**(gauge)                                    | (Deprecated)*Shown as second*                                                                                                                                           |
| **gcp.gke.control\_plane.apiserver.etcd\_object\_counts**(gauge)                                                        | (Deprecated) Number of stored objects split by kind.*Shown as object*                                                                                                   |
| **gcp.gke.control\_plane.apiserver.etcd\_request\_duration\_seconds**(gauge)                                            | (Deprecated)*Shown as second*                                                                                                                                           |
| **gcp.gke.control\_plane.apiserver.init\_events\_total**(gauge)                                                         | (Deprecated)*Shown as event*                                                                                                                                            |
| **gcp.gke.control\_plane.apiserver.longrunning\_gauge**(gauge)                                                          | (Deprecated) Gauge of all active long-running apiserver requests.*Shown as request*                                                                                     |
| **gcp.gke.control\_plane.apiserver.registered\_watchers**(gauge)                                                        | (Deprecated) Number of currently registered watchers for a given resource.*Shown as object*                                                                             |
| **gcp.gke.control\_plane.apiserver.workqueue\_adds\_total**(count)                                                      | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.apiserver.workqueue\_depth**(gauge)                                                            | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.apiserver.workqueue\_longest\_running\_processor\_seconds**(gauge)                             | (Deprecated) Number of seconds that the longest running processor has been running.*Shown as second*                                                                    |
| **gcp.gke.control\_plane.apiserver.workqueue\_queue\_duration\_seconds**(gauge)                                         | (Deprecated)*Shown as second*                                                                                                                                           |
| **gcp.gke.control\_plane.apiserver.workqueue\_retries\_total**(count)                                                   | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.apiserver.workqueue\_unfinished\_work\_seconds**(gauge)                                        | (Deprecated)*Shown as second*                                                                                                                                           |
| **gcp.gke.control\_plane.apiserver.workqueue\_work\_duration\_seconds**(gauge)                                          | (Deprecated)*Shown as second*                                                                                                                                           |
| **gcp.gke.control\_plane.controller\_manager.cloudprovider\_gce\_api\_request\_duration\_seconds**(gauge)               | (Deprecated)*Shown as second*                                                                                                                                           |
| **gcp.gke.control\_plane.controller\_manager.cronjob\_controller\_rate\_limiter\_use**(gauge)                           | (Deprecated) Usage of the rate limiter by cronjob controller                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.daemon\_controller\_rate\_limiter\_use**(gauge)                            | (Deprecated) Usage of the rate limiter by daemon controller                                                                                                             |
| **gcp.gke.control\_plane.controller\_manager.deployment\_controller\_rate\_limiter\_use**(gauge)                        | (Deprecated) Usage of the rate limiter by deployment controller                                                                                                         |
| **gcp.gke.control\_plane.controller\_manager.endpoint\_controller\_rate\_limiter\_use**(gauge)                          | (Deprecated) Usage of the rate limiter by endpoint controller                                                                                                           |
| **gcp.gke.control\_plane.controller\_manager.gc\_controller\_rate\_limiter\_use**(gauge)                                | (Deprecated) Usage of the rate limiter by GC controller                                                                                                                 |
| **gcp.gke.control\_plane.controller\_manager.job\_controller\_rate\_limiter\_use**(gauge)                               | (Deprecated) Usage of the rate limiter by job controller                                                                                                                |
| **gcp.gke.control\_plane.controller\_manager.leader\_election\_master\_status**(gauge)                                  | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.namespace\_controller\_rate\_limiter\_use**(gauge)                         | (Deprecated) Usage of the rate limiter by namespace controller                                                                                                          |
| **gcp.gke.control\_plane.controller\_manager.node\_collector\_evictions\_number**(count)                                | (Deprecated) Count of node eviction events.                                                                                                                             |
| **gcp.gke.control\_plane.controller\_manager.node\_collector\_unhealthy\_nodes\_in\_zone**(gauge)                       | (Deprecated) Number of unhealthy nodes                                                                                                                                  |
| **gcp.gke.control\_plane.controller\_manager.node\_collector\_zone\_health**(gauge)                                     | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.node\_collector\_zone\_size**(gauge)                                       | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.node\_ipam\_controller\_rate\_limiter\_use**(gauge)                        | (Deprecated) Usage of the rate limiter by IPAM controller                                                                                                               |
| **gcp.gke.control\_plane.controller\_manager.node\_lifecycle\_controller\_rate\_limiter\_use**(gauge)                   | (Deprecated) Usage of the rate limiter by lifecycle controller                                                                                                          |
| **gcp.gke.control\_plane.controller\_manager.persistentvolume\_protection\_controller\_rate\_limiter\_use**(gauge)      | (Deprecated) Usage of the rate limiter by persistent volume protection controller                                                                                       |
| **gcp.gke.control\_plane.controller\_manager.persistentvolumeclaim\_protection\_controller\_rate\_limiter\_use**(gauge) | (Deprecated) Usage of the rate limiter by persistent volume claim protection controller                                                                                 |
| **gcp.gke.control\_plane.controller\_manager.replicaset\_controller\_rate\_limiter\_use**(gauge)                        | (Deprecated) Usage of the rate limiter by ReplicaSet controller                                                                                                         |
| **gcp.gke.control\_plane.controller\_manager.replication\_controller\_rate\_limiter\_use**(gauge)                       | (Deprecated) Usage of the rate limiter by replication controller                                                                                                        |
| **gcp.gke.control\_plane.controller\_manager.route\_controller\_rate\_limiter\_use**(gauge)                             | (Deprecated) Usage of the rate limiter by route controller                                                                                                              |
| **gcp.gke.control\_plane.controller\_manager.service\_controller\_rate\_limiter\_use**(gauge)                           | (Deprecated) Usage of the rate limiter by service controller                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.serviceaccount\_controller\_rate\_limiter\_use**(gauge)                    | (Deprecated) Usage of the rate limiter by service account controller                                                                                                    |
| **gcp.gke.control\_plane.controller\_manager.serviceaccount\_tokens\_controller\_rate\_limiter\_use**(gauge)            | (Deprecated) Usage of the rate limiter by service account tokens controller                                                                                             |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_adds\_total**(count)                                            | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_depth**(gauge)                                                  | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_longest\_running\_processor\_seconds**(gauge)                   | (Deprecated) Number of seconds that the longest running processor has been running. *Shown as second*                                                                   |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_queue\_duration\_seconds**(gauge)                               | (Deprecated) *Shown as second*                                                                                                                                          |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_retries\_total**(count)                                         | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_unfinished\_work\_seconds**(gauge)                              | (Deprecated) *Shown as second*                                                                                                                                          |
| **gcp.gke.control\_plane.controller\_manager.workqueue\_work\_duration\_seconds**(gauge)                                | (Deprecated) *Shown as second*                                                                                                                                          |
| **gcp.gke.control\_plane.scheduler.binding\_duration\_seconds**(gauge)                                                  | (Deprecated) Binding latency in seconds. *Shown as second*                                                                                                              |
| **gcp.gke.control\_plane.scheduler.e2e\_scheduling\_duration\_seconds**(gauge)                                          | (Deprecated) Total e2e scheduling latency. *Shown as second*                                                                                                            |
| **gcp.gke.control\_plane.scheduler.framework\_extension\_point\_duration\_seconds**(gauge)                              | (Deprecated) *Shown as second*                                                                                                                                          |
| **gcp.gke.control\_plane.scheduler.leader\_election\_master\_status**(gauge)                                            | (Deprecated)                                                                                                                                                            |
| **gcp.gke.control\_plane.scheduler.scheduling\_algorithm\_duration\_seconds**(gauge)                                    | (Deprecated) Total scheduling algorithm latency. *Shown as second*                                                                                                      |
| **gcp.gke.control\_plane.scheduler.scheduling\_algorithm\_preemption\_evaluation\_seconds**(gauge)                      | (Deprecated) *Shown as second*                                                                                                                                          |

### Events{% #events %}

The Google Kubernetes Engine integration does not include any events.

### Service Checks{% #service-checks %}

The Google Kubernetes Engine integration does not include any service checks.

## Troubleshooting{% #troubleshooting %}

Need help? Contact [Datadog support](https://docs.datadoghq.com/help/).