Google Cloud Platform

Overview

Connect to Google Cloud Platform to see all your Google Compute Engine (GCE) hosts in Datadog. You’ll see your hosts in the infrastructure overview in Datadog and be able to sort through them easily, since Datadog automatically tags them with GCE host tags and any GCP labels you’ve added.

Related integrations include:

  • App Engine: platform as a service to build scalable applications
  • BigQuery: enterprise data warehouse
  • Cloud SQL: MySQL database service
  • Compute Engine: high-performance virtual machines
  • Container Engine: Kubernetes, managed by Google
  • Datastore: NoSQL database
  • Firebase: mobile platform for application development
  • Functions: serverless platform for building event-based microservices
  • Machine Learning: machine learning services
  • Pub/Sub: real-time messaging service
  • Spanner: horizontally scalable, globally consistent, relational database service
  • Stackdriver Logging: real-time log management and analysis
  • Storage: unified object storage
  • VPN: managed network functionality

Setup

Installation

The Datadog <> Google Cloud integration uses Service Accounts to create an API connection between Google Cloud and Datadog. Below are instructions for creating the service account and providing Datadog the service account credentials to begin making API calls on your behalf.

  1. Navigate to the Google Cloud credentials page for the Google Cloud project where you would like to set up the Datadog integration
  2. Press Create credentials and then select Service account key

  3. In the Service account dropdown, select New service account

  4. Give the service account a unique name

  5. For Role, select Compute Engine > Compute Viewer and Monitoring > Monitoring Viewer

    Note: these roles allow Datadog to collect metrics, tags, events, and GCE labels on your behalf.

  6. Select JSON as the key type, and press Create

  7. Note where this file is saved; it is needed to complete the integration

  8. Navigate to the Datadog Google Cloud Integration tile

  9. Select Upload Key File to integrate this project with Datadog

  10. Optionally, use tags to exclude hosts from this integration. Detailed instructions are in the Configuration section

  11. Press Install/Update

  12. For each project you want to monitor, repeat this process

Google Cloud billing, the Stackdriver Monitoring API and the Compute Engine API must all be enabled for the project(s) you wish to monitor.
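
Before uploading the key file in step 9, it can help to confirm that the downloaded file really is a service account key. The following is a minimal sketch, not part of the Datadog integration itself: the helper names `missing_key_fields` and `check_key_file` are hypothetical, and the check only looks for the standard fields a Google Cloud service account JSON key contains.

```python
import json

# Fields present in a Google Cloud service account JSON key file.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def missing_key_fields(key: dict) -> list:
    """Return the required fields that are absent or empty in a parsed key file."""
    missing = [name for name in REQUIRED_FIELDS if not key.get(name)]
    # A service account key declares its type explicitly.
    if key.get("type") and key["type"] != "service_account":
        missing.append("type (expected 'service_account')")
    return missing


def check_key_file(path: str) -> list:
    """Load the key file saved in step 7 (hypothetical path) and check it."""
    with open(path, encoding="utf-8") as handle:
        return missing_key_fields(json.load(handle))
```

An empty list means the file has the expected shape; it does not prove that the roles from step 5 were granted or that the required APIs are enabled.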

Configuration

Optionally, you can limit the GCE instances that are pulled into Datadog by entering tags in the Limit Metric Collection textbox. Only hosts that match one of the defined tags will be imported into Datadog. You can use wildcards (‘?’ for single character, ‘*’ for multi-character) to match many hosts, or ‘!’ to exclude certain hosts. This example includes all c1 sized instances, but excludes staging hosts:

datadog:monitored,env:production,!env:staging,instance-type:c1.*
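
To reason about which hosts a filter like the one above would admit, the matching rules can be approximated locally. This is a rough sketch under stated assumptions, not Datadog's actual implementation: the function `host_is_collected` is hypothetical, the wildcards are emulated with Python's `fnmatch` (which treats `?` and `*` the same way but also gives `[` a special meaning), and exclusions are assumed to take precedence over inclusions.

```python
from fnmatch import fnmatchcase


def host_is_collected(host_tags, filter_string):
    """Approximate the Limit Metric Collection filter for one host's tags."""
    patterns = [p.strip() for p in filter_string.split(",") if p.strip()]
    include = [p for p in patterns if not p.startswith("!")]
    exclude = [p[1:] for p in patterns if p.startswith("!")]

    def matches_any(candidates):
        return any(fnmatchcase(tag, pat) for tag in host_tags for pat in candidates)

    # Assumption: a host matching any '!' pattern is dropped, even if it
    # also matches an include pattern.
    if matches_any(exclude):
        return False
    # With no include patterns, only the exclusions apply.
    return matches_any(include) if include else True


FILTER = "datadog:monitored,env:production,!env:staging,instance-type:c1.*"
print(host_is_collected(["env:production", "instance-type:c1.xlarge"], FILTER))
print(host_is_collected(["datadog:monitored", "env:staging"], FILTER))
```

Under these assumptions the first host is collected and the second is not, because it carries the excluded `env:staging` tag.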

Data Collected

Metrics

gcp.bigtable.cluster.cpu_load
(gauge)
CPU load of a cluster.
shown as percent
gcp.bigtable.cluster.cpu_load_hottest_node
(gauge)
CPU load of the busiest node in a cluster.
shown as percent
gcp.bigtable.cluster.disk_load
(gauge)
Utilization of HDD disks in a cluster.
shown as percent
gcp.bigtable.cluster.node_count
(gauge)
Number of nodes in a cluster.
shown as node
gcp.bigtable.cluster.storage_utilization
(gauge)
Storage used as a fraction of total storage capacity.
shown as percent
gcp.bigtable.disk.bytes_used
(gauge)
Amount of compressed data for tables stored in a cluster.
shown as byte
gcp.bigtable.server.error_count
(gauge)
Number of server requests for a table that failed with an error.
shown as error
gcp.bigtable.server.latencies.avg
(gauge)
Distribution of server request latencies for a table.
shown as millisecond
gcp.bigtable.server.latencies.samplecount
(gauge)
Sample count of server request latencies for a table.
shown as millisecond
gcp.bigtable.server.latencies.sumsqdev
(gauge)
Sum of squared deviation of server request latencies for a table.
shown as second
gcp.bigtable.server.modified_rows_count
(gauge)
Number of rows modified by server requests for a table.
gcp.bigtable.server.received_bytes_count
(gauge)
Number of uncompressed bytes of request data received by servers for a table.
shown as byte
gcp.bigtable.server.request_count
(gauge)
Number of server requests for a table.
shown as request
gcp.bigtable.server.returned_rows_count
(gauge)
Number of rows returned by server requests for a table.
gcp.bigtable.server.sent_bytes_count
(gauge)
Number of uncompressed bytes of response data sent by servers for a table.
shown as byte
gcp.bigtable.table.bytes_used
(gauge)
Amount of compressed data stored in a table.
shown as byte
gcp.loadbalancing.https.backend_latencies.avg
(gauge)
Average latency of request sent by the proxy to backend until proxy receives last byte of response from backend.
shown as millisecond
gcp.loadbalancing.https.backend_latencies.samplecount
(count)
Sample Count of latency of request sent by the proxy to backend until proxy receives last byte of response from backend.
shown as millisecond
gcp.loadbalancing.https.backend_latencies.sumsqdev
(gauge)
Sum of Squared Deviation for latency of request sent by the proxy to backend until proxy receives last byte of response from backend.
shown as millisecond
gcp.loadbalancing.https.frontend_tcp_rtt.avg
(gauge)
Average RTT for each connection between client and proxy.
shown as millisecond
gcp.loadbalancing.https.frontend_tcp_rtt.samplecount
(count)
Sample Count of RTT for each connection between client and proxy.
shown as millisecond
gcp.loadbalancing.https.frontend_tcp_rtt.sumsqdev
(gauge)
Sum of Squared Deviation of RTT for each connection between client and proxy.
shown as millisecond
gcp.loadbalancing.https.request_bytes_count
(count)
Bytes sent as requests from clients to L7 load balancer.
shown as byte
gcp.loadbalancing.https.request_count
(count)
Number of requests served by L7 load balancer.
shown as request
gcp.loadbalancing.https.response_bytes_count
(count)
Bytes sent as responses from L7 load balancer to clients.
shown as byte
gcp.loadbalancing.https.total_latencies.avg
(gauge)
Average latency calculated from request received by proxy until proxy sees ACK from client on last response byte.
shown as millisecond
gcp.loadbalancing.https.total_latencies.samplecount
(count)
Sample Count of latency calculated from request received by proxy until proxy sees ACK from client on last response byte.
shown as millisecond
gcp.loadbalancing.https.total_latencies.sumsqdev
(gauge)
Sum of Squared Deviation of latency calculated from request received by proxy until proxy sees ACK from client on last response byte.
shown as millisecond
gcp.loadbalancing.tcp_ssl_proxy.closed_connections
(count)
Number of connections that were terminated over TCP/SSL proxy.
shown as connection
gcp.loadbalancing.tcp_ssl_proxy.egress_bytes_count
(count)
Number of bytes sent from VM to client using proxy.
shown as byte
gcp.loadbalancing.tcp_ssl_proxy.frontend_tcp_rtt.avg
(gauge)
Average smoothed RTT measured by the proxy's TCP stack. Each minute application layer bytes pass from proxy to client.
shown as millisecond
gcp.loadbalancing.tcp_ssl_proxy.frontend_tcp_rtt.samplecount
(count)
Sample count of smoothed RTT measured by the proxy's TCP stack. Each minute application layer bytes pass from proxy to client.
shown as millisecond
gcp.loadbalancing.tcp_ssl_proxy.frontend_tcp_rtt.sumsqdev
(gauge)
Sum of squared deviation of smoothed RTT measured by the proxy's TCP stack. Each minute application layer bytes pass from proxy to client.
shown as millisecond
gcp.loadbalancing.tcp_ssl_proxy.ingress_bytes_count
(count)
Number of bytes sent from client to VM using proxy.
shown as byte
gcp.loadbalancing.tcp_ssl_proxy.new_connections
(count)
Number of connections that were created over TCP/SSL proxy.
shown as connection
gcp.loadbalancing.tcp_ssl_proxy.open_connections
(count)
Current number of outstanding connections through the TCP/SSL proxy.
shown as connection
gcp.interconnect.network.attachment.received_bytes_count
(count)
Number of inbound bytes received.
shown as byte
gcp.interconnect.network.attachment.received_packets_count
(count)
Number of inbound packets received.
shown as packet
gcp.interconnect.network.attachment.sent_bytes_count
(count)
Number of outbound bytes sent.
shown as byte
gcp.interconnect.network.attachment.sent_packets_count
(count)
Number of outbound packets sent.
shown as packet
gcp.interconnect.network.interconnect.capacity
(gauge)
Active capacity of the interconnect.
shown as byte
gcp.interconnect.network.interconnect.dropped_packets_count
(count)
Number of outbound packets dropped due to link congestion.
shown as packet
gcp.interconnect.network.interconnect.link.operational
(gauge)
Whether the operational status of the circuit is up.
gcp.interconnect.network.interconnect.link.rx_power
(gauge)
Light level received over physical circuit.
gcp.interconnect.network.interconnect.link.tx_power
(gauge)
Light level transmitted over physical circuit.
gcp.interconnect.network.interconnect.operational
(gauge)
Whether the operational status of the interconnect is up.
gcp.interconnect.network.interconnect.receive_errors_count
(count)
Number of errors encountered while receiving packets.
shown as error
gcp.interconnect.network.interconnect.received_bytes_count
(count)
Number of inbound bytes received.
shown as byte
gcp.interconnect.network.interconnect.received_unicast_packets_count
(count)
Number of inbound unicast packets received.
shown as packet
gcp.interconnect.network.interconnect.send_errors_count
(count)
Number of errors encountered while sending packets.
shown as error
gcp.interconnect.network.interconnect.sent_bytes_count
(count)
Number of outbound bytes sent.
shown as byte
gcp.interconnect.network.interconnect.sent_unicast_packets_count
(count)
Number of outbound unicast packets sent.
shown as packet
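
Several metrics above come in `.avg`, `.samplecount`, and `.sumsqdev` triples. Assuming `.sumsqdev` follows the usual Stackdriver distribution definition (the sum of squared differences between each sample and the mean), the spread of a latency distribution can be recovered from two of the three values. A minimal sketch; the function name `latency_stddev` is hypothetical:

```python
import math


def latency_stddev(samplecount, sumsqdev):
    """Population standard deviation from a distribution's count and sum of squared deviation."""
    if samplecount <= 0:
        # No samples in the interval: report zero spread rather than dividing by zero.
        return 0.0
    return math.sqrt(sumsqdev / samplecount)
```

For example, four samples of 1, 2, 3, and 4 ms have a mean of 2.5 ms and a sum of squared deviation of 5.0, so the function returns the square root of 1.25, roughly 1.12 ms.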

Events

All service events generated by Google Cloud Platform are forwarded to your Datadog event stream. Other events captured in Stackdriver are not currently available, but are planned for the Datadog Log Management product.

Service Checks

The Google Cloud Platform integration does not include any service checks at this time.

Troubleshooting

Incorrect metadata for user defined gcp.logging metrics?

For non-standard gcp.logging metrics (that is, metrics beyond the out-of-the-box logging metrics), the metadata Datadog applies may not be consistent with Stackdriver.

In these cases, the metadata should be manually set by navigating to the metric summary page, searching and selecting the metric in question, and clicking the Pencil icon next to metadata.
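
If many metrics need correcting, the same change can in principle be scripted instead of made by hand. The sketch below only builds the HTTP request and does not send it. It assumes the Datadog v1 metric metadata endpoint (`PUT /api/v1/metrics/{metric_name}`) and its `DD-API-KEY` / `DD-APPLICATION-KEY` headers; the function name, the metric name, and the key values are all hypothetical placeholders.

```python
import json
import urllib.parse
import urllib.request


def build_metadata_request(metric_name, metric_type, unit, api_key, app_key,
                           site="https://api.datadoghq.com"):
    """Build (but do not send) a request that sets a metric's type and unit."""
    url = f"{site}/api/v1/metrics/{urllib.parse.quote(metric_name, safe='')}"
    body = json.dumps({"type": metric_type, "unit": unit}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "DD-API-KEY": api_key,
            "DD-APPLICATION-KEY": app_key,
        },
    )


# Hypothetical user-defined logging metric and placeholder credentials.
request = build_metadata_request(
    "gcp.logging.user.checkout_errors", "gauge", "error", "<API_KEY>", "<APP_KEY>"
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would submit it; check the current Datadog API documentation before doing so.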

Need help? Contact Datadog Support.

Further Reading

Knowledge Base

Tags Assigned

Tags are automatically assigned based on a variety of Google Cloud Platform and Google Compute Engine configuration options. The following tags are automatically assigned:

  • Zone
  • Instance-type
  • Instance-id
  • Automatic-restart
  • On-host-maintenance
  • Project
  • Numeric_project_id
  • Name

Also, any hosts with <key>:<value> labels get tagged with them.
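
As an illustration of how the automatic tags and GCP labels combine, the sketch below turns an instance's properties and labels into `key:value` tag strings. It is only a model of the behavior described above: the function `gce_host_tags`, the lowercase key spellings, and the sample values are assumptions, and the exact formatting Datadog applies may differ.

```python
def gce_host_tags(instance, labels):
    """Combine automatic GCE properties and user labels into key:value tags."""
    # Automatic tags first, then any labels attached to the host.
    tags = [f"{key}:{value}" for key, value in sorted(instance.items())]
    tags += [f"{key}:{value}" for key, value in sorted(labels.items())]
    return tags


# Hypothetical instance properties and labels.
instance = {"zone": "us-central1-a", "instance-type": "n1-standard-1", "project": "my-project"}
labels = {"team": "payments"}
print(gce_host_tags(instance, labels))
```

Tags built this way are the same strings that the Limit Metric Collection filter is matched against.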