Google Cloud Run

Overview

Cloud Run is a managed compute platform that enables you to run stateless containers that are invocable using HTTP requests.

Enable this integration and instrument your container to see all of your Cloud Run metrics, traces, and logs in Datadog.

For more information about Cloud Run for Anthos, see the Google Cloud Run for Anthos documentation.

Setup

Metric collection

Installation

Set up the Google Cloud Platform integration to begin collecting out-of-the-box metrics. To set up custom metrics, see the Serverless documentation.

Log collection

Integration

Google Cloud Run also exposes audit logs. Google Cloud Run logs are collected with Google Cloud Logging and sent to a Dataflow job through a Cloud Pub/Sub topic. If you haven’t already, set up logging with the Datadog Dataflow template.

Once this is done, export your Google Cloud Run logs from Google Cloud Logging to the Pub/Sub topic:

1. Go to the Google Cloud Logging page and filter Google Cloud Run logs.
2. Click Create Sink and name the sink accordingly.
3. Choose “Cloud Pub/Sub” as the destination and select the Pub/Sub topic that was created for that purpose. Note: The Pub/Sub topic can be located in a different project.
4. Click Create and wait for the confirmation message to show up.
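
The console steps above can also be scripted. The following is a minimal sketch, not an official procedure: it only assembles a `gcloud logging sinks create` command line so you can review it before running it. The sink name, project ID, and topic name are hypothetical placeholders, and the filter assumes you want logs from Cloud Run services, whose Cloud Logging resource type is `cloud_run_revision`.

```python
import shlex


def build_sink_command(sink_name, topic_project, topic_name,
                       resource_type="cloud_run_revision"):
    """Return the argv list for a `gcloud logging sinks create` call.

    All argument values are supplied by the caller; nothing is executed here.
    """
    # Pub/Sub sink destinations use this URI form. The topic's project may
    # differ from the project that owns the sink.
    destination = (
        f"pubsub.googleapis.com/projects/{topic_project}/topics/{topic_name}"
    )
    # Restrict the sink to Cloud Run logs (assumption: services only).
    log_filter = f'resource.type="{resource_type}"'
    return [
        "gcloud", "logging", "sinks", "create",
        sink_name, destination,
        f"--log-filter={log_filter}",
    ]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    cmd = build_sink_command(
        "cloud-run-to-datadog", "my-project", "export-logs-to-datadog"
    )
    # Print a shell-safe command line instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it keeps the sketch free of side effects; pass the list to `subprocess.run` only after confirming the names match your own project and topic.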

Direct Logging

For more information about direct application logging to Datadog from your Cloud Run services, see the Serverless documentation.

Tracing

For specialized Agent setup instructions for fully managed Google Cloud Run, see the Serverless documentation.

Data Collected

Metrics


gcp.run.container.billable_instance_time (rate)	Billable time aggregated from all container instances of the revision (ms/s). Shown as millisecond
gcp.run.container.completed_probe_attempt_count (count)	Number of completed health check probe attempts and their results.
gcp.run.container.completed_probe_count (count)	Number of completed health check probes and their results.
gcp.run.container.containers (gauge)	Number of container instances that exist, broken down by state.
gcp.run.container.cpu.allocation_time (rate)	Container CPU allocation of the revision in seconds. Shown as core
gcp.run.container.cpu.usage.avg (gauge)	The average actual container CPU usage in CPU seconds broken down by the metric field, container name. Shown as second
gcp.run.container.cpu.usage.samplecount (gauge)	The sample count for actual container CPU usage in CPU seconds broken down by the metric field, container name. Shown as second
gcp.run.container.cpu.usage.sumsqdev (gauge)	The sum of squared deviation for actual container CPU usage in CPU seconds broken down by the metric field, container name. Shown as second
gcp.run.container.cpu.utilizations.avg (gauge)	The average of the container CPU utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.cpu.utilizations.p95 (gauge)	The 95th percentile of the container CPU utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.cpu.utilizations.p99 (gauge)	The 99th percentile of the container CPU utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.cpu.utilizations.samplecount (count)	Sample count of the container CPU utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.gpu.memory_usages.avg (gauge)	The average container GPU memory usage distribution across all container instances. Shown as byte
gcp.run.container.gpu.memory_usages.samplecount (gauge)	The sample count for container GPU memory usage distribution across all container instances. Shown as byte
gcp.run.container.gpu.memory_usages.sumsqdev (gauge)	The sum of squared deviation for container GPU memory usage distribution across all container instances. Shown as byte
gcp.run.container.gpu.memory_utilizations.avg (gauge)	The average container GPU memory utilization distribution across all container instances.
gcp.run.container.gpu.memory_utilizations.samplecount (gauge)	The sample count for container GPU memory utilization distribution across all container instances.
gcp.run.container.gpu.memory_utilizations.sumsqdev (gauge)	The sum of squared deviation for container GPU memory utilization distribution across all container instances.
gcp.run.container.gpu.utilizations.avg (gauge)	The average container GPU utilization distribution across all container instances.
gcp.run.container.gpu.utilizations.samplecount (gauge)	The sample count for container GPU utilization distribution across all container instances.
gcp.run.container.gpu.utilizations.sumsqdev (gauge)	The sum of squared deviation for container GPU utilization distribution across all container instances.
gcp.run.container.instance_count (gauge)	The number of container instances that exist, broken down by state. Shown as container
gcp.run.container.max_request_concurrencies.avg (gauge)	Average of the maximum number of concurrent requests being served by each container instance over a minute. Shown as request
gcp.run.container.max_request_concurrencies.p95 (gauge)	95th percentile of the distribution of the maximum number of concurrent requests being served by each container instance over a minute. Shown as request
gcp.run.container.max_request_concurrencies.p99 (gauge)	99th percentile of the distribution of the maximum number of concurrent requests being served by each container instance over a minute. Shown as request
gcp.run.container.max_request_concurrencies.samplecount (count)	Sample count of the distribution of the maximum number of concurrent requests being served by each container instance over a minute. Shown as request
gcp.run.container.memory.allocation_time (rate)	Container memory allocation of the revision in Gigabyte-seconds. Shown as gibibyte
gcp.run.container.memory.usage.avg (gauge)	The average actual container memory usage in bytes broken down by the metric field, container name. Shown as byte
gcp.run.container.memory.usage.samplecount (gauge)	The sample count for actual container memory usage in bytes broken down by the metric field, container name. Shown as byte
gcp.run.container.memory.usage.sumsqdev (gauge)	The sum of squared deviation for actual container memory usage in bytes broken down by the metric field, container name. Shown as byte
gcp.run.container.memory.utilizations.avg (gauge)	Average of the container memory utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.memory.utilizations.p95 (gauge)	95th percentile of the container memory utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.memory.utilizations.p99 (gauge)	99th percentile of the container memory utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.memory.utilizations.samplecount (count)	Sample count of the container memory utilization distribution across all container instances of the revision. Shown as fraction
gcp.run.container.network.received_bytes_count (count)	The incoming socket and HTTP response traffic of the revision, in bytes. Shown as byte
gcp.run.container.network.sent_bytes_count (count)	The outgoing socket and HTTP response traffic of the revision, in bytes. Shown as byte
gcp.run.container.network.throttled_inbound_bytes_count (count)	Inbound bytes dropped due to network throttling. Shown as byte
gcp.run.container.network.throttled_inbound_packets_count (count)	Inbound packets dropped due to network throttling. Shown as byte
gcp.run.container.network.throttled_outbound_bytes_count (count)	Outbound bytes dropped due to network throttling. Shown as byte
gcp.run.container.network.throttled_outbound_packets_count (count)	Outbound packets dropped due to network throttling. Shown as byte
gcp.run.container.probe_attempt_latencies.avg (count)	The average of the distribution of time spent running a single probe attempt before success or failure in milliseconds. Shown as millisecond
gcp.run.container.probe_attempt_latencies.samplecount (count)	The sample count for distribution of time spent running a single probe attempt before success or failure in milliseconds. Shown as millisecond
gcp.run.container.probe_attempt_latencies.sumsqdev (count)	The sum of squared deviation for distribution of time spent running a single probe attempt before success or failure in milliseconds. Shown as millisecond
gcp.run.container.probe_latencies.avg (count)	The average of the distribution of time spent running a probe before success or failure in milliseconds. Shown as millisecond
gcp.run.container.probe_latencies.samplecount (count)	The sample count for distribution of time spent running a probe before success or failure in milliseconds. Shown as millisecond
gcp.run.container.probe_latencies.sumsqdev (count)	The sum of squared deviation for distribution of time spent running a probe before success or failure in milliseconds. Shown as millisecond
gcp.run.container.startup_latencies.avg (count)	The average of the distribution of time spent starting a new container instance in milliseconds. Shown as millisecond
gcp.run.container.startup_latencies.samplecount (count)	The sample count for distribution of time spent starting a new container instance in milliseconds. Shown as millisecond
gcp.run.container.startup_latencies.sumsqdev (count)	The sum of squared deviation for distribution of time spent starting a new container instance in milliseconds. Shown as millisecond
gcp.run.infrastructure.cloudsql.connection_latencies.avg (count)	The average of the distribution of latency in microseconds for connections originating from Cloud Run to CloudSQL. Shown as microsecond
gcp.run.infrastructure.cloudsql.connection_latencies.samplecount (count)	The sample count for distribution of latency in microseconds for connections originating from Cloud Run to CloudSQL. Shown as microsecond
gcp.run.infrastructure.cloudsql.connection_latencies.sumsqdev (count)	The sum of squared deviation for distribution of latency in microseconds for connections originating from Cloud Run to CloudSQL. Shown as microsecond
gcp.run.infrastructure.cloudsql.connection_refused_count (count)	Total number of connections refused originating from Cloud Run to CloudSQL.
gcp.run.infrastructure.cloudsql.connection_request_count (count)	Total number of connection requests originating from Cloud Run to CloudSQL.
gcp.run.infrastructure.cloudsql.received_bytes_count (count)	Number of bytes received by Cloud Run from CloudSQL over the network. Shown as byte
gcp.run.infrastructure.cloudsql.sent_bytes_count (count)	Number of bytes sent by Cloud Run to CloudSQL over the network. Shown as byte
gcp.run.job.completed_execution_count (count)	Number of completed job executions and their results.
gcp.run.job.completed_task_attempt_count (count)	Number of completed task attempts and their corresponding exit results.
gcp.run.job.running_executions (gauge)	Number of running job executions.
gcp.run.job.running_task_attempts (gauge)	Number of running task attempts.
gcp.run.pending_queue.pending_requests (gauge)	Number of pending requests.
gcp.run.request_count (count)	The number of service requests. Shown as request
gcp.run.request_latencies.avg (gauge)	Average of the distribution of service request times in milliseconds. Shown as millisecond
gcp.run.request_latencies.p95 (gauge)	The 95th percentile of the distribution of service request times in milliseconds. Shown as millisecond
gcp.run.request_latencies.p99 (gauge)	The 99th percentile of the distribution of service request times in milliseconds. Shown as millisecond
gcp.run.request_latencies.samplecount (count)	Sample count of the distribution of service request times in milliseconds. Shown as millisecond
gcp.run.request_latencies.sumsqdev (gauge)	Sum of squared deviation of the distribution of service request times in milliseconds. Shown as millisecond
gcp.run.enhanced.cold_start (count)	Number of times a container or function initialized with a cold start.
gcp.run.enhanced.shutdown (count)	Number of times a container or function shut down.
gcp.run.job.enhanced.task.started (count)	Number of times a task started.
gcp.run.job.enhanced.task.ended (count)	Number of times a task ended.
gcp.run.job.enhanced.task.duration (gauge)	Average duration of one task in the execution. Shown as millisecond
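
Several metrics in the list above are reported as components of a distribution (`.avg`, `.samplecount`, `.sumsqdev`). As a rough illustration of how these components relate, the sketch below derives a standard deviation from a sample count and a sum of squared deviations. It assumes that `.sumsqdev` is the sum of squared deviations from the mean and that both values cover the same time interval; the function name and the sample values are hypothetical.

```python
import math


def distribution_stddev(sample_count, sum_sq_dev):
    """Population standard deviation from a count and a sum of squared deviations.

    Returns None when there are no samples, because the spread is undefined.
    """
    if sample_count <= 0:
        return None
    # variance = sum of squared deviations from the mean / number of samples
    return math.sqrt(sum_sq_dev / sample_count)


if __name__ == "__main__":
    # Hypothetical values for one interval of gcp.run.request_latencies.*
    avg_ms = 120.0          # .avg
    count = 400             # .samplecount
    sumsqdev = 1_000_000.0  # .sumsqdev (ms squared, under the stated assumption)

    std_ms = distribution_stddev(count, sumsqdev)
    print(f"mean={avg_ms} ms, stddev={std_ms} ms")
```

The standard deviation gives a sense of spread that the average alone hides, which can be useful alongside the `.p95` and `.p99` series where they are available.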