Forwarding logs from your Google Cloud environment enables near real-time monitoring of the resources and activities taking place in your organization or folder. You can set up log monitors to be notified of issues, use Cloud SIEM to detect threats, or leverage Watchdog to identify unknown issues or anomalous behavior.
Logs are forwarded by Google Cloud Dataflow using the Datadog Dataflow template. This approach offers batching and compression of your log events before forwarding them to Datadog, which is the most network-efficient way to forward your logs. You can specify which logs are forwarded with inclusion and exclusion filters.
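To illustrate why batching and compression reduce network usage, here is a minimal sketch that compares compressing log events one at a time against compressing them as a single batch. It uses only the Python standard library. The event fields are hypothetical, and the code is a general illustration of the technique, not the Datadog Dataflow template's implementation.

```python
import gzip
import json


def gzip_size(payload: object) -> int:
    """Return the gzip-compressed size, in bytes, of a JSON-serialized payload."""
    return len(gzip.compress(json.dumps(payload).encode("utf-8")))


# Hypothetical log events; real events would come from Cloud Logging.
events = [
    {"severity": "ERROR", "resource": "gce_instance", "message": f"request {i} failed"}
    for i in range(100)
]

# One compressed request per event versus one compressed request for the whole batch.
individual_bytes = sum(gzip_size(event) for event in events)
batched_bytes = gzip_size(events)

print(f"individual: {individual_bytes} bytes, batched: {batched_bytes} bytes")
```

The batch compresses better because gzip can reuse the field names and values that repeat across events, and the per-request gzip header and trailer are paid once instead of once per event.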
Setup
Quick Start (recommended)
Choose the Quick Start setup method if…
You are setting up log forwarding from Google Cloud for the first time.
You prefer a UI-based workflow and want to minimize the time it takes to create and configure the necessary resources.
You want to automate setup steps in scripts or CI/CD pipelines.
Prerequisite permissions
You must have the following permissions to complete the setup:
After running the script, return to the Google Cloud integration tile.
In the Select Projects section, select the folders and projects to forward logs from. If you select a folder, logs are forwarded from all of its child projects. Note: Only folders and projects that you have the necessary access and permissions for appear in this section. Likewise, folders and projects without a display name do not appear.
In the Dataflow Job Configuration section, specify configuration options for the Dataflow job:
Select deployment settings (Google Cloud region and project to host the created resources—Pub/Sub topics and subscriptions, a log routing sink, a Secret Manager entry, a service account, a Cloud Storage bucket, and a Dataflow job)
Select scaling settings (number of workers and maximum workers)
Select performance settings (maximum number of parallel requests and batch size)
Select execution options
In the Advanced Configuration section, optionally specify the machine type for your Dataflow worker VMs. If no machine type is selected, Dataflow automatically chooses an appropriate machine type based on your job requirements.
Optionally, choose to specify inclusion and exclusion filters using Google Cloud’s logging query language.
Review the steps to be executed in the Complete Setup section. If everything is satisfactory, click Complete Setup.
Terraform
Choose the Terraform setup method if…
You manage infrastructure as code and want to keep the Datadog Google Cloud integration under version control.
You need to configure multiple folders or projects consistently with reusable provider blocks.
You want a repeatable, auditable deployment process that fits into your Terraform-managed environment.
Prerequisite permissions
You must have the following permissions to complete the setup:
In the Select Projects section, select the folders and projects to forward logs from. If you select a folder, logs are forwarded from all of its child projects. Note: Only folders and projects that you have the necessary access and permissions for appear in this section. Likewise, folders and projects without a display name do not appear.
In the Dataflow Job Configuration section, specify configuration options for the Dataflow job:
Select deployment settings (Google Cloud region and project to host the created resources—Pub/Sub topics and subscriptions, a log routing sink, a Secret Manager entry, a service account, a Cloud Storage bucket, and a Dataflow job)
Select scaling settings (maximum workers)
Select performance settings (maximum number of parallel requests and batch size)
Select execution options (Streaming Engine is enabled by default; read more about its benefits)
In the Advanced Configuration section, optionally specify the machine type for your Dataflow worker VMs. If no machine type is selected, Dataflow automatically chooses an appropriate machine type based on your job requirements.
Optionally, choose to specify inclusion and exclusion filters using Google Cloud’s logging query language.
See the instructions on the terraform-gcp-datadog-integration repo to set up and manage the necessary infrastructure through Terraform.
Manual
The instructions in this section guide you through the process of:
Creating a Pub/Sub topic and pull subscription to receive logs from a configured log sink
Creating a custom Dataflow worker service account to provide least privilege to your Dataflow pipeline workers
Creating a log sink to publish logs to the Pub/Sub topic
Creating a Dataflow job using the Datadog template to stream logs from the Pub/Sub subscription to Datadog
You have full control over which logs are sent to Datadog through the logging filters you create in the log sink, including GCE and GKE logs. See Google’s Logging query language page for information about writing filters. For a detailed examination of the created architecture, see Stream logs from Google Cloud to Datadog in the Cloud Architecture Center.
Note: You must enable the Dataflow API to use Google Cloud Dataflow. See Enabling APIs in the Google Cloud documentation for more information.
To collect logs from applications running in GCE or GKE, you can also use the Datadog Agent.
1. Create a Cloud Pub/Sub topic and subscription
Go to the Cloud Pub/Sub console and create a new topic. Select the option Add a default subscription to simplify the setup.
Note: You can also manually configure a Cloud Pub/Sub subscription with the Pull delivery type. If you manually create your Pub/Sub subscription, leave the Enable dead lettering box unchecked. For more details, see Unsupported Pub/Sub features.
Give that topic an explicit name such as export-logs-to-datadog and click Create.
Create an additional topic and default subscription to handle any log messages rejected by the Datadog API. The name of this topic is used within the Datadog Dataflow template as part of the path configuration for the outputDeadletterTopic template parameter. When you have inspected and corrected any issues in the failed messages, send them back to the original export-logs-to-datadog topic by running a Pub/Sub to Pub/Sub template job.
Datadog recommends creating a secret in Secret Manager with your valid Datadog API key value, for later use in the Datadog Dataflow template.
Cloud Pub/Sub topics and subscriptions are subject to Google Cloud quotas and limitations. If your log volume exceeds those limits, Datadog recommends splitting your logs across several topics. See the Monitor the Pub/Sub Log Forwarding section for information on setting up monitor notifications if you approach those limits.
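If you prefer the command line to the console, the topic and subscription setup above could be scripted roughly as follows. This sketch only builds the `gcloud pubsub` commands as argument lists and prints them; it does not run them. The `-deadletter` and `-sub` suffixes and the project ID `my-project` are illustrative assumptions, not names required by Datadog or Google Cloud.

```python
from typing import List


def pubsub_setup_commands(topic: str, project: str) -> List[List[str]]:
    """Build gcloud commands for an input topic, a dead-letter topic, and their subscriptions."""
    deadletter_topic = f"{topic}-deadletter"  # hypothetical naming convention
    commands: List[List[str]] = []
    for name in (topic, deadletter_topic):
        commands.append(
            ["gcloud", "pubsub", "topics", "create", name, f"--project={project}"]
        )
        # Without a push endpoint the subscription uses pull delivery;
        # no dead-letter policy is set on the subscription itself.
        commands.append(
            [
                "gcloud", "pubsub", "subscriptions", "create", f"{name}-sub",
                f"--topic={name}", f"--project={project}",
            ]
        )
    return commands


# "my-project" is a placeholder project ID.
for command in pubsub_setup_commands("export-logs-to-datadog", "my-project"):
    print(" ".join(command))
```

Review the printed commands before running them, for example by passing each list to `subprocess.run` in an environment where `gcloud` is installed and authenticated.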
2. Create a custom Dataflow worker service account
By default, Dataflow pipeline workers use your project’s Compute Engine default service account, which grants permissions to all resources in the project. If you are forwarding logs from a production environment, create a custom worker service account with only the necessary roles and permissions instead, and assign this service account to your Dataflow pipeline workers.
Go to the Service Accounts page in the Google Cloud console and select your project.
Click CREATE SERVICE ACCOUNT and give the service account a descriptive name. Click CREATE AND CONTINUE.
Add the roles in the required permissions table and click DONE.
roles/pubsub.publisher: Allows this service account to publish failed messages to a separate topic, so the logs can be analyzed or resent.
roles/storage.objectAdmin: Allows this service account to read and write to the Cloud Storage bucket specified for staging files.
Note: If you don’t create a custom service account for the Dataflow pipeline workers, ensure that the default Compute Engine service account has the required permissions above.
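The console steps in this section have a rough command-line equivalent, sketched below. The code builds `gcloud` commands as argument lists without executing them. The account ID `datadog-dataflow-worker` and project ID `my-project` are placeholders, and `WORKER_ROLES` contains only the two roles described in this section; extend it with any other roles from the required permissions table.

```python
from typing import List, Sequence

# Only the roles described in this section; the full table may list more.
WORKER_ROLES = ("roles/pubsub.publisher", "roles/storage.objectAdmin")


def worker_account_commands(
    project: str, account_id: str, roles: Sequence[str] = WORKER_ROLES
) -> List[List[str]]:
    """Build gcloud commands that create a service account and grant it project roles."""
    # Service account emails follow the <id>@<project>.iam.gserviceaccount.com pattern.
    email = f"{account_id}@{project}.iam.gserviceaccount.com"
    commands = [
        [
            "gcloud", "iam", "service-accounts", "create", account_id,
            f"--project={project}",
        ]
    ]
    for role in roles:
        commands.append(
            [
                "gcloud", "projects", "add-iam-policy-binding", project,
                f"--member=serviceAccount:{email}", f"--role={role}",
            ]
        )
    return commands


# Placeholder identifiers for illustration only.
for command in worker_account_commands("my-project", "datadog-dataflow-worker"):
    print(" ".join(command))
```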
Choose Cloud Pub/Sub as the destination and select the Cloud Pub/Sub topic that was created for that purpose. Note: The Cloud Pub/Sub topic can be located in a different project.
Choose the logs you want to include in the sink with an optional inclusion or exclusion filter. You can filter the logs with a search query, or use the sample function. For example, to include only 10% of the logs with a severity level of ERROR, create an inclusion filter with severity="ERROR" AND sample(insertId, 0.1).
Click Create Sink.
Note: It is possible to create several exports from Google Cloud Logging to the same Cloud Pub/Sub topic with different sinks.
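As a sketch of how the sampling filter and the Pub/Sub destination fit together, the helpers below build the inclusion filter from the example above and a `gcloud logging sinks create` command that targets a topic. Nothing is executed. The sink name `datadog-export` and project ID `my-project` are hypothetical.

```python
from typing import List


def sampled_severity_filter(severity: str, fraction: float) -> str:
    """Build a filter that keeps a sampled fraction of logs at one severity level."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return f'severity="{severity}" AND sample(insertId, {fraction})'


def sink_command(sink: str, project: str, topic: str, log_filter: str) -> List[str]:
    """Build a gcloud command that creates a log sink publishing to a Pub/Sub topic."""
    destination = f"pubsub.googleapis.com/projects/{project}/topics/{topic}"
    return [
        "gcloud", "logging", "sinks", "create", sink, destination,
        f"--log-filter={log_filter}",
    ]


# Keep roughly 10% of ERROR logs, matching the example filter above.
error_filter = sampled_severity_filter("ERROR", 0.1)
print(error_filter)
print(sink_command("datadog-export", "my-project", "export-logs-to-datadog", error_filter))
```

Because the destination path includes the project ID, the same helper covers the case where the topic lives in a different project from the sink.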
Give the job a name and select a Dataflow regional endpoint.
Select Pub/Sub to Datadog in the Dataflow template dropdown. The Required parameters section appears.
a. Select the input subscription in the Pub/Sub input subscription dropdown.
b. Enter the following in the Datadog Logs API URL field:
https://
Note: Ensure that the Datadog site selector on the right of the page is set to your Datadog site before copying the URL above.
c. Select the topic created to receive message failures in the Output deadletter Pub/Sub topic dropdown.
d. Specify a path for temporary files in your storage bucket in the Temporary location field.
Under Optional Parameters, check Include full Pub/Sub message in the payload.
If you created a secret in Secret Manager with your Datadog API key value as mentioned in step 1, enter the resource name of the secret in the Google Cloud Secret Manager ID field.
See Template parameters in the Dataflow template for details on using the other available options:
apiKeySource=KMS with apiKeyKMSEncryptionKey set to your Cloud KMS key ID and apiKey set to the encrypted API key
Not recommended: apiKeySource=PLAINTEXT with apiKey set to the plaintext API key
If you created a custom worker service account, select it in the Service account email dropdown.
Click RUN JOB.
Note: If you have a shared VPC, see the Specify a network and subnetwork page in the Dataflow documentation for guidelines on specifying the Network and Subnetwork parameters.
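The form fields in this section correspond to template parameters. The sketch below assembles them into a dictionary, assuming the Secret Manager option for the API key. The project, subscription, topic, and secret names are placeholders, and the Logs API URL is passed in by the caller because it depends on your Datadog site. Check the parameter names against the template's documentation before relying on them.

```python
from typing import Dict, Optional


def datadog_template_parameters(
    project: str,
    subscription: str,
    deadletter_topic: str,
    logs_api_url: str,
    secret_name: Optional[str] = None,
    secret_version: str = "latest",
) -> Dict[str, str]:
    """Assemble parameters for the Pub/Sub to Datadog Dataflow template."""
    params = {
        "inputSubscription": f"projects/{project}/subscriptions/{subscription}",
        "url": logs_api_url,
        "outputDeadletterTopic": f"projects/{project}/topics/{deadletter_topic}",
        # Corresponds to "Include full Pub/Sub message in the payload".
        "includePubsubMessage": "true",
    }
    if secret_name is not None:
        params["apiKeySource"] = "SECRET_MANAGER"
        params["apiKeySecretId"] = (
            f"projects/{project}/secrets/{secret_name}/versions/{secret_version}"
        )
    return params


# All values below are placeholders, including the URL.
parameters = datadog_template_parameters(
    project="my-project",
    subscription="export-logs-to-datadog-sub",
    deadletter_topic="export-logs-to-datadog-deadletter",
    logs_api_url="https://<your-datadog-logs-api-url>",
    secret_name="datadog-api-key",
)
for key, value in parameters.items():
    print(f"{key}={value}")
```

Keeping the API key in Secret Manager means only the secret's resource name appears in the job configuration, never the key itself.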
gcp.pubsub.subscription.num_undelivered_messages for the number of messages pending delivery
gcp.pubsub.subscription.oldest_unacked_message_age for the age of the oldest unacknowledged message in a subscription
Use the metrics above with a metric monitor to receive alerts for the messages in your input and deadletter subscriptions.
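As a starting point for such a monitor, the helper below formats a metric monitor query for one subscription. The `subscription_id` tag key, the 300-second threshold, and the subscription name are assumptions for illustration. Confirm the tags present on these metrics in your account and choose thresholds that fit your log volume.

```python
def pubsub_monitor_query(
    metric: str, subscription: str, threshold: int, window: str = "last_5m"
) -> str:
    """Format a metric monitor query that alerts when a metric exceeds a threshold."""
    # The subscription_id tag key is an assumption; verify it on your metrics.
    scope = f"subscription_id:{subscription}"
    return f"max({window}):max:{metric}{{{scope}}} > {threshold}"


# Alert when the oldest unacknowledged message is more than 300 seconds old.
query = pubsub_monitor_query(
    "gcp.pubsub.subscription.oldest_unacked_message_age",
    "export-logs-to-datadog-sub",
    300,
)
print(query)
```

A growing message age on the input subscription suggests the Dataflow job is falling behind, while any backlog on the dead-letter subscription points to messages rejected by the Datadog API.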
Monitor the Dataflow pipeline
Use Datadog’s Google Cloud Dataflow integration to monitor all aspects of your Dataflow pipelines. You can see all your key Dataflow metrics on the out-of-the-box dashboard, enriched with contextual data such as information about the GCE instances running your Dataflow workloads, and your Pub/Sub throughput.