The Observability Pipelines Datadog Archives destination is in beta.
Overview
The Observability Pipelines datadog_archives destination formats logs into a Datadog-rehydratable format and then routes them to Log Archives. These logs are not ingested into Datadog, but are routed directly to the archive. You can then rehydrate the archive in Datadog when you need to analyze and investigate them.
The Observability Pipelines Datadog Archives destination is useful when:
You have a high volume of noisy logs, but you may need to index them in Log Management ad hoc.
You have a retention policy that requires logs to be retained long term.
For example, in the first diagram, some logs are sent to cloud storage for archiving and others to Datadog for analysis and investigation. However, the logs sent directly to cloud storage cannot be rehydrated in Datadog when you need to investigate them.
In the second diagram, all logs go to the Datadog Agent, including the logs that went to cloud storage in the first diagram. However, in the second scenario, before the logs are ingested into Datadog, the datadog_archives destination formats and routes the logs that would have gone directly to cloud storage to Datadog Log Archives instead. The logs in Log Archives can be rehydrated in Datadog when needed.
Copy the policy below and paste it into the Policy editor. Replace <MY_BUCKET_NAME> and <MY_BUCKET_NAME_1_/_MY_OPTIONAL_BUCKET_PATH_1> with the information for the S3 bucket you created earlier.
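A minimal sketch of such a policy, assuming the archive only needs write and read access to the archive path plus list access on the bucket for rehydration (the statement IDs are illustrative):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DatadogUploadAndRehydrateLogArchives",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": "arn:aws:s3:::<MY_BUCKET_NAME_1_/_MY_OPTIONAL_BUCKET_PATH_1>/*"
    },
    {
      "Sid": "DatadogRehydrateLogArchivesListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::<MY_BUCKET_NAME>"
    }
  ]
}
```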
Choose the IAM policy you created earlier to attach to the new IAM user.
Click Next.
Optionally, add tags.
Click Create user.
Create access credentials for the new IAM user. Save these credentials as AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY.
Create a service account
Create a service account to use the policy you created above. In the Helm configuration, replace ${DD_ARCHIVES_SERVICE_ACCOUNT} with the name of the service account.
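If you are running the Worker on Amazon EKS, one common approach is IAM Roles for Service Accounts (IRSA): create a service account annotated with an IAM role that has the policy above attached, and use that service account's name in the Helm configuration. A minimal sketch, with placeholder names and ARNs:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  # Use this name wherever ${DD_ARCHIVES_SERVICE_ACCOUNT} appears in the
  # Helm configuration. The name and namespace here are placeholders.
  name: opw-archives
  namespace: observability-pipelines
  annotations:
    # IRSA: the IAM role referenced here must have the archive policy attached.
    eks.amazonaws.com/role-arn: arn:aws:iam::<AWS_ACCOUNT_ID>:role/<ARCHIVES_ROLE_NAME>
```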
Create an IAM user
Create an IAM user and attach the IAM policy you created earlier to it.
Add a query that filters out all logs going through log pipelines so that none of those logs go into this archive. For example, add the query observability_pipelines_read_only_archive, assuming no logs going through the pipeline have that tag added.
Select AWS S3.
Select the AWS Account that your bucket is in.
Enter the name of the S3 bucket.
Optionally, enter a path.
Check the confirmation statement.
Optionally, add tags and define the maximum scan size for rehydration. See Advanced settings for more information.
If the Worker is ingesting logs that are not coming from the Datadog Agent and are routed to the Datadog Archives destination, those logs are not tagged with reserved attributes. This means that you lose Datadog telemetry and the benefits of unified service tagging. For example, say your syslogs are sent to datadog_archives and those logs have their status tagged as severity instead of the reserved attribute status, and their host tagged as hostname instead of the reserved attribute host. When these logs are rehydrated in Datadog, the status for all of the logs is set to info and none of the logs have a hostname tag.
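If you manage the Worker configuration file directly, one way to avoid this is to remap those fields to the reserved attributes before the logs reach the destination. A minimal sketch, assuming a Vector-style remap transform and a hypothetical syslog source named syslog_source:

```yaml
transforms:
  remap_reserved_attributes:
    type: remap
    inputs:
      - syslog_source   # hypothetical syslog source component
    source: |
      # Move non-reserved fields onto Datadog reserved attributes so that
      # rehydrated logs keep their status and host.
      .status = del(.severity)
      .host = del(.hostname)
```

Connect the Datadog Archives destination to this transform instead of directly to the source so the remapped logs are what gets archived.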
Configuration file
For manual deployments, the sample pipelines configuration file for Datadog includes a sink for sending logs to Amazon S3 in a Datadog-rehydratable format.
In the sample pipelines configuration file, replace AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY with the AWS credentials you created earlier.
Replace the ${DD_ARCHIVES_BUCKET} and ${DD_ARCHIVES_REGION} parameters based on your S3 configuration.
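The authoritative layout is the sample configuration file you downloaded; as an illustrative sketch only (the component names and field nesting here are assumptions), the relevant sink section resembles the following:

```yaml
sinks:
  datadog_archives:
    type: datadog_archives
    inputs:
      - datadog_agent_source   # illustrative upstream component name
    service: aws_s3
    bucket: ${DD_ARCHIVES_BUCKET}
    region: ${DD_ARCHIVES_REGION}
    auth:
      access_key_id: ${AWS_ACCESS_KEY_ID}
      secret_access_key: ${AWS_SECRET_ACCESS_KEY}
```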
(Optional) Add a remap transform to tag all logs going to datadog_archives.
a. Click Edit and then Add More in the Add Transforms section.
b. Click the Remap tile.
c. Enter a descriptive name for the component.
d. In the Inputs field, select the source to connect this transform to.
e. Add .sender = "observability_pipelines_worker" in the Source section.
f. Click Save.
g. Navigate back to your pipeline.
Click Edit.
Click Add More in the Add Destination tile.
Click the Datadog Archives tile.
Enter a descriptive name for the component.
Select the sources or transforms to connect this destination to.
In the Bucket field, enter the name of the S3 bucket you created earlier.
Enter aws_s3 in the Service field.
Toggle AWS S3 to enable those specific configuration options.
In the Storage Class field, select the storage class in the dropdown menu.
Set the other configuration options based on your use case.
Click Save.
In the Bucket field, enter the name of the Azure container you created earlier.
Enter azure_blob in the Service field.
Toggle Azure Blob to enable those specific configuration options.
Enter the Azure Blob Storage Account connection string.
Set the other configuration options based on your use case.
Click Save.
In the Bucket field, enter the name of the Google Cloud Storage bucket you created earlier.
Enter gcp_cloud_storage in the Service field.
Toggle GCP Cloud Storage to enable those specific configuration options.
Set the configuration options based on your use case.
Click Save.
If you are using Remote Configuration, deploy the change to your pipeline in the UI. For manual configuration, download the updated configuration and restart the Worker.
See Rehydrating from Archives for instructions on how to rehydrate your archive in Datadog so that you can start analyzing and investigating those logs.
Further reading
Additional helpful documentation, links, and articles: