Archives on AWS S3

Configure your Datadog account to forward all the logs it ingests to an S3 bucket that you own. This lets you keep long-term archives of the logs you aggregate with Datadog. This guide shows you how to set that up.

Note: Only Datadog users with admin status can create, modify, or delete log archive configurations.

Create and Configure an S3 Bucket

  1. Go to your AWS Console and create an S3 bucket to send your archives to. Be careful not to make your bucket publicly readable.

  2. Modify your bucket to grant write-only access to the Datadog AWS account: edit your bucket’s permissions and set the following bucket policy, replacing <MY_BUCKET_NAME> with the name of your bucket (a scripted version is sketched after this list):

    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDatadogArchivesUploader",
                "Effect": "Allow",
                "Principal": {
                    "AWS": "464622532012"
                },
                "Action": [
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:ListMultipartUploadParts",
                    "s3:AbortMultipartUpload"
                ],
                "Resource": "arn:aws:s3:::<MY_BUCKET_NAME>/*"
            }
        ]
    }
    
  3. Go to the Pipelines page in your Datadog application and select the Add archive option at the bottom. As noted above, only Datadog users with admin status can complete this and the following step.

  4. Input your bucket name and, optionally, a prefix directory for all the content of your log archives. Save your archive, and you are finished.
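
If you prefer to script step 2 rather than use the console, the same policy can be applied with the AWS SDK. Below is a minimal sketch using Python and boto3; the bucket name my-datadog-archives is a placeholder for your own, and your AWS credentials are assumed to be configured in the environment:

    import json

    import boto3

    s3 = boto3.client("s3")

    BUCKET = "my-datadog-archives"  # placeholder: your bucket name

    # Same policy as step 2: write-only access for the Datadog AWS account.
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "AllowDatadogArchivesUploader",
                "Effect": "Allow",
                "Principal": {"AWS": "464622532012"},
                "Action": [
                    "s3:PutObject",
                    "s3:PutObjectAcl",
                    "s3:ListMultipartUploadParts",
                    "s3:AbortMultipartUpload",
                ],
                "Resource": f"arn:aws:s3:::{BUCKET}/*",
            }
        ],
    }

    # put_bucket_policy takes the policy document as a JSON string.
    s3.put_bucket_policy(Bucket=BUCKET, Policy=json.dumps(policy))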

Once your archive settings are successfully configured in your Datadog account, all the logs that Datadog ingests are enriched by your processing pipelines and then forwarded to your S3 bucket for archiving.

However, after you create or update an archive configuration, it can take several minutes before the next archive upload is attempted. Check back on your S3 bucket after 15 minutes to make sure archives are being uploaded successfully from your Datadog account.
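
A quick way to verify is to list the newest objects under your archive prefix. A minimal sketch with boto3, where the bucket and prefix names are placeholders:

    import boto3

    s3 = boto3.client("s3")

    # List up to ten objects under the archive prefix. An empty result after
    # roughly 15 minutes suggests the archive configuration needs another look.
    resp = s3.list_objects_v2(
        Bucket="my-datadog-archives",  # placeholder: your bucket name
        Prefix="my/s3/prefix/",        # placeholder: your archive prefix
        MaxKeys=10,
    )
    for obj in resp.get("Contents", []):
        print(obj["Key"], obj["Size"], obj["LastModified"])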

Format of the S3 Archives

The log archives that Datadog forwards to your S3 bucket are in compressed (gzip) JSON format. Under whatever prefix you indicate (or / if there is none), the archives are stored in a directory structure that encodes the date and hour at which the archive files were generated, like so:

/my/s3/prefix/dt=20180515/hour=14/archive_143201.1234.7dq1a9mnSya3bFotoErfxl.json.gz

This directory structure simplifies the process of querying your historical log archives based on their date.
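
For example, to fetch the archives generated during a given hour, you can build the dt=/hour= prefix directly from a datetime. A sketch with boto3, again using placeholder bucket and prefix names:

    from datetime import datetime

    import boto3

    s3 = boto3.client("s3")

    def archive_keys(bucket, prefix, when):
        """Yield the archive keys generated during the hour containing `when` (UTC)."""
        hour_prefix = f"{prefix}dt={when:%Y%m%d}/hour={when.hour:02d}/"
        paginator = s3.get_paginator("list_objects_v2")
        for page in paginator.paginate(Bucket=bucket, Prefix=hour_prefix):
            for obj in page.get("Contents", []):
                yield obj["Key"]

    # Archives generated on 2018-05-15 between 14:00 and 15:00 UTC.
    for key in archive_keys("my-datadog-archives", "my/s3/prefix/", datetime(2018, 5, 15, 14)):
        print(key)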

Within the zipped JSON file, each event’s content is formatted as follows:

{
    "_id": "123456789abcdefg",
    "date": "2018-05-15T14:31:16.003Z",
    "host": "i-12345abced6789efg",
    "source": "source_name",
    "service": "service_name",
    "status": "status_level",
    "message": " ... log message content ... ",
    "attributes": { ... log attributes content ... }
}
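
To read events back out of an archive, download the object and decompress it. Below is a minimal sketch, assuming each line of the decompressed file holds one JSON event in the format above (the bucket name is a placeholder; the key reuses the example path from earlier):

    import gzip
    import io
    import json

    import boto3

    s3 = boto3.client("s3")

    # Download one archive file and decompress it in memory.
    body = s3.get_object(
        Bucket="my-datadog-archives",  # placeholder: your bucket name
        Key="my/s3/prefix/dt=20180515/hour=14/archive_143201.1234.7dq1a9mnSya3bFotoErfxl.json.gz",
    )["Body"].read()

    with gzip.open(io.BytesIO(body), mode="rt", encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)  # assumption: one JSON event per line
            print(event["date"], event["status"], event["message"])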

Note: Archives only include log content, which consists of the message, custom attributes, and reserved attributes of your log events. Log tags (the metadata that connects your log data to related metrics and traces) are not included.

Multiple S3 Archives

Admins can route specific logs to an archive by adding a query in the archive’s filter field. Logs enter the first archive whose filter they match, so it is important to order your archives carefully.

For example, if you create a first archive filtered to the env:prod tag and a second archive without any filter (the equivalent of *), all your production logs would go to one S3 bucket/path and the rest would go to the other.

Server-Side Encryption (SSE)

To add server-side encryption to your S3 log archives, go to the Properties tab of your S3 bucket, select Default Encryption, choose the AES-256 option, and save.
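
The same default can be set through the API. A sketch using boto3’s put_bucket_encryption, with a placeholder bucket name:

    import boto3

    s3 = boto3.client("s3")

    # Make AES-256 (SSE-S3) the bucket's default encryption, so new archive
    # objects are encrypted at rest with no change needed on the Datadog side.
    s3.put_bucket_encryption(
        Bucket="my-datadog-archives",  # placeholder: your bucket name
        ServerSideEncryptionConfiguration={
            "Rules": [
                {"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}
            ]
        },
    )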