Rehydrating from Archives
Overview
Log Rehydration* enables you to capture log events from customer-owned storage-optimized archives back into Datadog’s search-optimized Log Explorer, so that you can use Datadog to analyze or investigate log events that are either old or were excluded from indexing.
Historical views
With historical views, teams rehydrate archived log events by time frame and query filter to meet specific, unexpected use cases efficiently. By creating historical views with specific queries (for example, over one or more services, URL endpoints, or customer IDs), you can reduce the time and cost involved in rehydrating your logs. This is especially helpful when rehydrating over wider time ranges.
Key features:
- Rehydrate up to 1 billion log events per historical view
- Index exclusion filters do not apply to historical views, so there is no need to modify exclusion filters when you rehydrate from archives
- If you download historical views as a CSV, the data is limited to the last 90 days
Prerequisites
Before you can rehydrate logs from archives, you need to complete the following setup steps:
Archive configuration
You must have an external archive configured to rehydrate data from it. To archive your logs in the available destinations (Amazon S3, Azure Storage, or Google Cloud Storage), see Log Archives.
Permissions and authentication
Datadog requires permission to read from your archives to rehydrate content. Archives must be configured with appropriate authentication:
- S3: Must use role delegation (IAM roles)
- Azure Storage: Must use Azure AD with Storage Blob Data Contributor role
- Google Cloud Storage: Must use service account with Storage Object Viewer role
Only archives with proper authentication are available for rehydrating. For detailed setup instructions, see Cloud-specific permissions.
Rehydrating logs with historical views
- Navigate to the Rehydration page.
- Click New Historical View.
- Select the time period for rehydration.
- Choose the archive you want to rehydrate log events from. Only archives that are configured to use role delegation are available for rehydrating.
- (Optional) Estimate scan size and get the total amount of compressed data that is contained in your archive for the selected time frame.
- Name your historical view. Names must begin with a lowercase letter and can only contain lowercase letters, numbers, and the - character.
- Set the indexing query using the Log Explorer search syntax. Make sure your logs are archived with their tags if you use tags (such as env:prod or version:x.y.z) in the rehydration query.
- Define the log limit (maximum logs to rehydrate). When the limit of the rehydration is reached, log reloading stops, but you still have access to the rehydrated logs.
- Set the retention period of the rehydrated logs. This defines how long rehydrated logs stay searchable. Available retention periods are based on your contract; the default is 15 days.
- (Optional) Configure completion notifications through integrations with the @handle syntax.
For more information on the rehydration scan size, see Understanding rehydration scan sizes.
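The naming rule in the steps above can be checked locally before you create a view. The following is a minimal sketch, assuming the rule is exactly as stated (a lowercase first letter, then lowercase letters, numbers, or the - character). The function name `is_valid_view_name` is hypothetical and is not part of any Datadog API or client library.

```python
import re

# Assumed encoding of the documented rule: the first character is a lowercase
# letter, and every following character is a lowercase letter, digit, or "-".
_VIEW_NAME = re.compile(r"[a-z][a-z0-9-]*")


def is_valid_view_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rule described above.

    Hypothetical helper for local validation only; Datadog performs its own
    validation when the historical view is created.
    """
    return _VIEW_NAME.fullmatch(name) is not None
```

For example, `is_valid_view_name("checkout-errors-2024")` returns `True`, while names such as `"Checkout"`, `"1-checkout"`, or `"checkout_errors"` are rejected.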
Historical views management
Viewing historical view content
From the historical view page:
After selecting “Rehydrate from Archive,” the historical view is marked as “PENDING” until its content is ready to be queried.
After the content is rehydrated, the historical view is marked as “ACTIVE”, and the link in the query column leads to the historical view in the Log Explorer.
From the Log Explorer:
In the Log Explorer, open the Index facet in the index selector. Select the Historical indexes to include in your search.
Canceling ongoing historical views
Cancel ongoing rehydrations from the Rehydration page to stop jobs that have incorrect time ranges or typos in the indexing query.
Logs that have already been indexed remain queryable until the end of the retention period selected for the historical view. All scanned and indexed logs will still be billed.
Deleting historical views
Historical views remain in Datadog until they exceed the selected retention period, unless you choose to delete them earlier. To delete a historical view manually, select the delete icon at the far right of the view and confirm the action.
The historical view is permanently deleted one day after the deletion is initiated. Until then, the team can cancel the deletion.
Viewing deleted historical views
View deleted historical views for up to 1 year in the past using the View dropdown menu.
Advanced configuration
Rehydration notifications
Events are triggered automatically when a rehydration starts and finishes. These events are available in your Events Explorer.
You can use the built-in template variables to customize the notification triggered at the end of the rehydration:
Variable | Description |
---|---|
{{archive}} | Name of the archive used for the rehydration. |
{{from}} | Start of the time range selected for the rehydration. |
{{to}} | End of the time range selected for the rehydration. |
{{scan_size}} | Total size of the files processed during the rehydration. |
{{number_of_indexed_logs}} | Total number of rehydrated logs. |
{{explorer_url}} | Direct link to the rehydrated logs. |
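To preview how a notification message might read once the variables are filled in, you can substitute sample values locally. This is a rough sketch only: Datadog performs the actual substitution, and the function name `render_notification`, the message wording, and every sample value below (including the value formats) are assumptions made for illustration.

```python
import re


def render_notification(template: str, values: dict[str, str]) -> str:
    """Replace each {{name}} placeholder with its sample value.

    Placeholders without a matching key are left untouched so that typos in
    variable names stay visible in the preview.
    """
    def substitute(match: re.Match) -> str:
        key = match.group(1)
        return values.get(key, match.group(0))

    return re.sub(r"\{\{(\w+)\}\}", substitute, template)


# Hypothetical message body using the template variables from the table above.
template = (
    "Rehydration from {{archive}} finished: {{number_of_indexed_logs}} logs "
    "indexed for {{from}} to {{to}} ({{scan_size}} scanned). {{explorer_url}}"
)

# Hypothetical sample values; the real formats are determined by Datadog.
sample_values = {
    "archive": "prod-archive",
    "from": "2024-01-01T00:00:00Z",
    "to": "2024-01-02T00:00:00Z",
    "scan_size": "12 GB",
    "number_of_indexed_logs": "48213",
    "explorer_url": "https://example.com/logs",
}

message = render_notification(template, sample_values)
```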
Default limit for historical views
Admins with the Logs Write Archives permission can configure default controls to ensure efficient use of Log Rehydration* across teams. Click Settings to configure:
Default Rehydration volume limit: Define the default number of logs (in millions) that can be rehydrated per historical view. If the limit is reached, the rehydration automatically stops, but already rehydrated logs remain accessible. Admins can also allow this limit to be overridden during view creation.
Rehydration retention periods: Choose which retention periods are available when creating rehydrations. Only the selected durations (for example, 3, 7, 15, 30, 45, 60, 90, or 180 days) appear in the dropdown menu when selecting how long logs should remain searchable in Datadog.
Cloud-specific permissions
Datadog requires the permission to read from your archives in order to rehydrate content from them. This permission can be changed at any time.
In order to rehydrate log events from your archives, Datadog uses the IAM Role in your AWS account that you configured for your AWS integration. If you have not yet created that Role, follow these steps to do so. To allow that Role to rehydrate log events from your archives, add the following permission statement to its IAM policies. Be sure to edit the bucket names and, if desired, specify the paths that contain your log archives.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DatadogUploadAndRehydrateLogArchives",
      "Effect": "Allow",
      "Action": ["s3:PutObject", "s3:GetObject"],
      "Resource": [
        "arn:aws:s3:::<MY_BUCKET_NAME_1_/_MY_OPTIONAL_BUCKET_PATH_1>/*",
        "arn:aws:s3:::<MY_BUCKET_NAME_2_/_MY_OPTIONAL_BUCKET_PATH_2>/*"
      ]
    },
    {
      "Sid": "DatadogRehydrateLogArchivesListBucket",
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": [
        "arn:aws:s3:::<MY_BUCKET_NAME_1>",
        "arn:aws:s3:::<MY_BUCKET_NAME_2>"
      ]
    }
  ]
}
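If you manage several archive buckets, it can help to generate this policy rather than edit it by hand. The sketch below builds the same two statements from a list of bucket locations. The function name `build_rehydration_policy` and the bucket names in the usage example are hypothetical; the statement IDs, actions, and ARN shapes are copied from the policy above.

```python
import json


def build_rehydration_policy(bucket_paths: list[str]) -> dict:
    """Build the IAM policy document shown above for the given locations.

    Each entry in `bucket_paths` is either "bucket-name" or
    "bucket-name/optional/path" (hypothetical input format).
    """
    cleaned = [path.strip("/") for path in bucket_paths]

    # Object-level actions apply to everything under the bucket or path.
    object_arns = [f"arn:aws:s3:::{path}/*" for path in cleaned]

    # s3:ListBucket applies to the bucket itself, so drop any path suffix
    # and remove duplicates while preserving order.
    bucket_names = list(dict.fromkeys(path.split("/", 1)[0] for path in cleaned))
    bucket_arns = [f"arn:aws:s3:::{name}" for name in bucket_names]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DatadogUploadAndRehydrateLogArchives",
                "Effect": "Allow",
                "Action": ["s3:PutObject", "s3:GetObject"],
                "Resource": object_arns,
            },
            {
                "Sid": "DatadogRehydrateLogArchivesListBucket",
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": bucket_arns,
            },
        ],
    }


# Hypothetical bucket names, for illustration only.
policy = build_rehydration_policy(["my-logs-bucket/datadog", "my-other-bucket"])
policy_json = json.dumps(policy, indent=2)
```

The resulting `policy_json` string can be pasted into the IAM policy editor. Review the generated ARNs before attaching the policy to the role.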
Adding role delegation to S3 archives
Datadog only supports rehydrating from archives that have been configured to use role delegation to grant access. Once you have modified your Datadog IAM role to include the IAM policy above, ensure that each archive in your archive configuration page has the correct AWS Account + Role combination.
In order to rehydrate log events from your archives, Datadog uses a service account with the Storage Object Viewer role. You can grant this role to your Datadog service account from the Google Cloud IAM Admin page by editing the service account’s permissions, adding another role, and then selecting Storage > Storage Object Viewer.
Understanding rehydration scan sizes
The query is applied after the files matching the time period are downloaded from your archive. As a result, the rehydration scan size is based on the total volume of logs retrieved from the archive, not the number of logs matching the query. Archive storage is time-based, so queries scoped to specific filters (such as service:A) still retrieve all logs within the selected time window. This includes logs from other services (such as service:B).
Reducing the date range is the most effective way to limit scan size and minimize cloud data transfer costs, because query filters are applied after data is downloaded.
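The effect of the time window on scan size can be illustrated with a small calculation. This is a simplified model, not Datadog's estimator: it assumes you know the compressed volume archived per day, and the function name `estimate_scan_size_gb` and the 50 GB per day figure are hypothetical. The query filter is deliberately not a parameter, because it does not change how much data is scanned.

```python
from datetime import date, timedelta


def estimate_scan_size_gb(
    daily_archive_gb: dict[date, float], start: date, end: date
) -> float:
    """Sum the archived volume for every day in the inclusive range [start, end]."""
    total = 0.0
    day = start
    while day <= end:
        total += daily_archive_gb.get(day, 0.0)
        day += timedelta(days=1)
    return total


# Hypothetical archive: 50 GB of compressed logs per day for 30 days.
volumes = {date(2024, 1, 1) + timedelta(days=i): 50.0 for i in range(30)}

# Same query in both cases; only the time window differs.
full_month_gb = estimate_scan_size_gb(volumes, date(2024, 1, 1), date(2024, 1, 30))
one_week_gb = estimate_scan_size_gb(volumes, date(2024, 1, 10), date(2024, 1, 16))
```

Under these assumptions, the 30-day window scans 1500 GB and the 7-day window scans 350 GB, regardless of how narrow the query is.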
*Log Rehydration is a trademark of Datadog, Inc.