Available for:

Logs

The Amazon S3 destination is in Preview. Contact your account manager for access.

Overview

Use the Amazon S3 destination to send logs in JSON or Parquet format to Amazon S3. See Automatically generated Parquet schema.

You can also route logs to Snowflake using the Amazon S3 destination.

Note: If you want to send logs to an S3 bucket, and later be able to rehydrate them for analysis and investigation in Datadog, use the Datadog Archives destination.

Set up an Amazon S3 bucket

Create an Amazon S3 bucket

  1. Navigate to Amazon S3 buckets.
  2. Click Create bucket.
  3. Enter a descriptive name for your bucket.
  4. Do not make your bucket publicly readable.
  5. Optionally, add tags.
  6. Click Create bucket.
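The console steps above can also be done programmatically. A minimal sketch using boto3, with a hypothetical bucket name and region; the AWS calls are commented out so the helper can be shown on its own:

```python
# Hypothetical names for illustration only.
BUCKET = "my-op-logs-bucket"
REGION = "eu-west-1"

def create_bucket_params(bucket: str, region: str) -> dict:
    """Build CreateBucket kwargs; us-east-1 must omit the location constraint."""
    params = {"Bucket": bucket}
    if region != "us-east-1":
        params["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return params

# import boto3  # assumes AWS credentials are already configured
# s3 = boto3.client("s3", region_name=REGION)
# s3.create_bucket(**create_bucket_params(BUCKET, REGION))
# New buckets are private by default; leave S3 Block Public Access enabled (step 4).
```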

Set up an IAM policy that allows Workers to write to the S3 bucket

  1. Navigate to the IAM console.
  2. Select Policies in the left side menu.
  3. Click Create policy.
  4. Click JSON in the Specify permissions section.
  5. Copy the policy below and paste it into the Policy editor. Replace <MY_BUCKET_NAME_1>/<MY_OPTIONAL_BUCKET_PATH_1> with the information for the S3 bucket you created in the previous section.
    {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DatadogOPUpload",
                "Effect": "Allow",
                "Action": [
                    "s3:PutObject"
                ],
                "Resource": "arn:aws:s3:::<MY_BUCKET_NAME_1>/<MY_OPTIONAL_BUCKET_PATH_1>/*"
            }
        ]
    }
    
  6. Click Next.
  7. Enter a descriptive policy name.
  8. Optionally, add tags.
  9. Click Create policy.
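If you create this policy for several buckets, the substitution in step 5 can be templated. A small sketch that renders the policy above for a given bucket name and optional key path (hypothetical names; output matches the JSON shown in step 5):

```python
import json

def s3_upload_policy(bucket: str, prefix: str = "") -> str:
    """Render the s3:PutObject policy for a bucket and optional key path."""
    path = f"{bucket}/{prefix}".rstrip("/")
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DatadogOPUpload",
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{path}/*",
            }
        ],
    }
    return json.dumps(policy, indent=4)

print(s3_upload_policy("my-op-logs-bucket", "app-logs"))
```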

Create an IAM user or role

Create an IAM user or role and attach the policy to it.

Create a service account

Create a service account to use the policy you created above.

Set up the destination for your pipeline

Set up the Amazon S3 destination and its environment variables when you create a pipeline. The information below is configured in the pipelines UI.

  1. Enter your S3 bucket name. If you configured Log Archives, it’s the name of the bucket you created earlier.
  2. Enter the AWS region the S3 bucket is in.
  3. (Optional) Enter the key prefix.
    • Prefixes are useful for partitioning objects. For example, you can use a prefix as an object key to store objects under a particular directory. If using a prefix for this purpose, it must end in / to act as a directory path; a trailing / is not automatically added.
      • See template syntax if you want to route logs to different object keys based on specific fields in your logs.
    • Notes:
      • Datadog recommends that you start your prefixes with the directory name and without a lead slash (/). For example, app-logs/ or service-logs/.
      • Do not use the same S3 prefix as a Datadog Archives destination. The Amazon S3 destination writes files in a different format and having both file types in the same prefix can result in rehydration issues.
  4. Select the storage class for your S3 bucket in the Storage Class dropdown menu.
  5. Select the encoding you want to use in the Encoding dropdown menu (JSON or Parquet).
  6. Select a compression algorithm in the Compression - Algorithm dropdown menu. If you selected:
    • Parquet: Datadog recommends snappy or a low-compression level if you choose zstd.
    • JSON: Datadog recommends gzip.
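The trailing-slash requirement in step 3 can be illustrated with a small helper that builds a date-partitioned, directory-style prefix. The names are hypothetical, and this demonstrates key construction only, not Observability Pipelines template syntax:

```python
from datetime import datetime

def partitioned_prefix(base: str, ts: datetime) -> str:
    """Build a date-partitioned key prefix.
    The trailing '/' is required for the prefix to act as a directory path;
    S3 does not add it automatically."""
    return f"{base.rstrip('/')}/dt={ts:%Y-%m-%d}/"

# An object written under this prefix lands in a per-day "directory":
print(partitioned_prefix("app-logs", datetime(2024, 1, 2)))  # app-logs/dt=2024-01-02/
```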

Optional settings

Batching

  1. Enter a maximum batching size and select the unit (MB or GB) in the dropdown menu. If not configured, the default is 100 MB.
  2. Enter a batching timeout in seconds. If not configured, the default is 900 seconds.

AWS authentication

Select an AWS authentication option. If you are only using the user or role you created earlier for authentication, do not select Assume role. Select Assume role only if the user or role you created earlier needs to assume a different role to access the AWS resource. The assumed role’s permissions must be explicitly defined.
If you select Assume role:

  1. Enter the ARN of the IAM role you want to assume.
  2. (Optional) Enter the assumed role session name and external ID.
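For reference, these fields map onto the parameters of the STS AssumeRole API. A minimal sketch with a hypothetical role ARN; the boto3 call is commented out:

```python
def assume_role_kwargs(role_arn, session_name="op-worker", external_id=None):
    """Build kwargs for sts.assume_role; session name and external ID are optional."""
    kwargs = {"RoleArn": role_arn, "RoleSessionName": session_name}
    if external_id is not None:
        kwargs["ExternalId"] = external_id
    return kwargs

# import boto3  # assumes AWS credentials are already configured
# sts = boto3.client("sts")
# creds = sts.assume_role(
#     **assume_role_kwargs("arn:aws:iam::123456789012:role/op-s3-writer",
#                          external_id="my-external-id")
# )["Credentials"]
```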

Buffering

Toggle the switch to enable Buffering Options. Enabling a configurable buffer on your destination ensures that intermittent latency or an outage at the destination doesn't create immediate backpressure, and allows events to continue to be ingested from your source. Disk buffers can also increase pipeline durability by writing data to disk, so buffered data persists through a Worker restart. See Destination buffers for more information.

  • If left unconfigured, your destination uses a memory buffer with a capacity of 500 events.
  • To configure a buffer on your destination:
    1. Select the buffer type you want to set (Memory or Disk).
    2. Enter the buffer size and select the unit.
      1. Maximum memory buffer size is 128 GB.
      2. Maximum disk buffer size is 500 GB.
    3. In the Behavior on full buffer dropdown menu, select whether you want to block events or drop new events when the buffer is full.

Set secrets

These are the defaults used for secret identifiers and environment variables.

Note: If you enter secret identifiers and then choose to use environment variables, the environment variable is the identifier you entered, prefixed with DD_OP_. For example, if you entered PASSWORD_1 as a password identifier, the environment variable for that password is DD_OP_PASSWORD_1.

There are no secret identifiers to configure.

There are no environment variables to configure.

Route logs to Snowflake using the Amazon S3 destination

You can route logs from Observability Pipelines to Snowflake using the Amazon S3 destination by configuring Snowpipe in Snowflake to automatically ingest those logs. Snowpipe continuously monitors your S3 bucket for new files and automatically ingests them into your Snowflake tables, ensuring near real-time data availability for analytics or further processing. When logs are collected by Observability Pipelines, they are written to an S3 bucket. To set this up:

  1. Set up a pipeline to use Amazon S3 as the log destination. Use the configuration detailed in Set up the destination for your pipeline.
  2. Set up Snowpipe in Snowflake. See Automating Snowpipe for Amazon S3 for instructions.

How the destination works

AWS Authentication

The Observability Pipelines Worker uses the standard AWS credential provider chain for authentication. See AWS SDKs and Tools standardized credential providers for more information.

Permissions

The Observability Pipelines Worker requires these policy permissions to send logs to Amazon S3:

  • s3:PutObject

Automatically generated Parquet schema

The Observability Pipelines Worker collects a batch of events, generates a schema for those events, and then flushes the batch to S3. The schema can vary between batches because the schema is based on the current batch of events only.
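Because the schema is derived from each batch independently, two batches containing different fields produce different schemas. A simplified, illustrative sketch of per-batch schema inference (the Worker's actual Parquet type inference is more involved):

```python
def infer_batch_schema(events):
    """Map each field name seen in the batch to the type name of its first value."""
    schema = {}
    for event in events:
        for field, value in event.items():
            schema.setdefault(field, type(value).__name__)
    return schema

batch_1 = [{"host": "web-1", "status": 200}]
batch_2 = [{"host": "web-1", "error": "timeout"}]
# The two batches yield different schemas:
print(infer_batch_schema(batch_1))  # {'host': 'str', 'status': 'int'}
print(infer_batch_schema(batch_2))  # {'host': 'str', 'error': 'str'}
```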

Event batching

A batch of events is flushed when any one of these limits is reached. See event batching for more information.

  • Max Events: None
  • Max Bytes: 100,000,000
  • Timeout (seconds): 900
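The flush conditions above can be sketched as follows. This is an illustrative model only, not the Worker's actual code; with this destination's defaults, the maximum event count is unset, so a flush is triggered by size or timeout:

```python
import time

class Batcher:
    """Flush when the batch reaches max_bytes or the timeout elapses."""

    def __init__(self, max_bytes=100_000_000, timeout_s=900):
        self.max_bytes = max_bytes
        self.timeout_s = timeout_s
        self.size = 0
        self.events = []
        self.started = time.monotonic()

    def add(self, event: bytes) -> bool:
        """Add an event; return True if the batch should be flushed."""
        self.events.append(event)
        self.size += len(event)
        return (self.size >= self.max_bytes
                or time.monotonic() - self.started >= self.timeout_s)
```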