The Observability Pipelines Worker can ingest logs from many different sources. If you have an Amazon S3 bucket that receives logs from an external system, such as AWS CloudTrail or CloudWatch, you can configure the Worker to ingest those logs. The setup uses the Observability Pipelines Worker's Amazon S3 source, which requires an Amazon SQS queue that receives event notifications from the S3 bucket. Each notification prompts the Worker to collect the new log events from the S3 bucket.
Create an Amazon SQS queue to receive S3 notifications
In the Amazon SQS console, click Create queue to provision a new queue specific to this configuration. A dedicated queue keeps any changes you make separate from the other log analysis tools you are using.
Enter a name for the queue.
In the Access policy section, click the Advanced button.
Copy and paste the example JSON object below into the advanced access policy section. It configures the queue to allow the S3 bucket to send event notifications. Replace ${REGION}, ${AWS_ACCOUNT_ID}, ${QUEUE_NAME}, and ${BUCKET_NAME} with your AWS account information and the queue and bucket names you just entered.
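A minimal sketch of such a policy, assuming the standard permissions S3 needs to deliver event notifications (adjust the statement to your account's requirements):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "s3.amazonaws.com" },
      "Action": "sqs:SendMessage",
      "Resource": "arn:aws:sqs:${REGION}:${AWS_ACCOUNT_ID}:${QUEUE_NAME}",
      "Condition": {
        "ArnLike": { "aws:SourceArn": "arn:aws:s3:*:*:${BUCKET_NAME}" }
      }
    }
  ]
}
```

The aws:SourceArn condition restricts the sqs:SendMessage permission to notifications originating from your bucket, so other AWS accounts cannot post to the queue.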
In the Amazon S3 console, go to the S3 bucket that is collecting the logs that you want the Worker to ingest.
Click the Properties tab.
Go to the Event notifications section, and click Create event notification.
Enter a name for the event.
In the Event types section, click All object create events. The Worker only responds to object creation events, so those are the only events to which you need to subscribe.
In the Destination section, select SQS queue and then choose the SQS queue you created earlier.
Click Save changes.
The SQS queue should now be receiving messages for the Worker to process.
If you encounter the “Unable to validate the following destination configurations” error, check that the SQS access policy is set up correctly.
Create an IAM role for the Worker
Create a separate IAM role for the Worker so that only the necessary permissions are provided.
Replace ${REGION}, ${AWS_ACCOUNT_ID}, ${QUEUE_NAME}, and ${BUCKET_NAME} with the relevant AWS account information and the queue and bucket names that you are using. You need to further modify the role permissions if you want the role to be attachable to EC2 instances, assumable by users, etc.
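A sketch of the permissions policy for this role, assuming the Worker only needs to consume SQS messages and read objects from the bucket:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage"
      ],
      "Resource": "arn:aws:sqs:${REGION}:${AWS_ACCOUNT_ID}:${QUEUE_NAME}"
    },
    {
      "Effect": "Allow",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::${BUCKET_NAME}/*"
    }
  ]
}
```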
Click Next: Tags. Optionally, add tags.
Click Next: Review.
Enter a name for the policy.
Click Create policy.
Apply the role to the running Observability Pipelines process, either by attaching the role to an EC2 instance or by assuming the role from a given user profile.
Configure the Worker to receive notifications from the SQS queue
Use the source configuration example below to set up the Worker to:
a. Receive the SQS event notifications.
b. Read the associated logs in the S3 bucket.
c. Emit the logs to the console.
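A minimal sketch of such a configuration, assuming a Vector-style YAML pipeline file (the component names s3_logs and console_output are placeholders):

```yaml
sources:
  s3_logs:
    type: aws_s3
    region: ${REGION}
    sqs:
      queue_url: https://sqs.${REGION}.amazonaws.com/${AWS_ACCOUNT_ID}/${QUEUE_NAME}

sinks:
  console_output:
    type: console
    inputs:
      - s3_logs
    encoding:
      codec: json
```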
With the Amazon S3 source set up, you can now add transforms to manipulate the data and sinks to output the logs to destinations based on your use case. See Configurations for more information on sources, transforms, and sinks.
Configure the Worker to separate batched Amazon S3 log events
Most services (for example, CloudTrail) send logs to S3 in batches, which means that each event the Worker receives is composed of multiple logs. In the example below, Records is an array of three log events that are batched together.
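For instance, a batched event containing three logs might look roughly like this (the field contents are placeholders):

```json
{
  "Records": [
    { "log event 1": "xxx" },
    { "log event 2": "xxx" },
    { "log event 3": "xxx" }
  ]
}
```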
The merge function then collapses the data in .Records to the top level so that each log event becomes an individual log line, and the del function removes the extraneous Records field.
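One way to sketch this step, assuming VRL's unnest, merge, and del functions in a pair of remap transforms (the component names, and the s3_logs input referring to the Amazon S3 source, are placeholders):

```yaml
transforms:
  split_records:
    type: remap
    inputs:
      - s3_logs
    source: |
      # Emit one event per element of the Records array
      . = unnest!(.Records)
  flatten_records:
    type: remap
    inputs:
      - split_records
    source: |
      # Collapse each record's fields to the top level,
      # then remove the extraneous Records field
      . = merge!(., .Records)
      del(.Records)
```

With this in place, each batched S3 object yields individual log lines such as: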
{"log event 1": "xxx"}
{"log event 2": "xxx"}
{"log event 3": "xxx"}
Further reading
Additional helpful documentation, links, and articles: