Monitor Amazon S3 Bucket
Overview
Monitoring an Amazon S3 bucket helps you understand and improve the performance of applications that use Amazon S3. By integrating Amazon S3 with SnappyFlow, you get insights (Average Bucket Size, Average Number Of Objects, Requests, Bytes Downloaded, etc.) about your S3 bucket. These insights help you optimize the bucket and use its resources efficiently.
This integration is applicable only to Linux machines.
Prerequisites
To collect metrics and logs from an Amazon S3 bucket, you need a Linux VM and an IAM role with read-only access to the S3 bucket.
Create a Policy to Access Amazon S3 Bucket
Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.
Follow the steps below to create a policy in the IAM console.
Navigate to Access Management > Policies.
In the Policies window, click the Create policy button.
In the Create policy window, go to the JSON tab.
Copy and paste the following JSON code into the policy editor:
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Action": [
                "s3:ListBucket",
                "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": "*"
        }
    ]
}
Click the Next: Tags button, then the Next: Review button.
In the Review policy window, give a Name and an optional Description for the policy and review the list of permissions.
Click the Create policy button.
Attach the policy to a dedicated IAM role to grant read-only access.
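Before configuring the log collector, you may want to confirm that the role actually grants these two permissions. The following is a minimal sketch using Python and boto3; the bucket name and prefix are placeholders for your own values, and it assumes the script runs with the IAM role's credentials (for example, via an instance profile attached to the Linux VM).

import boto3
from botocore.exceptions import ClientError

# Placeholder values - replace with your own bucket and prefix.
BUCKET = "backupData"
PREFIX = "sales/"

s3 = boto3.client("s3")

try:
    # Exercises s3:ListBucket.
    listing = s3.list_objects_v2(Bucket=BUCKET, Prefix=PREFIX, MaxKeys=5)
    keys = [obj["Key"] for obj in listing.get("Contents", [])]
    print("ListBucket OK, sample keys:", keys)

    if keys:
        # Exercises s3:GetObject on the first key found.
        s3.get_object(Bucket=BUCKET, Key=keys[0])
        print("GetObject OK for", keys[0])
except ClientError as err:
    print("Access check failed:", err.response["Error"]["Code"])

If either call fails with an AccessDenied error, revisit the policy attachment before proceeding.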
Configuration in the Linux VM to collect S3 Bucket Logs
Customize the S3 bucket log input format
See the Custom Log Parser documentation to learn more.
Add the following configuration to the custom-logging-plugin.yaml file.
s3-logs:
  documentType: s3Logs
  inputs:
    - name: s3
      options:
        Bucket: backupData
        Prefix: sales/
        Files: transaction.log,audit.log,*.txt
        Region: us-west-2
        Exclude_Files: temp.img
        Ignore_older: 1d
        Interval_sec: 300s
        Mem_Buf_Limit: 100m
        Refresh_Object_Listing: 1d
  filters:
    - name: lua
      options:
        script: scripts.lua
        call: addtime_millisecond
    - name: record_modifier
      options:
        record:
          level: "info"
Parameter Definitions
Bucket - Name of the bucket containing the files to be monitored
Prefix - Prefix (location within the bucket) under which files are monitored
Region - AWS region of the bucket
Files - Comma-separated list of files to be monitored (accepts wildcards). Patterns can match files across multilevel sub-directories; for example, to match /sales-data/year=2020/month=05/day=23/status.txt, use /sales-data/*/*/*/*.txt
Exclude_Files - Files within the prefix to be excluded from monitoring. Accepts multilevel sub-directories
Ignore_older - Excludes files whose last-modified time is older than the given duration
Interval_sec - Interval at which the existing files are collected
Mem_Buf_Limit - Maximum runtime memory to use while reading S3 objects. Refer to Unit Sizes in the Fluent Bit official manual to learn more about unit specifications
Refresh_Object_Listing - Interval at which the bucket is re-listed to discover new files matching the wildcard patterns
Ignore_older and Refresh_Object_Listing support the (m, h, d) syntax:
m - Months
h - Hours
d - Days
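To make the interplay of Files, Exclude_Files, and Ignore_older concrete, here is an illustrative Python sketch of the selection rules described above. It is only an approximation for understanding, not the plugin's actual implementation; the object keys and timestamps are invented.

import fnmatch
from datetime import datetime, timedelta, timezone

# Values mirror the sample configuration above.
FILES = ["transaction.log", "audit.log", "*.txt"]
EXCLUDE_FILES = ["temp.img"]
IGNORE_OLDER = timedelta(days=1)  # "1d"
PREFIX = "sales/"

def is_selected(key, last_modified):
    """Approximate the documented selection rules for one S3 object."""
    if not key.startswith(PREFIX):
        return False
    name = key[len(PREFIX):]
    if any(fnmatch.fnmatch(name, pat) for pat in EXCLUDE_FILES):
        return False  # matched Exclude_Files
    if datetime.now(timezone.utc) - last_modified > IGNORE_OLDER:
        return False  # older than Ignore_older
    # Wildcards may span sub-directories, e.g. "*/*/*/*.txt".
    return any(fnmatch.fnmatch(name, pat) for pat in FILES)

now = datetime.now(timezone.utc)
print(is_selected("sales/audit.log", now))                      # True
print(is_selected("sales/temp.img", now))                       # False: excluded
print(is_selected("sales/audit.log", now - timedelta(days=3)))  # False: too old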
Enable S3 Bucket Logs
To start collecting logs from the S3 bucket, add the following configuration to the config.yaml file.
logging:
  plugins:
    - name: s3-logs
      enabled: true
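As a quick sanity check before restarting the agent, a short Python snippet can confirm that the entry parses and the plugin is enabled. This is a sketch that assumes the standard PyYAML package and that config.yaml sits in the current directory.

import yaml  # PyYAML

with open("config.yaml") as fh:
    cfg = yaml.safe_load(fh)

plugins = cfg.get("logging", {}).get("plugins", [])
s3_logs = next((p for p in plugins if p.get("name") == "s3-logs"), None)
assert s3_logs is not None and s3_logs.get("enabled"), "s3-logs plugin is not enabled"
print("s3-logs logging plugin is enabled")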
View S3 Bucket Logs
Follow the steps below to view the logs collected from the Amazon S3 bucket.
Go to the Application tab in SnappyFlow and navigate to your Project > Application > Dashboard.
You can view the logs in the Log Management section.
Note: Once the configuration is complete, the logs will be available in the Log Management section.
To access the unprocessed data gathered from the plugins, navigate to the Browse data section and choose Index: Logs, Instance: Endpoint, Plugin: s3-logs, and Document Type: s3Logs.