
Monitor Amazon S3 Bucket

Overview

Monitoring an Amazon S3 bucket helps you understand and improve the performance of applications that use Amazon S3. By integrating Amazon S3 with SnappyFlow, you get insights about your S3 bucket, such as Average Bucket Size, Average Number of Objects, Requests, and Bytes Downloaded. These insights help you optimize resources and use them efficiently.

note

This integration is applicable only to Linux machines.

Prerequisites

To collect metrics and logs from an Amazon S3 bucket, you need a Linux VM and an IAM role with read-only access to the S3 bucket.

Create a Policy to Access Amazon S3 Bucket
  1. Sign in to the AWS Management Console and open the IAM console at https://console.aws.amazon.com/iam/.

  2. Follow the steps below to create a policy in the IAM console.

    • Navigate to Access Management > Policies

    • In the Policies window, click the Create policy button

    • In the Create policy window, go to the JSON tab

    • Copy and paste the following JSON code into the policy editor

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Action": [
              "s3:ListBucket",
              "s3:GetObject"
            ],
            "Effect": "Allow",
            "Resource": "*"
          }
        ]
      }
    • Click Next: Tags, then Next: Review

    • In the Review policy window, enter a Name and an optional Description for the policy, and review the list of permissions

    • Click the Create policy button

  3. Attach the policy to a dedicated IAM role to grant read-only access, as shown in the CLI sketch below.
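
If you use the AWS CLI instead of the console, the following is a minimal sketch of the same steps. The policy name, role name, account ID, and policy.json file name are placeholders, and the role itself must already exist.

# Create the policy from the JSON document above, saved locally as policy.json
aws iam create-policy \
  --policy-name snappyflow-s3-readonly \
  --policy-document file://policy.json

# Attach the policy to the dedicated role, using the policy ARN returned by
# the previous command (123456789012 stands in for your AWS account ID)
aws iam attach-role-policy \
  --role-name <your-monitoring-role> \
  --policy-arn arn:aws:iam::123456789012:policy/snappyflow-s3-readonly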

Configure the Linux VM to Collect S3 Bucket Logs

Customize the S3 bucket log input format

See the Custom Log Parser documentation to learn more.

Add the following configuration to the custom-logging-plugin.yaml file.

s3-logs:
  documentType: s3Logs
  inputs:
  - name: s3
    options:
      Bucket: backupData
      Prefix: sales/
      Files: transaction.log,audit.log,*.txt
      Region: us-west-2
      Exclude_Files: temp.img
      Ignore_older: 1d
      Interval_sec: 300s
      Mem_Buf_Limit: 100m
      Refresh_Object_Listing: 1d
  filters:
  - name: lua
    options:
      script: scripts.lua
      call: addtime_millisecond
  - name: record_modifier
    options:
      record:
        level: "info"
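
The lua filter above calls a function named addtime_millisecond defined in scripts.lua, which this guide does not show. As a rough sketch only, a Fluent Bit Lua callback with that role could look like the following; the added field name time_msec is an assumption, not the actual script.

function addtime_millisecond(tag, timestamp, record)
    -- Fluent Bit passes the timestamp in seconds (with a fractional part);
    -- store a millisecond-precision copy on the record (field name assumed)
    record["time_msec"] = math.floor(timestamp * 1000)
    -- return code 1 keeps the modified record with the same timestamp
    return 1, timestamp, record
end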

Parameter Definition

Bucket - The bucket containing the files to be monitored

Prefix - Prefix (path) within the bucket under which files are monitored

Region - Region of the bucket

Files - Comma-separated list of files to be monitored (accepts wildcards). Patterns can match files across multilevel sub-directories; for example, to match /sales-data/year=2020/month=05/day=23/status.txt, use /sales-data/*/*/*/*.txt (see the sketch after the note below).

Exclude_Files - Files within the prefix to be excluded from monitoring. Also accepts multilevel sub-directory patterns.

Ignore_older - Excludes files whose last-modified time is older than the given duration

Interval_sec - Interval at which existing files are collected

Mem_Buf_Limit - Maximum runtime memory to use while reading S3 objects. Refer to Unit Sizes in the Fluent Bit official manual to learn more about unit specification.

Refresh_Object_Listing - Interval at which the object listing is refreshed to discover new files matching wildcard patterns.

note

Ignore_older and Refresh_Object_Listing support the (m, h, d) syntax:

m - Months

h - Hours

d - Days
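
To illustrate the wildcard matching described under Files, here is a minimal input sketch; the bucket name, region, and durations are example values built around the pattern above.

s3-logs:
  documentType: s3Logs
  inputs:
  - name: s3
    options:
      Bucket: salesData                  # example bucket name
      Prefix: sales-data/
      Files: /sales-data/*/*/*/*.txt     # matches /sales-data/year=2020/month=05/day=23/status.txt
      Region: us-west-2
      Ignore_older: 7d                   # skip objects not modified in the last 7 days
      Refresh_Object_Listing: 1d         # re-list the bucket daily for new matches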

Enable S3 Bucket Logs

To start collecting logs from the S3 bucket, add the following configuration to the config.yaml file.

logging:
  plugins:
  - name: s3-logs
    enabled: true

View S3 Bucket Logs

Follow the steps below to view the logs collected from the Amazon S3 bucket.

  1. Go to the Application tab in SnappyFlow and navigate to your Project > Application > Dashboard.

  2. You can view the logs in the Log Management section.

    note

    Once the configuration is complete, the logs will be available within the Log Management section.

  3. To access the unprocessed data gathered from the plugins, navigate to the Browse data section and choose Index: Logs, Instance: Endpoint, Plugin: s3-logs, and Document Type: s3Logs.