Set Up AWS VPC Flow Logs to S3
This section describes how to configure VPC Flow Logs so that they are published to an S3 bucket.
AWS Flow Logs
VPC Flow Logs collects flow log records, consolidates them into log files, and then publishes the log files to the Amazon S3 bucket at 5-minute intervals. Each log file contains flow log records for the IP traffic recorded in the previous five minutes.
The maximum file size for a log file is 75 MB. If the log file reaches the file size limit within the 5-minute period, the flow log stops adding flow log records to it. Then it publishes the flow log to the Amazon S3 bucket, and creates a new log file.
In Amazon S3, the Last modified field for the flow log file indicates the date and time at which the file was uploaded to the Amazon S3 bucket. This is later than the timestamp in the file name, and differs by the amount of time taken to upload the file to the Amazon S3 bucket.
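To illustrate the relationship between the file name and the Last modified field, the capture-window timestamp can be parsed out of the log file name. This is a minimal sketch assuming AWS's documented naming convention (aws_account-id_vpcflowlogs_region_flow-log-id_YYYYMMDDTHHmmZ_hash.log.gz); the example file name below is hypothetical.

```python
from datetime import datetime, timezone

def flow_log_file_timestamp(file_name: str) -> datetime:
    """Extract the capture-window timestamp from a flow log file name.

    Assumes AWS's documented naming convention:
    aws_account-id_vpcflowlogs_region_flow-log-id_YYYYMMDDTHHmmZ_hash.log.gz
    """
    # The timestamp is the second-to-last underscore-separated field.
    stamp = file_name.split("_")[-2]  # e.g. "20230620T1620Z"
    return datetime.strptime(stamp, "%Y%m%dT%H%MZ").replace(tzinfo=timezone.utc)

# Hypothetical file name following the convention above.
name = "123456789012_vpcflowlogs_us-east-1_fl-1234abcd_20230620T1620Z_fe123456.log.gz"
print(flow_log_file_timestamp(name))  # 2023-06-20 16:20:00+00:00
```

The Last modified field on the S3 object will be later than this parsed timestamp by the time it took to upload the file.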
IAM Policy to publish logs to an S3 bucket
In order to publish the flow logs, the IAM principal (such as a user or role) needs to have the following policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["logs:CreateLogDelivery", "logs:DeleteLogDelivery"],
      "Resource": "*"
    }
  ]
}
Bucket permissions
When a bucket is created in S3, its content is private and accessible only to the user that created it. For the flow logs to be delivered, the following bucket policy needs to be attached so that the log delivery service can write to the bucket; the agent that reads the logs will also need read access to the bucket content.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "AWSLogDeliveryWrite",
      "Effect": "Allow",
      "Principal": {"Service": "delivery.logs.amazonaws.com"},
      "Action": "s3:PutObject",
      "Resource": "my-s3-arn",
      "Condition": {
        "StringEquals": {
          "s3:x-amz-acl": "bucket-owner-full-control",
          "aws:SourceAccount": "account_id"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:logs:region:account_id:*"
        }
      }
    },
    {
      "Sid": "AWSLogDeliveryCheck",
      "Effect": "Allow",
      "Principal": {"Service": "delivery.logs.amazonaws.com"},
      "Action": ["s3:GetBucketAcl", "s3:ListBucket"],
      "Resource": "arn:aws:s3:::bucket_name",
      "Condition": {
        "StringEquals": {
          "aws:SourceAccount": "account_id"
        },
        "ArnLike": {
          "aws:SourceArn": "arn:aws:logs:region:account_id:*"
        }
      }
    }
  ]
}
The following placeholders must be filled in by you:
- my-s3-arn, the ARN of the S3 bucket where the logs are stored
- account_id, the account ID
- region, a specific region or *
- bucket_name, the name of the bucket where the logs are stored
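The placeholders can also be filled in programmatically before attaching the policy. This is a minimal sketch with hypothetical account ID, region, and bucket name (replace with your own); it assumes the s3:PutObject Resource should be the bucket ARN with a /* suffix so it covers the object keys the delivery service writes:

```python
import json

# Hypothetical values: replace with your own account ID, region, and bucket name.
ACCOUNT_ID = "123456789012"
REGION = "us-east-1"
BUCKET_NAME = "my-flow-log-bucket"

# The bucket policy from above, with the placeholders filled in.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AWSLogDeliveryWrite",
            "Effect": "Allow",
            "Principal": {"Service": "delivery.logs.amazonaws.com"},
            "Action": "s3:PutObject",
            # Bucket ARN with /* so it matches the objects being written.
            "Resource": f"arn:aws:s3:::{BUCKET_NAME}/*",
            "Condition": {
                "StringEquals": {
                    "s3:x-amz-acl": "bucket-owner-full-control",
                    "aws:SourceAccount": ACCOUNT_ID,
                },
                "ArnLike": {"aws:SourceArn": f"arn:aws:logs:{REGION}:{ACCOUNT_ID}:*"},
            },
        },
        {
            "Sid": "AWSLogDeliveryCheck",
            "Effect": "Allow",
            "Principal": {"Service": "delivery.logs.amazonaws.com"},
            "Action": ["s3:GetBucketAcl", "s3:ListBucket"],
            "Resource": f"arn:aws:s3:::{BUCKET_NAME}",
            "Condition": {
                "StringEquals": {"aws:SourceAccount": ACCOUNT_ID},
                "ArnLike": {"aws:SourceArn": f"arn:aws:logs:{REGION}:{ACCOUNT_ID}:*"},
            },
        },
    ],
}

policy_json = json.dumps(policy, indent=2)
# Attach with boto3 (requires AWS credentials):
#   boto3.client("s3").put_bucket_policy(Bucket=BUCKET_NAME, Policy=policy_json)
```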
Add a VPC Flow Log
- Open the Amazon VPC console at https://console.aws.amazon.com/vpc/
- In the navigation pane, choose Your VPCs.
- Select the checkboxes for one or more VPCs.
- Choose Actions, Create flow log.
- Configure the flow log settings. For more information, see below.
To configure flow log settings using the console
- For Filter, specify the type of IP traffic data to log. Here, please select All.
  - Accepted – Log only accepted traffic.
  - Rejected – Log only rejected traffic.
  - All – Log accepted and rejected traffic.
- For Maximum aggregation interval, choose the maximum period of time during which a flow is captured and aggregated into one flow log record.
- For Destination, choose Send to an S3 bucket.
- For S3 bucket ARN, specify the Amazon Resource Name (ARN) of an existing Amazon S3 bucket. You can optionally include a subfolder. For example, to specify a subfolder named my-logs in a bucket named my-bucket, use the following ARN:
  arn:aws:s3:::my-bucket/my-logs/
  The bucket cannot use AWSLogs as a subfolder name, as this is a reserved term.
If you own the bucket, we automatically create a resource policy and attach it to the bucket. For more information, see Amazon S3 bucket permissions for flow logs.
- For Log record format, specify the format for the flow log record. Here, please select Custom format and proceed to select all fields.
  - To use the default flow log record format, choose AWS default format.
  - To create a custom format, choose Custom format. For Log format, choose the fields to include in the flow log record.
- For Log file format, specify the format for the log file.
  - Text – Plain text. This is the default format.
  - Parquet – Apache Parquet is a columnar data format. Queries on data in Parquet format are 10 to 100 times faster compared to queries on data in plain text. Data in Parquet format with Gzip compression takes 20 percent less storage space than plain text with Gzip compression.
- (Optional) To use Hive-compatible S3 prefixes, choose Hive-compatible S3 prefix, Enable.
- To partition your flow logs per hour, choose Every 1 hour (60 mins).
- (Optional) To add a tag to the flow log, choose Add new tag and specify the tag key and value.
- Choose Create flow log.
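The console steps above can also be scripted against the EC2 CreateFlowLogs API. This is a minimal sketch that builds the request parameters mirroring the console choices; the VPC ID and bucket ARN are hypothetical, and the boto3 call is shown commented out because it requires AWS credentials:

```python
def build_flow_log_request(vpc_id: str, bucket_arn: str) -> dict:
    """Build the parameters for EC2 CreateFlowLogs, mirroring the console choices above."""
    return {
        "ResourceIds": [vpc_id],
        "ResourceType": "VPC",
        "TrafficType": "ALL",            # Filter: All (accepted and rejected traffic)
        "LogDestinationType": "s3",      # Destination: Send to an S3 bucket
        "LogDestination": bucket_arn,    # S3 bucket ARN, optionally with a subfolder
        "MaxAggregationInterval": 600,   # Maximum aggregation interval: 10 minutes
    }

# Hypothetical VPC ID and bucket ARN.
params = build_flow_log_request("vpc-0abc1234", "arn:aws:s3:::my-bucket/my-logs/")
# import boto3
# response = boto3.client("ec2").create_flow_logs(**params)
```

Setting MaxAggregationInterval to 60 instead would partition flows into one-minute records, at the cost of more log files.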
Configuration
Configure the Coordimap AWS VPC Flow Logs data source with S3 log settings, AWS credentials, and crawl intervals for network flow ingestion.
Kubernetes Configuration
Configure Coordimap to crawl Kubernetes clusters with in-cluster or kubeconfig access, crawl intervals, and optional Retina flow telemetry.