How to download a folder from S3 using the AWS CLI

The AWS CLI can download a folder directly from an AWS S3 bucket to your local machine.

To get started, make sure you have the AWS CLI installed, then create a folder such as ~/data on your local machine to store your S3 bucket downloads.
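As a quick check before downloading anything, the setup step might look something like the sketch below. It assumes a POSIX shell. The DEST variable is a name chosen here purely for illustration and defaults to the ~/data folder mentioned above.

```shell
# Destination folder for the downloads (DEST is an illustrative name).
DEST="${DEST:-$HOME/data}"

# Confirm the AWS CLI is on the PATH before going any further.
if command -v aws >/dev/null 2>&1; then
  aws --version
else
  echo "AWS CLI not found, install it before continuing" >&2
fi

# Create the folder if it does not already exist.
mkdir -p "$DEST"
```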

Using the aws s3 cp [bucketURI] [localDirPath] command you can download a single file directly from an S3 bucket to your local machine, but to make this work with folders or directories we also need to pass the --recursive flag.

The command below tells the CLI to recursively download all files and folders under the S3 bucket URI to the ~/data directory on our local machine.

aws s3 cp s3://your-s3-bucket/path ~/data --recursive
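If you run this command often, one option is to wrap it in a small shell function that catches a malformed source before the CLI is called. This is only a sketch: download_folder is a hypothetical name, and the check simply requires the source to start with s3://.

```shell
# download_folder SRC DEST
# Recursively copies everything under an S3 URI into a local folder.
# (download_folder is a hypothetical helper, not part of the AWS CLI.)
download_folder() {
  src=$1
  dest=$2

  # Reject anything that is not an s3:// URI before calling the CLI.
  case $src in
    s3://*) ;;
    *)
      echo "expected an s3:// URI, got: $src" >&2
      return 2
      ;;
  esac

  aws s3 cp "$src" "$dest" --recursive
}

# Example call, using the placeholder bucket from this post:
# download_folder s3://your-s3-bucket/path ~/data
```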

Performing a dry run

If it’s a large folder with a lot of files, you may wish to do a dry run first by passing the --dryrun flag. This displays the operations that would be performed without actually transferring any files, which helps catch mistakes such as a wrong path before any data is moved.

aws s3 cp s3://your-s3-bucket/path ~/data --recursive --dryrun
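For a large folder it can be handy to know how many files a dry run would transfer. The sketch below counts them by reading the dry run output from standard input. It assumes each planned transfer is printed on a line starting with "(dryrun) download:", which is how the CLI reports downloads as far as I know, so check this against your own output. The function name count_planned_downloads is made up for this example.

```shell
# Counts the planned downloads in dry run output read from stdin.
# Assumption: each planned transfer starts with "(dryrun) download:".
count_planned_downloads() {
  # grep -c exits non-zero when the count is 0, so ignore that status.
  grep -c '^(dryrun) download:' || true
}

# Example usage with the dry run flag, using the placeholder bucket from this post:
# aws s3 cp s3://your-s3-bucket/path ~/data --recursive --dryrun | count_planned_downloads
```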

Filtering file types

By default, downloading with the --recursive flag includes every file under the S3 bucket location. If you only want files of a certain type, you can filter them with the --exclude and --include flags.

It's important to note that to use the --include flag correctly you first have to exclude all files with --exclude "*", then add an --include flag for each of your chosen file types. The order matters because filters that appear later in the command take precedence over earlier ones.

The example below will recursively download all files with a .csv extension from the specified S3 bucket location.
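To see why the order matters, here is a simplified model of the filter logic written as a shell function. It is a sketch for illustration, not the CLI's actual implementation: would_transfer is a hypothetical name, and shell case patterns stand in for the CLI's pattern matching. Every file starts out included, and each filter that matches overrides the decision made by the ones before it.

```shell
# would_transfer KEY [--exclude PATTERN | --include PATTERN]...
# Succeeds if KEY would be transferred under the given filters.
# (A simplified model for illustration only.)
would_transfer() {
  key=$1
  shift
  included=yes   # everything is included by default

  # Apply the filters left to right, so a later match wins.
  while [ $# -ge 2 ]; do
    case $1 in
      --exclude)
        case $key in $2) included=no ;; esac
        ;;
      --include)
        case $key in $2) included=yes ;; esac
        ;;
    esac
    shift 2
  done

  [ "$included" = yes ]
}

# Exclude everything, then include CSV files: report.csv is transferred.
would_transfer report.csv --exclude "*" --include "*.csv" && echo "transferred"

# Reversed order: the later --exclude "*" wins, so nothing is transferred.
would_transfer report.csv --include "*.csv" --exclude "*" || echo "skipped"
```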

aws s3 cp s3://your-s3-bucket/path ~/data --recursive --exclude "*" --include "*.csv"

To download multiple file types in a single request you can pass additional --include flags, as in the example below, which downloads both .csv and .xls files.

aws s3 cp s3://your-s3-bucket/path ~/data --recursive --exclude "*" --include "*.csv" --include "*.xls"
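After a filtered download you may want to confirm that only the expected file types arrived. The sketch below lists any file in a folder that is neither .csv nor .xls, using the standard find command. The function name unexpected_files is made up for this example, and the check only makes sense if the folder held no other files beforehand.

```shell
# Lists files under a folder whose extension is neither .csv nor .xls.
# Prints nothing when every file matches one of the two types.
unexpected_files() {
  find "$1" -type f ! -name '*.csv' ! -name '*.xls'
}

# Example usage:
# unexpected_files ~/data
```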

References

You can read more about the available flags and options in the official AWS CLI documentation.

Written by Ben Barber

A software engineer and data-driven trader from Cambridge, UK. Currently working full-time on developing trading algorithms for the US equity and options markets.
