S3 Batch
Learn how to use S3 Batch Operations to copy large volumes of objects between S3 buckets, including cross-account configurations. This guide will help you migrate data efficiently and at scale.
1. Enable and configure an inventory or manifest (list of objects)
Create an S3 Inventory (optional, but recommended)
- In the Amazon S3 console, select the source bucket.
- In the "Management" tab, create an S3 Inventory that generates a periodic report (CSV or Parquet) of all objects.
- Verify that the inventory includes "Object version" information (if applicable) and "ETag".
Or use a custom manifest (custom CSV)
- Alternatively, you can create your own CSV file in the S3 Batch Operations manifest format: each line contains the bucket name and the object key, separated by a comma (an optional third column holds the version ID for versioned buckets):
SOURCE_BUCKET_NAME,object1.txt
SOURCE_BUCKET_NAME,object2.txt
...
- Upload this CSV file (manifest) to an S3 bucket that you have access to.
Note
This inventory or CSV is the "manifest" that S3 Batch will use to know which objects to copy.
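Such a manifest can also be generated programmatically. A minimal sketch that builds the CSV lines from an in-memory key list (the bucket and key names are placeholders; in practice the keys would come from listing the source bucket, e.g. with a ListObjectsV2 paginator):

```python
import csv
import io

def build_manifest(bucket: str, keys: list[str]) -> str:
    """Build an S3 Batch Operations CSV manifest: one 'bucket,key' line per object."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    for key in keys:
        writer.writerow([bucket, key])
    return buf.getvalue()

# Placeholder bucket and keys for illustration.
manifest = build_manifest("SOURCE_BUCKET_NAME", ["object1.txt", "object2.txt"])
print(manifest)
```

The resulting string is what you would upload as the manifest object; note its ETag, since the Batch job configuration requires it.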
2. Configure permissions and roles in source and destination accounts
S3 Batch Execution Role
- In the source account, create an IAM role that the S3 Batch Operations service (service principal batchoperations.s3.amazonaws.com) can assume, with permission to read from the source bucket and write to the destination bucket.
- Make sure the policy includes the s3:GetObject action for the source bucket and s3:PutObject for the destination bucket.
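As a sketch, the role's trust and permissions policies might look like the following. They are expressed here as Python dicts for readability; all bucket names and ARNs are placeholders you would replace with your own:

```python
import json

# Placeholder ARNs; substitute your own buckets.
SOURCE_BUCKET = "arn:aws:s3:::SOURCE_BUCKET_NAME"
DEST_BUCKET = "arn:aws:s3:::DEST_BUCKET_NAME"
MANIFEST_BUCKET = "arn:aws:s3:::MANIFEST_BUCKET_NAME"

# Permissions policy: read source objects (and the manifest), write copies.
role_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:GetObjectVersion"],
            "Resource": [f"{SOURCE_BUCKET}/*", f"{MANIFEST_BUCKET}/*"],
        },
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": [f"{DEST_BUCKET}/*"],
        },
    ],
}

# Trust policy: let the S3 Batch Operations service assume this role.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "batchoperations.s3.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

print(json.dumps(role_policy, indent=2))
```

Serialize each dict with json.dumps when pasting it into the IAM console or CLI.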
Policy on the destination bucket (cross-account)
- If the destination bucket is in another account, add a bucket policy that allows the s3:PutObject action for the ARN of the IAM role from the previous step.
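A sketch of that destination bucket policy, again as a Python dict (the account ID, role name, and bucket name are placeholders):

```python
import json

# Placeholder role ARN from the source account; substitute your own.
BATCH_ROLE_ARN = "arn:aws:iam::111122223333:role/s3-batch-copy-role"

dest_bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "AllowBatchCopyFromSourceAccount",
        "Effect": "Allow",
        # Grant the Batch Operations role from the other account write access.
        "Principal": {"AWS": BATCH_ROLE_ARN},
        "Action": "s3:PutObject",
        "Resource": "arn:aws:s3:::DEST_BUCKET_NAME/*",
    }],
}

print(json.dumps(dest_bucket_policy, indent=2))
```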
3. Create the S3 Batch Operations Job
Basic Job Configuration
- Enter the S3 console and select Batch Operations in the side menu.
- Create a new Job with the following basic configuration:
- Manifest: Indicate where the inventory or CSV (manifest) containing the list of objects is located.
- Operation: Select Copy.
- Destination Bucket: Choose the destination bucket (in the other account).
- IAM Role: Select the role created for this purpose (step 2.1).
Additional options (optional)
- Storage Class: Select the storage class you want for the destination objects (Standard, Standard-IA, etc.).
- Object Tags: If you want to replicate or modify tags in the process.
- Retention/Legal Hold: If compliance applies.
Review and create Job
- Verify that the settings are correct and launch the Job.
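The same job can be created with the AWS SDK instead of the console. Below is a sketch of the parameters for boto3's s3control create_job call; all ARNs, IDs, and the manifest ETag are placeholders, and the actual API call is left commented out so the snippet stays self-contained:

```python
# import boto3  # uncomment to actually submit the job

ACCOUNT_ID = "111122223333"  # placeholder source-account ID
ROLE_ARN = "arn:aws:iam::111122223333:role/s3-batch-copy-role"

job_request = {
    "AccountId": ACCOUNT_ID,
    "ConfirmationRequired": True,  # job waits for confirmation before running
    "RoleArn": ROLE_ARN,
    "Priority": 10,
    "Operation": {
        "S3PutObjectCopy": {
            # Destination bucket (may live in another account).
            "TargetResource": "arn:aws:s3:::DEST_BUCKET_NAME",
            "StorageClass": "STANDARD",
        }
    },
    "Manifest": {
        "Spec": {
            "Format": "S3BatchOperations_CSV_20180820",
            "Fields": ["Bucket", "Key"],
        },
        "Location": {
            "ObjectArn": "arn:aws:s3:::MANIFEST_BUCKET_NAME/manifest.csv",
            "ETag": "MANIFEST_ETAG",  # ETag of the uploaded manifest object
        },
    },
    "Report": {
        "Enabled": True,
        "Bucket": "arn:aws:s3:::MANIFEST_BUCKET_NAME",
        "Format": "Report_CSV_20180820",
        "Prefix": "batch-reports",
        "ReportScope": "FailedTasksOnly",
    },
}

# s3control = boto3.client("s3control")
# response = s3control.create_job(**job_request)
# print(response["JobId"])
```

Enabling the completion report (here scoped to failed tasks only) is what makes the retry step in section 5 straightforward.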
4. Monitor the process
- In the S3 Batch Operations console, locate your Job and review the status.
- Depending on the volume of objects, copying can take from minutes to several hours/days (for millions of files).
- Review progress reports and possible errors (for example, objects with access denied).
5. Validate the transfer
- Object count: Verify that the total number of files in the destination bucket matches what is expected.
- Error logs: Check error output or review CloudTrail/S3 Logs for objects that were not copied correctly.
- Re-run for failed objects: You can generate a new manifest with only the failed objects and launch another Job.
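The retry manifest can be derived from the job's completion report. A minimal sketch under the assumption of a simplified CSV report layout (bucket, key, version ID, task status, ... per row); the real report includes additional columns such as the error code and result message:

```python
import csv
import io

def retry_manifest(report_csv: str) -> str:
    """Filter a completion report down to failed objects and emit
    a new 'bucket,key' manifest for a follow-up Batch job."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in csv.reader(io.StringIO(report_csv)):
        bucket, key, _version, status = row[0], row[1], row[2], row[3]
        if status != "succeeded":
            writer.writerow([bucket, key])
    return out.getvalue()

# Toy report: one successful copy, one access-denied failure.
report = (
    "src-bucket,ok.txt,,succeeded,,200,\n"
    "src-bucket,denied.txt,,failed,AccessDenied,403,Access Denied\n"
)
print(retry_manifest(report))
```

Upload the resulting manifest and create a second job with the same configuration as in step 3.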
Summary of key points
- Manifest: Properly prepare the inventory or list of objects.
- Permissions: Ensure you have an IAM role with policies that allow s3:GetObject on the source and s3:PutObject on the destination (cross-account).
- Batch Job: Configure the "Copy" operation with the correct role.
- Monitoring: Review S3 Batch logs and reports, and retry failed objects if necessary.