Effortlessly Upload Large Files to Amazon S3 with Multipart Upload


Introduction

Amazon S3's Multipart Upload feature lets you upload a large object as a set of smaller parts. Because each part is uploaded independently, a network interruption costs you only the part that was in flight rather than the whole transfer, and parts can be sent in parallel to shorten the total upload time.

Why Use Multipart Upload for Large Files?

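Uploading a large file in a single request is fragile: if the connection drops, the whole transfer has to start over. A single PUT request is also limited to 5 GB, and AWS recommends multipart upload for objects larger than about 100 MB. Multipart Upload divides the file into parts that are uploaded independently, and in parallel if you choose, so a failure affects only one part instead of the entire file.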

Steps to Upload Large Files to S3 Using Multipart Upload

1. Initiate a Multipart Upload

Start by creating a multipart upload request to Amazon S3. The response contains an UploadId, which every later call for this upload (uploading a part, completing, or aborting) must include.

python

import boto3

s3_client = boto3.client("s3")

response = s3_client.create_multipart_upload(Bucket="your-bucket-name", Key="large-file.txt")
upload_id = response["UploadId"]


2. Divide the File into Parts and Upload

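Read the large file in chunks and upload each chunk as a separate part. Every part except the last must be at least 5 MiB, part numbers run from 1 to 10,000, and you need to keep the ETag that S3 returns for each part. The loop below uploads the parts one after another; they can also be uploaded concurrently.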

python

file_path = "path/to/large-file.txt"

# Every part except the last must be at least 5 MiB.
part_size = 5 * 1024 * 1024

# S3 needs each part's number and ETag to assemble the object later.
parts = []

with open(file_path, "rb") as file:
    part_number = 1  # part numbers start at 1

    # Read one chunk at a time so the whole file is never held in memory.
    # The := operator requires Python 3.8 or later.
    while chunk := file.read(part_size):
        response = s3_client.upload_part(
            Bucket="your-bucket-name",
            Key="large-file.txt",
            PartNumber=part_number,
            UploadId=upload_id,
            Body=chunk,
        )
        parts.append({"ETag": response["ETag"], "PartNumber": part_number})
        part_number += 1
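
A fixed 5 MiB part size stops working once a file needs more than 10,000 parts (roughly 48.8 GiB at that size). The following is a minimal sketch of how a part size could be chosen from the file size. The helper names `choose_part_size` and `part_count`, and the 8 MiB default, are assumptions made for illustration and are not part of boto3.

```python
MIN_PART_SIZE = 5 * 1024 * 1024  # 5 MiB: S3 minimum for every part except the last
MAX_PARTS = 10_000               # S3 maximum number of parts in one upload


def choose_part_size(file_size: int, preferred: int = 8 * 1024 * 1024) -> int:
    """Return a part size in bytes that keeps the upload within MAX_PARTS.

    Hypothetical helper: `preferred` is an illustrative default, not an S3 rule.
    """
    # Smallest part size that fits the whole file into MAX_PARTS parts
    # (integer ceiling division avoids float rounding on very large files).
    needed = (file_size + MAX_PARTS - 1) // MAX_PARTS
    return max(MIN_PART_SIZE, preferred, needed)


def part_count(file_size: int, part_size: int) -> int:
    """Number of parts a chunked read loop would produce for this file."""
    return (file_size + part_size - 1) // part_size
```

With this sketch, `part_size = choose_part_size(os.path.getsize(file_path))` could replace the hard-coded value, assuming `os` is imported.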


3. Complete the Multipart Upload

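Once all parts are uploaded, finalize the process by sending a complete request. S3 assembles the parts into a single object only at this point, using the list of part numbers and ETags you collected, which must be in ascending part-number order.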

python

s3_client.complete_multipart_upload(
    Bucket="your-bucket-name",
    Key="large-file.txt",
    UploadId=upload_id,
    MultipartUpload={"Parts": parts},
)

print("Upload successful!")
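
If any step fails before the complete request, the parts already uploaded stay in the bucket and continue to be billed until the upload is completed or aborted. The sketch below wraps the three steps and calls `abort_multipart_upload` on failure. The function name `upload_with_cleanup` and its `chunks` parameter (any iterable of bytes objects) are assumptions for illustration; the client is passed in so the same code works with a real `boto3` S3 client.

```python
def upload_with_cleanup(s3_client, bucket, key, chunks):
    """Run a multipart upload and abort it if any step raises.

    Hypothetical wrapper: `chunks` is any iterable of bytes objects,
    each at least 5 MiB except possibly the last.
    """
    upload_id = s3_client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        for part_number, chunk in enumerate(chunks, start=1):
            response = s3_client.upload_part(
                Bucket=bucket,
                Key=key,
                PartNumber=part_number,
                UploadId=upload_id,
                Body=chunk,
            )
            parts.append({"ETag": response["ETag"], "PartNumber": part_number})
        return s3_client.complete_multipart_upload(
            Bucket=bucket,
            Key=key,
            UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Discard the parts stored so far, then let the caller see the error.
        s3_client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

A bucket lifecycle rule that aborts incomplete multipart uploads after a set number of days is a useful safety net for cases where the process dies before this cleanup can run.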


Key Benefits of Multipart Upload

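  • Faster Uploads: Parts can be uploaded in parallel, which reduces total transfer time. The loop in step 2 uploads them one at a time, so it does not yet benefit from this.

  • Improved Reliability: If a part fails to upload, only that part needs to be re-uploaded.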

  • Efficient for Large Files: Ideal for video files, large datasets, and backups.
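
The first two benefits can be sketched together: upload parts from a thread pool, and retry only the part that failed. This is an illustrative sketch, not the only way to do it. The function names, the retry count, and the backoff delay are assumptions; `s3_client` is expected to be a `boto3` S3 client (or anything with a compatible `upload_part` method).

```python
import time
from concurrent.futures import ThreadPoolExecutor


def upload_part_with_retry(s3_client, bucket, key, upload_id, part_number, chunk, attempts=3):
    """Upload one part, retrying only this part if the call raises."""
    for attempt in range(1, attempts + 1):
        try:
            response = s3_client.upload_part(
                Bucket=bucket,
                Key=key,
                PartNumber=part_number,
                UploadId=upload_id,
                Body=chunk,
            )
            return {"ETag": response["ETag"], "PartNumber": part_number}
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the error to the caller
            time.sleep(0.1 * attempt)  # short, illustrative backoff


def upload_parts_in_parallel(s3_client, bucket, key, upload_id, chunks, max_workers=4):
    """Upload chunks concurrently; return the parts list sorted by part number."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(upload_part_with_retry, s3_client, bucket, key, upload_id, n, chunk)
            for n, chunk in enumerate(chunks, start=1)
        ]
        parts = [future.result() for future in futures]
    # The complete request expects parts in ascending part-number order.
    return sorted(parts, key=lambda part: part["PartNumber"])
```

Note that this sketch submits every chunk up front, so all chunks are held in memory at once; for very large files, each worker could instead read its own byte range from the file. In many cases you do not need to write this yourself: boto3's `upload_file` method switches to multipart upload automatically above a size threshold, and its `TransferConfig` exposes settings such as `multipart_chunksize` and `max_concurrency`.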

Conclusion

For large files, Amazon S3 Multipart Upload is usually the right choice. Splitting the file into parts limits the cost of a failed transfer to a single part, allows parallel uploads, and scales to objects far larger than a single PUT request can handle.
