Upload Large files in chunks to AWS-S3 using Nodejs

I recently set this up because uploading large files in a single request was taking too long.

In this post, I will show you how to use Multer, the AWS SDK for JavaScript v3 (@aws-sdk), and Express.js to upload files to AWS S3 in a more streamlined and efficient way. By dividing the data into smaller “chunks” and uploading them in parallel, we get a smoother and faster transfer.

This article is a step-by-step guide to uploading large files to AWS S3 in chunks with Node.js. It covers configuring the S3Client, writing a controller for chunked uploads, and monitoring upload progress. By the end, you will be able to streamline your file upload process and manage large uploads efficiently.

First, we need to initialize a Node.js project.

npm init --yes

Next, install only the required packages, rather than the entire @aws-sdk suite.

npm install multer express @aws-sdk/lib-storage @aws-sdk/client-s3

Config File

To begin, let’s focus on the configuration file. We import the S3Client and create an instance configured with the appropriate region, accessKeyId, and secretAccessKey.

// config/index.js

const { S3Client } = require("@aws-sdk/client-s3")

module.exports.s3 = new S3Client({
  region: process.env.AWS_REGION,
  credentials: {
    accessKeyId: process.env.AWS_ACCESS_KEY_ID,
    secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY,
  },
})
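The client reads its region and credentials from environment variables, so make sure those are set before the server starts. As a quick sketch, you could export them in your shell (the values below are placeholders), or load them from a .env file with a package such as dotenv; if you omit the credentials object entirely, the SDK falls back to its default credential provider chain.

export AWS_REGION="us-east-1"
export AWS_ACCESS_KEY_ID="YOUR_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="YOUR_SECRET_ACCESS_KEY"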

Controller File

Our next task is to design a controller that will enable us to upload files directly to S3 in smaller chunks, in a parallel fashion. To accomplish this, we will take it one step at a time. Our initial step is to supply the params object to the Upload class, which we will import from the “@aws-sdk/lib-storage” package.

// params for s3 upload
const params = {
  Bucket: "BUCKET_NAME",
  Key: file.originalname,
  Body: file.buffer,
}

Here, “Bucket” is the name of the bucket the file will be stored in (replace BUCKET_NAME with your own bucket name), “Key” is the unique identifier of the object within the bucket, which in our case is the original file name, and “Body” is the file content to be uploaded.
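One thing to keep in mind: because the original file name is used as the Key, uploading a second file with the same name will overwrite the first object. If that matters for your use case, a small variation (not part of the original code) is to prefix the Key with something unique:

// hypothetical variation: prefix the Key with a timestamp so repeated uploads don't overwrite each other
const params = {
  Bucket: "BUCKET_NAME", // replace with your bucket name
  Key: `${Date.now()}-${file.originalname}`,
  Body: file.buffer,
}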

const uploadParallel = new Upload({
  client: s3,
  queueSize: 4, // optional concurrency configuration
  partSize: 1024 * 1024 * 5, // optional size of each part, minimum 5 MiB
  leavePartsOnError: false, // optional manually handle dropped parts
  params,
})

The code above is the core of uploading a file to S3 in parallel by dividing it into smaller chunks. It takes two mandatory parameters: the S3 client (which we set up in the config/index.js file) and the params object containing the Bucket, Key, and Body of the file to be uploaded.

Additionally, we can set optional parameters such as queueSize (how many parts are uploaded concurrently) and partSize (the size of each chunk, which must be at least 5 MiB for every part except the last).
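For very large files you may want to size the parts dynamically, because S3 allows at most 10,000 parts per multipart upload. A rough sketch, assuming file is the Multer file object from the controller below:

// sketch: keep each part at least 5 MiB, but large enough to stay under S3's 10,000-part limit
const MIN_PART_SIZE = 5 * 1024 * 1024
const MAX_PARTS = 10000
const partSize = Math.max(MIN_PART_SIZE, Math.ceil(file.size / MAX_PARTS))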

The Upload instance also emits events that allow for progress tracking, and its done() method returns a promise that resolves when the upload completes or rejects if it fails.

// checking progress of upload
uploadParallel.on("httpUploadProgress", progress => {
  console.log(progress)
})

// wait for the upload to complete
const data = await uploadParallel.done()
console.log("upload completed!", { data })
res.send(data)
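Each progress event includes, among other fields, the number of bytes uploaded so far. As a small sketch, you could turn that into a percentage (total may be undefined when the content length is not known up front):

// sketch: log upload progress as a percentage when the total size is known
uploadParallel.on("httpUploadProgress", ({ loaded, total }) => {
  if (total) {
    console.log(`uploaded ${((loaded / total) * 100).toFixed(1)}%`)
  }
})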

Our final controller code is:

// controller/index.js

const { Upload } = require("@aws-sdk/lib-storage")
const { s3 } = require("../config")

// upload file to s3 in parallel chunks
module.exports.uploadFileController = async (req, res) => {
  const file = req.file

  // params for s3 upload
  const params = {
    Bucket: "BUCKET_NAME", // replace with your bucket name
    Key: file.originalname,
    Body: file.buffer,
  }

  try {
    // upload file to s3 in parallel chunks
    // every part except the last must be at least 5 MiB
    const uploadParallel = new Upload({
      client: s3,
      queueSize: 4, // optional concurrency configuration
      partSize: 1024 * 1024 * 5, // optional size of each part, minimum 5 MiB
      leavePartsOnError: false, // optional manually handle dropped parts
      params,
    })

    // checking progress of upload
    uploadParallel.on("httpUploadProgress", progress => {
      console.log(progress)
    })

    // wait for the upload to finish so any error is caught below
    const data = await uploadParallel.done()
    console.log("upload completed!", { data })
    res.send(data)
  } catch (error) {
    res.status(500).send({
      success: false,
      message: error.message,
    })
  }
}
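If the request does not include a file, req.file will be undefined and the code above will throw. A simple guard (not part of the original controller) could be added right after reading req.file:

// hypothetical guard: reject requests that don't include a file
if (!file) {
  return res.status(400).send({ success: false, message: "No file uploaded" })
}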

Server.js

Our main server.js file is:

// server.js

const express = require("express")
const multer = require("multer")
const app = express()

// controllers
const { uploadFileController } = require("./controller")

// Set up Multer middleware to handle file uploads
// by default, multer will store files in memory
const upload = multer()

// Handle the file upload
app.post("/upload", upload.single("file"), uploadFileController)

// Start the server
app.listen(5000, () => {
  console.log("Server listening on port 5000")
})

The multer() middleware is used here to handle multipart/form-data, the encoding browsers and HTTP clients use for file uploads.

By default, multer keeps each uploaded file in memory as a Buffer (available as req.file.buffer) and releases it once the request has been handled.
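With the server running locally, you can test the endpoint with curl; note that the form field name must match the one passed to upload.single ("file"), and the file path below is just a placeholder:

curl -F "file=@/path/to/large-file.zip" http://localhost:5000/upload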

Up to this point, we have covered the necessary steps to upload a file to AWS S3 in chunks using Node.js.

If you found this article helpful, please let me know in the comments section.
