A Scalable System for Media Uploads and Serving
Shashank Rajak
Sep 1, 2025
5 min read

Handling media, such as images and videos, is a common requirement for any modern application. Whether it's uploading a simple profile picture or building a highly scalable service like YouTube or Instagram, where high-quality media is the backbone of the product, you will need a scalable and efficient approach to designing the system that handles it.
Building a robust system to manage this can be tricky, especially when dealing with large files and high user traffic. The good news is that with a serverless architecture on a platform like AWS (or any other cloud service), you can create a highly scalable, cost-effective, and reliable solution. Let's break down the key components of such a system.
The Old School Method
The traditional method of file uploads involves the client (e.g., a web browser or mobile app) sending the file directly to your backend server via REST APIs, which then processes and stores it. This approach can be problematic for several reasons:
Server Strain: The backend server must handle the entire file payload, consuming valuable processing power and memory that could be used for other tasks.
Latency & Reliability: Large files can take a long time to upload, increasing the risk of timeouts and connection issues.
Cost: You're paying for compute resources on your server to handle a task that can be easily offloaded to managed services specially designed for this purpose.
Also, most services now sit behind some sort of API gateway, which restricts the size of the payload you can transfer in an API request. This can easily become a bottleneck if you plan to send large media files via this route.
Modern Approach
A far more efficient solution, especially for large files, is the direct-to-S3 upload. Instead of proxying the file through your backend, the client requests a temporary, secure link from your backend service. This link, known as a pre-signed URL, grants the client direct permission to upload a specific file to a specific S3 bucket for a limited time. This approach has several key benefits:
Offloads Work from Your Server: The backend only handles the metadata and the generation of the pre-signed URL, not the large file payload itself. This frees up your backend to focus on core business logic.
Scalability: AWS S3 is built to handle massive scale. The service automatically handles the ingestion of file uploads, eliminating any concerns about your server's capacity.
Improved Performance: The client uploads the file directly to S3, which is optimized for high-speed data transfer. This often results in faster and more reliable uploads. You can also support resumable uploads by splitting the media into several chunks; this speeds up the transfer, and in case of a network failure the client can resume from the last successful chunk instead of starting over.
The Architecture in Action
The design for this system can be broken down into two main phases: the Upload Flow and the Serving Flow.
The Upload Flow
Client Request: The client initiates an upload by sending a POST request to /uploads on your API Gateway. This request includes essential metadata about the media file, such as its name, size, and MIME type.
API Gateway & Backend Service: The API Gateway forwards this request to a Backend Service (e.g., an AWS Lambda function or a containerized service). The backend service's job is to:
Validate the request and ensure the user has permission to upload.
Check that the requested media type is one your service supports.
Generate a unique filename or key for the S3 object.
Store the initial metadata about the file in a Database (e.g., DynamoDB or Postgres). This record can include a status field, initially set to PENDING.
Generate a pre-signed URL for a PUT operation on S3.
Return the pre-signed URL and the unique MediaId to the client.
Direct Upload: The client uses the pre-signed URL to perform a PUT request, uploading the media file directly to a designated "uploads" S3 bucket.
Event-Driven Processing: This is where the magic of serverless design comes in. When the file is successfully uploaded to the S3 bucket, it triggers an event notification. This event can be used to invoke another service, such as an AWS Lambda function, to process the file. This function can perform tasks like:
Transcoding videos into different formats and resolutions.
Generating thumbnails or different sizes of images.
Scanning for viruses or inappropriate content.
Updating the record in the Database from PENDING to PROCESSED and adding the new processed file details.
Final Storage: The processed media files are saved to a separate S3 bucket, ready to be served.
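The event-driven step can be sketched as a Lambda handler that reads the bucket and key out of the S3 event and flips the record's status (the in-memory DB dict stands in for DynamoDB/Postgres, and the actual transcoding/thumbnailing work is elided):

```python
# Stand-in for the metadata table; keyed by the S3 object key here for brevity.
DB = {}

def handler(event, context=None):
    """Triggered by an S3 ObjectCreated event on the uploads bucket."""
    for rec in event["Records"]:
        bucket = rec["s3"]["bucket"]["name"]
        key = rec["s3"]["object"]["key"]

        # ... transcode / generate thumbnails / scan the object here ...
        processed_key = key.replace("uploads/", "processed/", 1)

        # Update the record from PENDING to PROCESSED with the new details.
        DB[key] = {
            "status": "PROCESSED",
            "sourceBucket": bucket,
            "processedKey": processed_key,
        }
    return {"processed": len(event["Records"])}

# A trimmed-down version of the S3 event notification shape:
event = {
    "Records": [
        {"s3": {"bucket": {"name": "media-uploads"},
                "object": {"key": "uploads/abc123/cat.png"}}}
    ]
}
result = handler(event)
```

Note that a real S3 event carries more fields (event name, timestamps, URL-encoded keys); only the parts this sketch reads are shown.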
The Serving Flow
Once the media is processed and ready, serving it is a straightforward process.
Client Request: A client requests the media using its unique MediaId.
CDN Integration: For high-performance delivery, you should serve media through a Content Delivery Network (CDN) like Amazon CloudFront. The CDN caches the media files at edge locations around the globe and also shields your S3 bucket from direct public access. Publicly exposed buckets have led to surprise bills and data leaks in plenty of well-documented cases.
Fast Delivery: The first time a user requests a file, the CDN fetches it from the S3 bucket. Subsequent requests from other users in the same geographical area will be served directly from the CDN's cache, resulting in extremely low latency and reduced load on your S3 bucket.
Geo-restriction: Another useful capability is restricting your media to certain geolocations. For example, certain shows on Netflix aren't available in some countries; this kind of restriction is easy to implement with CloudFront's geo-restriction feature.
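As an illustration, geo-restriction in CloudFront is configured on the distribution itself. The fragment below is the relevant piece of a distribution config (country codes chosen arbitrarily for the example); it blocks requests originating from the listed countries:

```json
{
  "Restrictions": {
    "GeoRestriction": {
      "RestrictionType": "blacklist",
      "Quantity": 2,
      "Items": ["KP", "RU"]
    }
  }
}
```

Switching RestrictionType to "whitelist" inverts the rule, allowing only the listed countries.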
This architecture creates a powerful, decoupled, and highly scalable pipeline for handling media. By leveraging serverless services, you can build a system that automatically scales with your user base without the burden of managing and maintaining servers.