Handling Large Blobs
📁 Large files like videos, images, and documents need special handling in distributed systems. Instead of shoving gigabytes through your servers, this pattern uses presigned URLs to let clients upload directly to blob storage and download from CDNs. You also get resumable uploads, parallel transfers, and progress tracking - the stuff that separates real systems from toy projects.
The Problem
If you've been studying for system design interviews, you already know that large files belong in blob storage like S3, not in databases. This separation lets storage scale independently from compute and keeps database performance snappy.
Databases are great at structured data with complex queries but terrible with large binary objects. A 100MB file stored as a BLOB drags down query performance, inflates backup times, and slows replication. Object stores like S3 are built for this: effectively unlimited capacity, 99.999999999% (11 nines) durability, and pay-for-what-you-store pricing. As a general rule of thumb, if it's over 10MB and doesn't need SQL queries, it should probably be in blob storage.
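To make the split concrete, here's a minimal sketch of the metadata-in-the-database, bytes-in-blob-storage pattern. The bucket name, table schema, and `db` handle are placeholders; the point is that the database row holds only a pointer (the object key) plus a few small fields, never the file contents. How the bytes actually reach S3 is the transfer problem discussed next.

```python
import uuid

import boto3  # AWS SDK for Python; any S3-compatible client works similarly

s3 = boto3.client("s3")
BUCKET = "my-app-uploads"  # hypothetical bucket name


def store_video(db, user_id: str, data: bytes, content_type: str) -> str:
    """Write the bytes to blob storage; keep only a small metadata row in the database."""
    object_key = f"videos/{user_id}/{uuid.uuid4()}"

    # The large payload goes to the object store...
    s3.put_object(Bucket=BUCKET, Key=object_key, Body=data, ContentType=content_type)

    # ...while the database row stays tiny: metadata and a pointer, not the bytes.
    db.execute(
        "INSERT INTO videos (user_id, s3_key, size_bytes, content_type) "
        "VALUES (%s, %s, %s, %s)",
        (user_id, object_key, len(data), content_type),
    )
    return object_key
```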
While blob storage solved the storage problem, it didn't solve the data transfer problem. The standard approach routes file bytes through your application servers. A client uploads a 2GB video, the API server receives it, then forwards it to blob storage. Same for downloads - blob storage sends to the API server, which forwards to the client. This works for small files, but breaks down as files get bigger.
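Here's a sketch of that pass-through approach, assuming Flask and boto3 (the routes and bucket name are placeholders, not a prescribed API). Every byte flows through the application server in both directions, which is exactly the bottleneck the presigned-URL pattern removes.

```python
import boto3
from flask import Flask, request

app = Flask(__name__)
s3 = boto3.client("s3")
BUCKET = "my-app-uploads"  # hypothetical bucket name


@app.route("/videos", methods=["POST"])
def upload_video():
    # Every byte of the 2GB file lands on this server first...
    file = request.files["file"]

    # ...and is then forwarded to blob storage, so the payload crosses
    # the application tier twice: client -> API server -> S3.
    s3.upload_fileobj(file.stream, BUCKET, f"videos/{file.filename}")
    return {"status": "uploaded"}, 201


@app.route("/videos/<path:key>", methods=["GET"])
def download_video(key):
    # Downloads have the same shape in reverse: S3 -> API server -> client.
    obj = s3.get_object(Bucket=BUCKET, Key=f"videos/{key}")
    return obj["Body"].read(), 200, {"Content-Type": obj["ContentType"]}
```

The proxying itself isn't buggy; it just ties every transfer to your server's memory, bandwidth, and request timeouts, which is why it stops scaling once files reach the hundreds of megabytes.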