Design a Distributed Cache

Understanding the Problem

💾 What is a Distributed Cache?

A distributed cache is a system that stores data as key-value pairs in memory across multiple machines in a network. Unlike single-node caches that are limited by the resources of one machine, distributed caches scale horizontally across many nodes to handle massive workloads. The cache cluster works together to partition and replicate data, ensuring high availability and fault tolerance when individual nodes fail.

Functional Requirements

Core Requirements

  1. Users should be able to set, get, and delete key-value pairs.
  2. Users should be able to configure the expiration time for key-value pairs.
  3. Data should be evicted according to Least Recently Used (LRU) policy.

Below the line (out of scope)

  • Users should be able to configure the cache size.
info

We opted for an LRU eviction policy, but you'll want to ask your interviewer what they're looking for if they weren't explicitly upfront. There are, of course, other eviction policies you could implement, like LFU, FIFO, and custom policies.

Non-Functional Requirements

At this point in the interview, you should ask the interviewer what sort of scale we are expecting. This will have a big impact on your design, starting with how you define the non-functional requirements.

If I were your interviewer, I would say we need to store up to 1TB of data and expect to handle a peak of up to 100k requests per second.

Core Requirements

  1. The system should be highly available. Eventual consistency is acceptable.
  2. The system should support low latency operations (< 10ms for get and set requests).
  3. The system should be scalable to support the expected 1TB of data and 100k requests per second.
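To sanity-check that scalability target, a quick back-of-envelope estimate is worth doing in the interview. The per-node figures below are assumptions for illustration (usable memory and throughput depend entirely on your hardware and workload), and the result is before any replication:

```python
total_data_gb = 1024          # 1 TB of cached data (from the requirements)
peak_rps = 100_000            # peak requests per second (from the requirements)

# Assumed per-node figures -- tune for your actual hardware:
usable_memory_gb = 32         # memory available for cache data per node
node_throughput_rps = 20_000  # requests a single node can comfortably serve

nodes_for_memory = -(-total_data_gb // usable_memory_gb)    # ceiling division
nodes_for_throughput = -(-peak_rps // node_throughput_rps)  # ceiling division
nodes_needed = max(nodes_for_memory, nodes_for_throughput)

print(nodes_for_memory, nodes_for_throughput, nodes_needed)
```

Under these assumptions the cluster is memory-bound (32 nodes for storage vs. 5 for throughput), which is a common shape for caches and tells you partitioning, not raw request capacity, is the first-order design problem.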

Below the line (out of scope)

  • Durability (data persistence across restarts)
  • Strong consistency guarantees
  • Complex querying capabilities
  • Transaction support
tip

Note that I'm making quite a few strong assumptions about what we care about here. Make sure you confirm them with your interviewer. Chances are you've used a cache before, so you know the plethora of potential trade-offs. Some interviewers might care about durability, for example; just ask.

Pattern: Scaling Reads

Hot key scenarios in distributed caches are a perfect illustration of the challenges of scaling reads. When millions of users simultaneously request the same viral content, the usual assumption that load spreads evenly across the keyspace breaks down: every request for that one key lands on the same node.
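One common mitigation is to fan a known hot key out across several suffixed copies, so that under hash-based partitioning the copies (likely) land on different nodes and reads spread across them. A sketch, where the fan-out factor and helper names are illustrative assumptions:

```python
import random

NUM_HOT_COPIES = 4  # assumed fan-out factor for keys identified as hot

def write_hot_key(cache, key, value):
    # Write the same value under several suffixed keys; each suffix hashes
    # differently, so the copies land on (likely) different cache nodes.
    for i in range(NUM_HOT_COPIES):
        cache.set(f"{key}#{i}", value)

def read_hot_key(cache, key):
    # Each reader picks a random copy, spreading read load across the nodes
    # that hold the copies instead of hammering a single one.
    return cache.get(f"{key}#{random.randrange(NUM_HOT_COPIES)}")
```

The trade-off is write amplification and a window of inconsistency between copies after an update, which is acceptable here given our eventual-consistency requirement.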
