Key Technologies

ZooKeeper

Learn how you can use ZooKeeper to solve a wide range of coordination problems in system design.

Coordinating distributed systems is hard. While processing power and scaling techniques have evolved dramatically, the fundamental problem remains: how do you orchestrate dozens or hundreds of servers to work together seamlessly? When these machines need to elect leaders, maintain consistent configurations, and detect failures in real time, you face the exact problems that ZooKeeper was designed to solve.

Released in 2008, ZooKeeper is showing its age, and numerous alternatives have emerged. Nevertheless, it remains central to many systems, particularly in the Apache ecosystem.

Even so, understanding ZooKeeper teaches essential distributed systems concepts that apply even if you never use it directly. By learning how ZooKeeper handles coordination through a few simple primitives (a hierarchical namespace, data nodes, and watches), you gain insight into solving universal problems like consensus, leader election, and configuration management.

Let's walk through how ZooKeeper works, when you should use it, and how it's evolving in today's landscape of distributed systems.

A Motivating Example

To understand why coordination is tough, let's start with an example. Imagine you're building a chat application.

Initially, your chat app runs on a single server. Life is simple. When Alice sends a message to Bob, both users are connected to the same server, so the server knows exactly where to deliver the message. Everything lives in memory, latency is low, and no coordination is needed.

Single server chat app
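To make the single-server case concrete, here is a minimal sketch of in-memory message routing. The class and method names (`SingleServerChat`, `connect`, `send`) are hypothetical and not from any real chat framework, and a plain list stands in for each user's network connection.

```python
class SingleServerChat:
    """Hypothetical single-server chat: all routing state lives in one process."""

    def __init__(self):
        # Maps user_id -> inbox. A list stands in for a real socket (assumption).
        self.connections = {}

    def connect(self, user_id):
        """Register a user and return the inbox that represents their connection."""
        inbox = []
        self.connections[user_id] = inbox
        return inbox

    def send(self, sender, recipient, text):
        """Deliver a message if the recipient is connected to this server."""
        inbox = self.connections.get(recipient)
        if inbox is None:
            return False  # recipient is not connected here
        inbox.append((sender, text))
        return True


server = SingleServerChat()
server.connect("alice")
bob_inbox = server.connect("bob")
delivered = server.send("alice", "bob", "hi Bob")
```

Delivery is a single dictionary lookup because every connection is known to the same process. That lookup is exactly what stops being trivial once users are spread across several servers.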
