Overview
Source: The System Design Newsletter — Neo Kim
Apache Kafka is a distributed event streaming platform designed for high-throughput, durable, and fault-tolerant message delivery. Originally built at LinkedIn, Kafka powers real-time data pipelines and event-driven architectures at companies like Uber, Netflix, and Airbnb.
Key Concepts
Topic — A named stream of records. Think of it as a category or feed to which records are published.
Partition — Topics are split into partitions for parallelism. Each partition is an ordered, immutable log. Messages within a partition are strictly ordered.
Offset — A unique sequential ID for each message within a partition. Consumers track their position using offsets.
Consumer Group — A group of consumers that collectively consume a topic. Each partition is assigned to exactly one consumer in the group at a time (enabling parallel processing).
Replication — Each partition has one leader and N-1 follower replicas across brokers. Followers replicate the leader's log. On leader failure, a follower is elected.
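The topic/partition/offset relationship above can be sketched as a toy in-memory model (class and method names here are illustrative, not Kafka APIs):

```python
# Toy model: a topic is a set of partitions; each partition is an
# append-only log whose list index doubles as the record's offset.
class Partition:
    def __init__(self):
        self.log = []  # ordered; records are never modified after append

    def append(self, record):
        self.log.append(record)
        return len(self.log) - 1  # offset of the newly written record


class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


topic = Topic("orders", num_partitions=3)
offset = topic.partitions[0].append({"order_id": 1})
print(offset)  # -> 0: offsets are sequential and unique per partition
```

Note that ordering holds only within a partition; records in different partitions of the same topic have no global order.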
Core Components
- Producer — Publishes records to a topic. Can choose partition via key hashing or round-robin.
- Broker — A Kafka server that stores partition logs and serves producers/consumers.
- ZooKeeper / KRaft — Coordinates broker metadata, leader elections, and cluster configuration. (KRaft replaces ZooKeeper in newer Kafka versions.)
- Consumer — Reads records from topic partitions. Commits offsets to track progress.
- Schema Registry — Stores Avro/Protobuf schemas for message validation and evolution.
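The producer's partition choice described above can be sketched as key hashing with a round-robin fallback. This is a simplification: the real Java client hashes key bytes with murmur2, and Python's `hash` is only stable within one process run.

```python
import itertools

_rr = itertools.count()  # round-robin counter for keyless records


def choose_partition(key, num_partitions):
    # A stable hash sends every record with the same key to the same
    # partition, preserving per-key ordering; keyless records are
    # spread evenly across partitions.
    if key is not None:
        return hash(key) % num_partitions
    return next(_rr) % num_partitions


print(choose_partition("user-1", 3) == choose_partition("user-1", 3))  # -> True
```

This is why choosing a low-cardinality key (e.g. a boolean flag) is a common mistake: all traffic lands in one or two partitions, defeating parallelism.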
Write Path
- Producer sends record to broker (leader for that partition)
- Leader appends to partition log on disk (sequential write = fast)
- Followers replicate the record
- Leader sends ACK to producer (after N replicas confirm, based on the acks config)
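The leader's ack decision in the last step can be sketched as a pure function over the common acks settings (a simulation, not client code; real Kafka also applies min.insync.replicas when acks=all):

```python
def acked(acks, confirmed_replicas, total_replicas):
    # acks=0: fire-and-forget, producer never waits.
    # acks=1: leader's own write is enough (leader counts as a replica).
    # acks="all": every in-sync replica must confirm the write.
    if acks == 0:
        return True
    if acks == 1:
        return confirmed_replicas >= 1
    if acks == "all":
        return confirmed_replicas == total_replicas
    raise ValueError(f"unknown acks setting: {acks!r}")


print(acked("all", 2, 3))  # -> False: one replica has not confirmed yet
```

The trade-off runs in one direction: higher acks means stronger durability and higher produce latency.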
Read Path
- Consumer polls broker for new records from its assigned partitions
- Broker returns a batch of records starting at the consumer's committed offset
- Consumer processes records
- Consumer commits updated offset (commit timing determines the delivery semantics — see below)
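The poll loop above can be sketched against a plain list standing in for a partition log (illustrative only, not the real consumer API):

```python
def poll(log, committed_offset, max_records=100):
    # Return the next batch of records starting at the committed offset.
    return log[committed_offset:committed_offset + max_records]


log = ["a", "b", "c", "d"]
committed = 0
while committed < len(log):
    batch = poll(log, committed, max_records=2)
    for record in batch:
        pass  # process the record here
    committed += len(batch)  # committing after processing = at-least-once
print(committed)  # -> 4
```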
Why Kafka Is Fast
- Sequential disk writes — append-only logs avoid random I/O
- Zero-copy — OS-level sendfile() transfers data from disk to network without copying it into user space
- Batching — producers and consumers batch messages to amortize overhead
- Page cache — OS page cache absorbs most reads; Kafka rarely reads from disk for recent messages
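The batching point can be sketched as a producer that buffers records and flushes when the batch fills or a linger timeout expires. The parameter names mirror Kafka's batch.size and linger.ms settings, but the logic is a deliberate simplification:

```python
import time


class BatchingProducer:
    def __init__(self, batch_size=100, linger_ms=50, send=print):
        self.batch_size = batch_size
        self.linger_s = linger_ms / 1000
        self.send = send  # one "network call" per batch, not per record
        self.buffer = []
        self.first_append = None

    def produce(self, record):
        if not self.buffer:
            self.first_append = time.monotonic()
        self.buffer.append(record)
        full = len(self.buffer) >= self.batch_size
        lingered = time.monotonic() - self.first_append >= self.linger_s
        if full or lingered:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))  # amortizes per-request overhead
            self.buffer.clear()
```

Batching trades a small bound on latency (linger.ms) for far fewer, larger requests, which is where much of Kafka's throughput comes from.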
Delivery Semantics
| Guarantee | Mechanism |
| --- | --- |
| At-most-once | Offset committed before processing; a crash can skip records |
| At-least-once | Offset committed after successful processing; a crash can cause reprocessing |
| Exactly-once | Transactional producer + idempotent consumer |
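The difference between the first two rows can be shown by simulating a crash mid-stream (a toy model where "commit" just advances an integer offset):

```python
def consume(log, commit_first, crash_during=None):
    # Returns (records actually processed, final committed offset).
    processed, committed = [], 0
    for offset, record in enumerate(log):
        if commit_first:
            committed = offset + 1  # at-most-once: commit before processing
        if offset == crash_during:
            return processed, committed  # simulate a consumer crash
        processed.append(record)
        if not commit_first:
            committed = offset + 1  # at-least-once: commit after processing
    return processed, committed


log = ["a", "b", "c"]
# At-most-once: offset 2 is committed but "b" was never processed — lost.
print(consume(log, commit_first=True, crash_during=1))   # -> (['a'], 2)
# At-least-once: "b" was not committed; a restart reprocesses it — duplicate risk.
print(consume(log, commit_first=False, crash_during=1))  # -> (['a'], 1)
```

Exactly-once closes this gap by making the commit and the processing result atomic, e.g. via Kafka transactions or an idempotent downstream write.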
Common Use Cases
- Real-time analytics pipelines
- Event sourcing and CQRS
- Log aggregation
- Change Data Capture (CDC) from databases
- Microservice communication bus