Overview
Source: The System Design Newsletter — Neo Kim
Apache Kafka is a distributed event streaming platform designed for high-throughput, durable, and fault-tolerant message delivery. Originally built at LinkedIn, Kafka powers real-time data pipelines and event-driven architectures at companies like Uber, Netflix, and Airbnb.
Key Concepts
Topic — A named stream of records. Think of it as a category or feed to which records are published.
Partition — Topics are split into partitions for parallelism. Each partition is an ordered, immutable log. Messages within a partition are strictly ordered.
Offset — A unique sequential ID for each message within a partition. Consumers track their position using offsets.
Consumer Group — A group of consumers that collectively consume a topic. Each partition is assigned to exactly one consumer in the group at a time (enabling parallel processing).
Replication — Each partition has one leader and N-1 follower replicas across brokers. Followers replicate the leader's log. On leader failure, a follower is elected.
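The topic/partition/offset relationship above can be sketched as a toy in-memory model (class and method names here are illustrative, not Kafka APIs):

```python
# Toy model: a topic is a set of partitions; each partition is an
# append-only log whose list index doubles as the record's offset.
class Partition:
    def __init__(self):
        self.log = []  # ordered; records are never modified after append

    def append(self, record):
        self.log.append(record)
        return len(self.log) - 1  # offset of the newly written record


class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [Partition() for _ in range(num_partitions)]


topic = Topic("orders", num_partitions=3)
offset = topic.partitions[0].append({"order_id": 1})
print(offset)  # -> 0: offsets are sequential and unique per partition
```

Note that ordering holds only within a partition; records in different partitions of the same topic have no global order.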
Core Components
- Producer — Publishes records to a topic. Can choose partition via key hashing or round-robin.
- Broker — A Kafka server that stores partition logs and serves producers/consumers.
- ZooKeeper / KRaft — Coordinates broker metadata, leader elections, and cluster configuration. (KRaft replaces ZooKeeper in newer Kafka versions.)
- Consumer — Reads records from topic partitions. Commits offsets to track progress.
- Schema Registry — Stores Avro/Protobuf schemas for message validation and evolution.
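The producer's partition choice described above can be sketched as key hashing with a round-robin fallback. This is a simplification: the real Java client hashes key bytes with murmur2, and Python's `hash` is only stable within one process run.

```python
import itertools

_rr = itertools.count()  # round-robin counter for keyless records


def choose_partition(key, num_partitions):
    # A stable hash sends every record with the same key to the same
    # partition, preserving per-key ordering; keyless records are
    # spread evenly across partitions.
    if key is not None:
        return hash(key) % num_partitions
    return next(_rr) % num_partitions


print(choose_partition("user-1", 3) == choose_partition("user-1", 3))  # -> True
```

This is why choosing a low-cardinality key (e.g. a boolean flag) is a common mistake: all traffic lands in one or two partitions, defeating parallelism.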
Write Path
- Producer sends record to broker (leader for that partition)
- Leader appends to partition log on disk (sequential write = fast)
- Followers replicate the record
- Leader sends ACK to producer (after N replicas confirm, based on the acks config)
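The leader's ack decision in the last step can be sketched as a pure function over the common acks settings (a simulation, not client code; real Kafka also applies min.insync.replicas when acks=all):

```python
def acked(acks, confirmed_replicas, total_replicas):
    # acks=0: fire-and-forget, producer never waits.
    # acks=1: leader's own write is enough (leader counts as a replica).
    # acks="all": every in-sync replica must confirm the write.
    if acks == 0:
        return True
    if acks == 1:
        return confirmed_replicas >= 1
    if acks == "all":
        return confirmed_replicas == total_replicas
    raise ValueError(f"unknown acks setting: {acks!r}")


print(acked("all", 2, 3))  # -> False: one replica has not confirmed yet
```

The trade-off runs in one direction: higher acks means stronger durability and higher produce latency.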
Read Path
- Consumer polls broker for new records from its assigned partitions
- Broker returns a batch of records starting at the consumer's committed offset
- Consumer processes records
- Consumer commits updated offset (commit timing determines the delivery semantics — see below)
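The poll loop above can be sketched against a plain list standing in for a partition log (illustrative only, not the real consumer API):

```python
def poll(log, committed_offset, max_records=100):
    # Return the next batch of records starting at the committed offset.
    return log[committed_offset:committed_offset + max_records]


log = ["a", "b", "c", "d"]
committed = 0
while committed < len(log):
    batch = poll(log, committed, max_records=2)
    for record in batch:
        pass  # process the record here
    committed += len(batch)  # committing after processing = at-least-once
print(committed)  # -> 4
```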
Why Kafka Is Fast
- Sequential disk writes — append-only logs avoid random I/O
- Zero-copy — OS-level sendfile() transfers data from disk to network without copying it into user space
- Batching — producers and consumers batch messages to amortize overhead
- Page cache — OS page cache absorbs most reads; Kafka rarely reads from disk for recent messages
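The batching point can be sketched as a producer that buffers records and flushes when the batch fills or a linger timeout expires. The parameter names mirror Kafka's batch.size and linger.ms settings, but the logic is a deliberate simplification:

```python
import time


class BatchingProducer:
    def __init__(self, batch_size=100, linger_ms=50, send=print):
        self.batch_size = batch_size
        self.linger_s = linger_ms / 1000
        self.send = send  # one "network call" per batch, not per record
        self.buffer = []
        self.first_append = None

    def produce(self, record):
        if not self.buffer:
            self.first_append = time.monotonic()
        self.buffer.append(record)
        full = len(self.buffer) >= self.batch_size
        lingered = time.monotonic() - self.first_append >= self.linger_s
        if full or lingered:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))  # amortizes per-request overhead
            self.buffer.clear()
```

Batching trades a small bound on latency (linger.ms) for far fewer, larger requests, which is where much of Kafka's throughput comes from.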
Delivery Semantics
| Guarantee | Mechanism |
| --- | --- |
| At-most-once | Offset committed before processing; a crash can skip records |
| At-least-once | Offset committed after successful processing; a crash can cause reprocessing |
| Exactly-once | Transactional producer + idempotent consumer |
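The difference between the first two rows can be shown by simulating a crash mid-stream (a toy model where "commit" just advances an integer offset):

```python
def consume(log, commit_first, crash_during=None):
    # Returns (records actually processed, final committed offset).
    processed, committed = [], 0
    for offset, record in enumerate(log):
        if commit_first:
            committed = offset + 1  # at-most-once: commit before processing
        if offset == crash_during:
            return processed, committed  # simulate a consumer crash
        processed.append(record)
        if not commit_first:
            committed = offset + 1  # at-least-once: commit after processing
    return processed, committed


log = ["a", "b", "c"]
# At-most-once: offset 2 is committed but "b" was never processed — lost.
print(consume(log, commit_first=True, crash_during=1))   # -> (['a'], 2)
# At-least-once: "b" was not committed; a restart reprocesses it — duplicate risk.
print(consume(log, commit_first=False, crash_during=1))  # -> (['a'], 1)
```

Exactly-once closes this gap by making the commit and the processing result atomic, e.g. via Kafka transactions or an idempotent downstream write.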
Common Use Cases
- Real-time analytics pipelines
- Event sourcing and CQRS
- Log aggregation
- Change Data Capture (CDC) from databases
- Microservice communication bus