Overview
Source: The System Design Newsletter — Neo Kim
WhatsApp serves 2 billion users exchanging 100 billion messages per day. Its architecture must deliver messages reliably in real time, support offline delivery, provide end-to-end encryption, and scale to that volume, all with minimal latency.
Key Concepts
XMPP (Extensible Messaging and Presence Protocol) — WhatsApp's original message transport protocol, extended and optimized for mobile constraints.
End-to-End Encryption (E2EE) — Messages are encrypted on the sender's device and can only be decrypted by the recipient. WhatsApp servers never see plaintext content (Signal Protocol).
Persistent WebSocket / Long Connection — Clients maintain a persistent connection to chat servers for low-latency message delivery.
Offline Message Queue — Messages sent to offline users are stored temporarily on the server until the recipient reconnects.
Core Components
- Chat Server — Manages client connections (WebSocket/XMPP). Routes messages to the correct recipient server.
- Message Queue — Stores messages for offline recipients. Cleared on delivery acknowledgment.
- Presence Service — Tracks which users are online and their "last seen" timestamps.
- Group Service — Manages group membership. For group messages, the server fans out to all members.
- Media Service — Handles upload/download of images, videos, documents. Media stored in object storage, URL sent in the message.
- Push Notification Service — Sends APNs (Apple) or FCM (Google) notifications when the app is backgrounded.
- Key Distribution Center — Manages public keys for E2EE key exchange.
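The Media Service bullet above (media in object storage, URL in the message) can be illustrated with a minimal sketch. `ObjectStore`, `MediaMessage`, `send_media`, the content-addressed key, and the `media.example.invalid` base URL are all hypothetical names chosen for illustration; the source does not describe WhatsApp's actual storage API.

```python
import hashlib
from dataclasses import dataclass


class ObjectStore:
    """Hypothetical in-memory stand-in for an object storage service."""

    def __init__(self, base_url: str = "https://media.example.invalid") -> None:
        self.base_url = base_url
        self._blobs: dict[str, bytes] = {}

    def upload(self, blob: bytes) -> str:
        # Assumption: content-addressed keys, so identical uploads share a URL.
        key = hashlib.sha256(blob).hexdigest()
        self._blobs[key] = blob
        return f"{self.base_url}/{key}"

    def download(self, url: str) -> bytes:
        return self._blobs[url.rsplit("/", 1)[1]]


@dataclass
class MediaMessage:
    sender: str
    recipient: str
    media_url: str   # only this reference travels through the chat path
    size_bytes: int


def send_media(store: ObjectStore, sender: str, recipient: str, blob: bytes) -> MediaMessage:
    """Upload the blob first, then build a small message that points at it."""
    url = store.upload(blob)
    return MediaMessage(sender, recipient, url, len(blob))


store = ObjectStore()
photo = b"\x89PNG" + bytes(1_000_000)          # about 1 MB of dummy data
msg = send_media(store, "alice", "bob", photo)
print(len(msg.media_url), "bytes of URL instead of", msg.size_bytes, "bytes of media")
```

The chat server only ever routes the short `MediaMessage`; the recipient fetches the blob from storage separately.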
Message Delivery Flow (1-to-1)
- Sender encrypts the message on-device with a key derived from the recipient's public keys (Signal Protocol)
- Sends encrypted message to chat server via persistent connection
- Server checks if recipient is online:
  - Online: delivers directly; recipient sends ACK
  - Offline: stores in message queue; sends push notification
- On reconnect, recipient fetches queued messages
- Recipient ACKs each message → server deletes from queue
- Sender receives receipts: single ✓ once the server has accepted the message, double ✓ once the recipient's device has ACKed it
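The 1-to-1 flow above can be sketched as a small in-memory model. This is an illustration under stated assumptions, not WhatsApp's implementation: `ChatServer`, `Envelope`, and the `pushes`, `wire`, and `receipts` lists are hypothetical, and the sketch keeps every message in a pending store until it is ACKed, including messages sent over a live connection.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Envelope:
    msg_id: str
    sender: str
    recipient: str
    ciphertext: bytes  # opaque to the server; encrypted on the sender's device


class ChatServer:
    def __init__(self) -> None:
        self.online: set[str] = set()
        # recipient -> {msg_id: Envelope}; entries live until ACKed
        self.pending: dict[str, dict[str, Envelope]] = defaultdict(dict)
        self.wire: list[tuple[str, str]] = []    # stand-in for live-connection sends
        self.pushes: list[str] = []              # stand-in for APNs/FCM calls
        self.receipts: list[tuple[str, str, str]] = []  # (sender, msg_id, status)

    def send(self, env: Envelope) -> None:
        self.pending[env.recipient][env.msg_id] = env
        self.receipts.append((env.sender, env.msg_id, "sent"))  # single check
        if env.recipient in self.online:
            self.wire.append((env.recipient, env.msg_id))       # deliver directly
        else:
            self.pushes.append(env.recipient)                   # wake the device

    def reconnect(self, user: str) -> list[Envelope]:
        """Mark the user online and hand back everything still un-ACKed."""
        self.online.add(user)
        return list(self.pending[user].values())

    def ack(self, user: str, msg_id: str) -> None:
        env = self.pending[user].pop(msg_id, None)  # delete only on ACK
        if env is not None:
            self.receipts.append((env.sender, msg_id, "delivered"))  # double check


server = ChatServer()
server.send(Envelope("m1", "alice", "bob", b"<ciphertext>"))  # bob is offline
queued = server.reconnect("bob")
for env in queued:
    server.ack("bob", env.msg_id)
print(server.receipts)
```

Because a message leaves `pending` only in `ack`, a crash between delivery and ACK leaves it available for the next `reconnect`.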
Group Message Fan-out
- Server stores one copy of the message
- Maintains a delivery list per group message
- Fans out to each member's connection or offline queue
- Sender sees the message marked as read only after ALL members have read it
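A minimal sketch of the fan-out steps above, assuming a per-message tracker object. `GroupDelivery` and `fan_out` are hypothetical names; the split into "live" and "queued" recipients mirrors the connection-or-offline-queue choice in the list.

```python
from dataclasses import dataclass, field


@dataclass
class GroupDelivery:
    """Delivery list for one group message (one stored copy, many recipients)."""
    msg_id: str
    sender: str
    recipients: frozenset[str]
    delivered: set[str] = field(default_factory=set)
    read: set[str] = field(default_factory=set)

    def mark_delivered(self, member: str) -> None:
        if member in self.recipients:
            self.delivered.add(member)

    def mark_read(self, member: str) -> None:
        if member in self.recipients:
            self.delivered.add(member)  # a read implies delivery
            self.read.add(member)

    @property
    def all_read(self) -> bool:
        # The sender's read receipt fires only when every recipient has read.
        return self.read == self.recipients


def fan_out(msg_id: str, sender: str, members: set[str], online: set[str]):
    """Return the tracker plus who gets a live send and who gets queued."""
    recipients = frozenset(members) - {sender}
    tracker = GroupDelivery(msg_id, sender, recipients)
    live = sorted(recipients & online)
    queued = sorted(recipients - online)
    return tracker, live, queued


members = {"alice", "bob", "carol", "dave"}
tracker, live, queued = fan_out("g1", "alice", members, online={"bob"})
tracker.mark_read("bob")
tracker.mark_read("carol")
partial = tracker.all_read      # dave has not read yet
tracker.mark_read("dave")
print(live, queued, partial, tracker.all_read)
```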
Scale Characteristics
- 100 billion messages/day (~1.16M msgs/sec average)
- Erlang-based chat servers handle millions of concurrent connections per node
- Servers keep no durable state of their own; the load balancer uses session affinity to keep each client on the same node
- Mnesia (Erlang distributed DB) used for session and routing state
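The average rate quoted above follows directly from the daily volume. A short check, using only the figures given in these notes (100 billion messages per day, 2 billion users):

```python
MESSAGES_PER_DAY = 100_000_000_000
USERS = 2_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

avg_per_sec = MESSAGES_PER_DAY / SECONDS_PER_DAY
per_user_per_day = MESSAGES_PER_DAY / USERS

print(f"{avg_per_sec:,.0f} msgs/sec on average")       # 1,157,407
print(f"{per_user_per_day:.0f} msgs per user per day")  # 50
```

This is an average only; the notes give no peak figure, so real capacity planning would need a peak-to-average factor that is not stated here.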
Key Design Decisions
| Decision | Reasoning |
|---|---|
| Erlang / BEAM VM | Built for massive concurrency and fault tolerance |
| E2EE at client | Server never holds plaintext; privacy by design |
| ACK-based queue deletion | Guarantees at-least-once delivery |
| Media separate from messages | Large files don't block message throughput |
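The "ACK-based queue deletion" decision implies that a message can arrive more than once if an ACK is lost. The sketch below illustrates that consequence. `Recipient`, `deliver_with_retry`, and the `ack_lost_on_attempts` knob are hypothetical, and client-side deduplication by message ID is an assumption about how duplicates could be handled, not something the source states.

```python
class Recipient:
    def __init__(self) -> None:
        self.seen: set[str] = set()
        self.inbox: list[bytes] = []

    def receive(self, msg_id: str, body: bytes) -> str:
        # Always ACK, but surface each message ID to the user only once.
        if msg_id not in self.seen:
            self.seen.add(msg_id)
            self.inbox.append(body)
        return msg_id  # the ACK


def deliver_with_retry(queue: dict[str, bytes], recipient: Recipient,
                       ack_lost_on_attempts: set[int], max_attempts: int = 5) -> int:
    """Resend queued messages until each is ACKed; return attempts used."""
    attempts = 0
    while queue and attempts < max_attempts:
        attempts += 1
        for msg_id, body in list(queue.items()):
            ack = recipient.receive(msg_id, body)
            if attempts in ack_lost_on_attempts:
                continue            # simulated lost ACK: message stays queued
            queue.pop(ack, None)    # server deletes only after seeing the ACK
    return attempts


queue = {"m1": b"hello"}
bob = Recipient()
attempts = deliver_with_retry(queue, bob, ack_lost_on_attempts={1})
print(attempts, bob.inbox, queue)
```

The message is transmitted twice but appears once in `bob.inbox`, which is the usual way at-least-once delivery is made to look like exactly-once to the user.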