
Overview

Source: The System Design Newsletter — Neo Kim
Twitter's timeline is a classic fanout problem: when someone with 10M followers posts a tweet, how do you get it into 10M people's feeds efficiently? Twitter's architecture evolved from pure fanout-on-write to a sophisticated hybrid model.

Key Concepts

Fanout on Write (Push Model) — When a tweet is posted, it is immediately written into the timeline cache of all followers. Fast reads, slow writes. Expensive for users with millions of followers.
Fanout on Read (Pull Model) — When a user opens their feed, tweets from all followed accounts are fetched and merged on demand. Fast writes, slow reads.
Hybrid Model — Twitter's actual approach: push for most users, pull for celebrities (high follower count). Merges pre-computed timelines with real-time celebrity tweets at read time.
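
To make the push/pull contrast concrete, here is a minimal in-memory sketch. The dictionaries and the function names (`follow`, `post`, `read_push`, `read_pull`) are illustrative assumptions, not Twitter's actual data structures. It shows only where the work lands in each model: at write time for push, at read time for pull.

```python
from collections import defaultdict

followers = defaultdict(set)          # author -> users who follow them
following = defaultdict(set)          # user -> authors they follow
tweets_by_author = defaultdict(list)  # author -> [(timestamp, tweet_id)]
push_timelines = defaultdict(list)    # user -> precomputed [(timestamp, tweet_id)]


def follow(user, author):
    followers[author].add(user)
    following[user].add(author)


def post(author, timestamp, tweet_id):
    tweets_by_author[author].append((timestamp, tweet_id))
    # Push model: one write per follower, so cost grows with follower count.
    for user in followers[author]:
        push_timelines[user].append((timestamp, tweet_id))


def read_push(user):
    # Push model read: the timeline is already materialized.
    return [tid for _, tid in sorted(push_timelines[user], reverse=True)]


def read_pull(user):
    # Pull model read: gather from every followed author, then merge.
    merged = []
    for author in following[user]:
        merged.extend(tweets_by_author[author])
    return [tid for _, tid in sorted(merged, reverse=True)]


follow("alice", "bob")
follow("alice", "carol")
post("bob", 1, "t1")
post("carol", 2, "t2")
print(read_push("alice"))  # ['t2', 't1']
print(read_pull("alice"))  # ['t2', 't1']
```

Both reads return the same timeline. The difference is that `post` does work proportional to the author's follower count, while `read_pull` does work proportional to the number of accounts the reader follows.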

Core Components

  • Tweet Service — Creates and stores tweets. Publishes tweet events to a Fanout Queue.
  • Fanout Service — Consumes tweet events. For each tweet, looks up the follower list and writes the tweet ID into each follower's timeline cache.
  • Timeline Cache (Redis) — Stores each user's home timeline as a list of tweet IDs (not full tweet content). Capped at ~800 tweets per user.
  • Tweet Store (Manhattan DB) — Persistent storage for tweet content, metadata, and media references.
  • Celebrity Detection — Users with >X followers are classified as celebrities. Their tweets are NOT fanned out; instead injected at read time.
  • Media Service — Image/video CDN delivery.
  • Search Index — Real-time tweet indexing for Twitter search and trending topics.
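
The capped list of tweet IDs in the Timeline Cache can be sketched with an in-memory stand-in. The `TimelineCache` class below is hypothetical and uses a bounded `deque` in place of Redis. On a real Redis list, a similar effect is commonly achieved with `LPUSH` followed by `LTRIM`, though the source does not say which commands Twitter uses.

```python
from collections import deque

TIMELINE_CAP = 800  # per-user cap quoted in the text


class TimelineCache:
    """In-memory stand-in for a per-user list of tweet IDs (assumption, not Redis)."""

    def __init__(self, cap=TIMELINE_CAP):
        self.cap = cap
        self._lists = {}

    def push(self, user_id, tweet_id):
        timeline = self._lists.setdefault(user_id, deque(maxlen=self.cap))
        # Newest first; once the deque is full, the oldest ID drops off the other end.
        timeline.appendleft(tweet_id)

    def read(self, user_id, count=20):
        return list(self._lists.get(user_id, ()))[:count]


cache = TimelineCache(cap=3)
for tweet_id in [1, 2, 3, 4, 5]:
    cache.push("alice", tweet_id)
print(cache.read("alice"))  # [5, 4, 3]
```

Storing only IDs keeps each entry small; the full tweet is looked up in the Tweet Store when the timeline is read.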

Home Timeline Flow

Tweet Creation:
  1. User posts tweet → Tweet Service persists it
  2. Tweet ID published to Fanout Queue (Kafka)
  3. Fanout Service reads follower list from Social Graph DB
  4. For non-celebrity authors: writes tweet ID to each follower's Redis timeline list
  5. For celebrity authors: tweet ID stored separately; not fanned out
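
The tweet creation steps above can be sketched as follows. Everything here is an in-memory stand-in: a `deque` plays the role of the Kafka queue, plain dictionaries replace the Social Graph DB, Redis, and the Tweet Store, and `CELEBRITY_THRESHOLD` is a made-up value because the source only says ">X followers".

```python
from collections import defaultdict, deque

CELEBRITY_THRESHOLD = 1_000_000  # hypothetical value; the source leaves X unspecified

social_graph = defaultdict(set)       # author -> follower ids (stand-in for Social Graph DB)
timelines = defaultdict(list)         # follower -> tweet ids, newest first (stand-in for Redis)
celebrity_tweets = defaultdict(list)  # celebrity -> tweet ids, newest first
tweet_store = {}                      # tweet id -> content (stand-in for Tweet Store)
fanout_queue = deque()                # stand-in for the Kafka fanout queue


def post_tweet(author, tweet_id, text):
    tweet_store[tweet_id] = {"author": author, "text": text}  # step 1: persist
    fanout_queue.append((author, tweet_id))                   # step 2: publish the ID


def run_fanout(threshold=CELEBRITY_THRESHOLD):
    """Drain the queue and return the number of timeline writes performed."""
    writes = 0
    while fanout_queue:
        author, tweet_id = fanout_queue.popleft()
        followers = social_graph[author]                      # step 3: follower lookup
        if len(followers) > threshold:
            celebrity_tweets[author].insert(0, tweet_id)      # step 5: no fanout
        else:
            for follower in followers:                        # step 4: one write per follower
                timelines[follower].insert(0, tweet_id)
                writes += 1
    return writes


social_graph["star"] = {"a", "b", "c"}
social_graph["bob"] = {"a"}
post_tweet("star", "t1", "hello from a celebrity")
post_tweet("bob", "t2", "hello from bob")
writes = run_fanout(threshold=2)  # tiny threshold so "star" counts as a celebrity
print(writes, dict(timelines), dict(celebrity_tweets))
```

With the threshold lowered to 2 for the demo, only bob's tweet is fanned out (one write), while the celebrity's tweet waits in `celebrity_tweets` to be merged at read time.
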
Timeline Read:
  1. User opens app → Timeline Service reads their Redis timeline list (tweet IDs)
  2. For each user they follow who is a celebrity: fetch latest N tweets at read time
  3. Merge celebrity tweets into cached timeline
  4. Hydrate tweet IDs with full tweet content from Tweet Store
  5. Apply algorithmic ranking (reverse-chron or ML-based)
  6. Return top tweets to client
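
A rough sketch of the read path, under two assumptions that are not from the source: tweet IDs are integers that increase with creation time, so sorting by ID gives reverse-chronological order, and the ranking step is plain reverse-chron rather than an ML model. The function name and its parameters are hypothetical.

```python
def read_home_timeline(cached_ids, celebrity_ids_by_author, tweet_store,
                       n_per_celebrity=10, limit=20):
    """Merge cached IDs with recent celebrity tweets, hydrate, rank, truncate."""
    # Steps 1-3: start from the cached timeline, then merge in the latest N
    # tweet IDs from each followed celebrity (each list is newest first).
    merged = set(cached_ids)
    for ids in celebrity_ids_by_author.values():
        merged.update(ids[:n_per_celebrity])

    # Step 4: hydrate IDs into full tweets; IDs missing from the store
    # (for example, deleted tweets) are skipped.
    hydrated = [{"id": tid, **tweet_store[tid]} for tid in merged if tid in tweet_store]

    # Step 5: rank. Reverse-chron here, assuming IDs are time-ordered.
    ranked = sorted(hydrated, key=lambda tweet: tweet["id"], reverse=True)

    # Step 6: return the top tweets.
    return ranked[:limit]


tweet_store = {1: {"text": "one"}, 2: {"text": "two"},
               3: {"text": "three"}, 4: {"text": "four"}}
timeline = read_home_timeline(
    cached_ids=[5, 3, 1],                      # 5 is not in the store
    celebrity_ids_by_author={"star": [4, 2]},
    tweet_store=tweet_store,
    limit=3,
)
print([tweet["id"] for tweet in timeline])  # [4, 3, 2]
```

An ML-based ranker would replace the `sorted` call with a scoring function; the merge and hydrate steps would stay the same.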

Scale Characteristics

  • 500M tweets per day
  • Fanout for a tweet from an account with 100K followers = 100K Redis writes, completed within seconds
  • Timeline cache holds ~800 tweet IDs per user (~40 bytes each = ~32KB per user)
  • Redis cluster storing timelines for ~300M users
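
The figures above can be combined into a quick back-of-envelope check. The inputs come straight from the list. The derived totals are simple arithmetic and ignore replication, Redis per-key overhead, and traffic peaks, so treat them as lower bounds rather than reported numbers.

```python
TWEETS_PER_DAY = 500_000_000
IDS_PER_TIMELINE = 800
BYTES_PER_ENTRY = 40
USERS = 300_000_000
SECONDS_PER_DAY = 86_400

tweets_per_second = TWEETS_PER_DAY / SECONDS_PER_DAY   # average, not peak
bytes_per_user = IDS_PER_TIMELINE * BYTES_PER_ENTRY    # one user's timeline
total_bytes = bytes_per_user * USERS                   # all timelines, single copy

print(f"{tweets_per_second:,.0f} tweets/s on average")      # 5,787 tweets/s on average
print(f"{bytes_per_user / 1000:.0f} KB per user")           # 32 KB per user
print(f"{total_bytes / 1e12:.1f} TB across all timelines")  # 9.6 TB across all timelines
```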

Key Trade-offs

| Decision | Reasoning |
|---|---|
| Hybrid fanout | Pure push fails for Katy Perry (100M+ followers) |
| Store tweet IDs (not content) | Content is shared; storing full tweets would 100x storage |
| Redis for timelines | Sub-millisecond list operations; timeline reads are hot path |
| Algorithmic ranking at read | Engagement prediction needs real-time signals not available at write time |