Overview
Source: The System Design Newsletter — Neo Kim
Reddit is a massive community platform hosting millions of subreddits and handling billions of votes, comments, and posts. Its feed system must rank content by score in real time while supporting high concurrency and global read traffic.
Key Concepts
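Karma / Vote Score — Each post and comment has a net score (upvotes minus downvotes). The raw score orders the top feed, the score combined with post age orders the hot feed, and the new feed is ordered by creation time alone.
Hot Ranking Algorithm — Combines the logarithm of the net vote score with the post's age. Because of the logarithm, early votes carry more weight (on a base-10 scale, the first 10 votes count as much as the next 90), and the age term gives newer content a "freshness" bias.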
Subreddit — A topic-based community with its own moderators, rules, and content feed.
Feed Fanout — When a post is created, it must be propagated to several destinations: subscribers' home feeds, the subreddit feed, and the search index.
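The hot ranking described above can be sketched as follows. This is a minimal sketch modeled on the widely cited version of the formula from Reddit's formerly open-source codebase; the names `hot_score`, `EPOCH_OFFSET`, and `DECAY_SECONDS` and the constant values are assumptions for illustration, and the production formula may differ.

```python
from datetime import datetime, timezone
from math import log10

# Assumed constants, taken from the widely cited open-source version of the formula.
EPOCH_OFFSET = 1134028003  # reference timestamp in seconds
DECAY_SECONDS = 45000      # 12.5 hours of age offsets one order of magnitude in votes


def hot_score(ups: int, downs: int, created: datetime) -> float:
    """Combine a log-scaled net vote score with the post's creation time."""
    net = ups - downs
    # Log scaling: going from 1 to 10 net votes adds as much as going from 10 to 100.
    order = log10(max(abs(net), 1))
    # +1 for net-positive posts, -1 for net-negative posts, 0 for a tie.
    sign = (net > 0) - (net < 0)
    # Newer posts get a larger time term, so they start ahead of older posts.
    seconds = created.timestamp() - EPOCH_OFFSET
    return round(sign * order + seconds / DECAY_SECONDS, 7)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(hot_score(ups=120, downs=20, created=now))
```

Under these assumed constants, a post that is 12.5 hours newer needs ten times fewer net votes to tie with an older one, which is how the age term keeps the feed fresh.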
Core Components
- Post Service — CRUD for posts and comments. Writes to PostgreSQL and queues downstream events.
- Vote Service — Records upvotes/downvotes atomically. Updates post score in Redis.
- Feed Service — Builds personalized home feeds from subscribed subreddits. Two models: push (fanout on write) and pull (fanout on read).
- Ranking Service — Applies hot/top/new/controversial ranking algorithms to subreddit post lists.
- Search Service — Elasticsearch-backed full-text search over posts, comments, subreddits, and users.
- Media Service — Handles image/video uploads. Transcodes video for streaming. CDN-backed delivery.
- Moderation Service — Automated spam detection + human moderator tooling (mod queue, reports).
- Notification Service — Reply notifications, mentions, subreddit activity digests.
Feed Generation Strategies
Strategy | Approach | Best For
Fanout on Write (Push) | Pre-compute feed on post creation | Users with low follower count, active users |
Fanout on Read (Pull) | Compute feed at request time | High-follower accounts (celebrities), inactive users |
Hybrid | Push for most users, pull for power accounts | Reddit's actual approach |
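The hybrid strategy in the table can be sketched with in-memory data structures. This is a simplified illustration, not Reddit's implementation: the class `HybridFeed`, its method names, and the `push_threshold` cutoff are all hypothetical, and it assumes the push/pull decision is made per subreddit based on subscriber count.

```python
from collections import defaultdict


class HybridFeed:
    """Push posts from small subreddits, pull posts from large ones at read time."""

    def __init__(self, push_threshold: int = 1000):
        self.push_threshold = push_threshold    # hypothetical cutoff
        self.subscribers = defaultdict(set)     # subreddit -> user IDs
        self.subscriptions = defaultdict(set)   # user ID -> subreddits
        self.posts = defaultdict(list)          # subreddit -> [(timestamp, post_id)]
        self.inbox = defaultdict(list)          # user ID -> precomputed [(timestamp, post_id)]

    def subscribe(self, user: str, subreddit: str) -> None:
        self.subscribers[subreddit].add(user)
        self.subscriptions[user].add(subreddit)

    def uses_push(self, subreddit: str) -> bool:
        return len(self.subscribers[subreddit]) <= self.push_threshold

    def publish(self, subreddit: str, post_id: str, timestamp: float) -> None:
        self.posts[subreddit].append((timestamp, post_id))
        if self.uses_push(subreddit):
            # Fanout on write: copy the post into every subscriber's inbox.
            for user in self.subscribers[subreddit]:
                self.inbox[user].append((timestamp, post_id))

    def home_feed(self, user: str, limit: int = 10) -> list:
        # Start from the precomputed inbox; a set removes duplicates if a
        # subreddit crossed the threshold after some posts were already pushed.
        entries = set(self.inbox[user])
        for subreddit in self.subscriptions[user]:
            if not self.uses_push(subreddit):
                # Fanout on read: merge in posts from large subreddits.
                entries.update(self.posts[subreddit])
        newest_first = sorted(entries, reverse=True)
        return [post_id for _, post_id in newest_first[:limit]]
```

The sketch orders the feed by timestamp only and ignores backfill for users who subscribe after a post was pushed; a real system would also apply ranking and pagination.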
Vote Flow
- User votes → Vote Service receives request
- Vote stored in PostgreSQL (durable record)
- Post score updated in Redis (sorted set, keyed by subreddit) — fast O(log N) update
- Ranking job re-scores post in hot feed sorted set
- Cache invalidated for affected feed pages
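The five steps above can be sketched end to end with in-memory stand-ins. Everything here is an assumption for illustration: the class `VoteFlow` and its fields are hypothetical, plain dictionaries replace PostgreSQL and the Redis sorted set (where commands such as ZADD and ZREVRANGE would do the equivalent work), and the hot score uses an assumed log-plus-age formula.

```python
from math import log10

DECAY_SECONDS = 45000  # assumed age constant for the hot score


class VoteFlow:
    def __init__(self):
        self.votes = {}       # (user_id, post_id) -> +1 or -1; stand-in for the durable vote table
        self.posts = {}       # post_id -> {"subreddit", "created", "net"}
        self.hot = {}         # subreddit -> {post_id: hot score}; stand-in for a sorted set
        self.page_cache = {}  # subreddit -> cached ranked post IDs

    @staticmethod
    def _hot(net: int, created: float) -> float:
        sign = (net > 0) - (net < 0)
        return sign * log10(max(abs(net), 1)) + created / DECAY_SECONDS

    def add_post(self, post_id: str, subreddit: str, created: float) -> None:
        self.posts[post_id] = {"subreddit": subreddit, "created": created, "net": 0}
        self.hot.setdefault(subreddit, {})[post_id] = self._hot(0, created)
        self.page_cache.pop(subreddit, None)

    def vote(self, user_id: str, post_id: str, direction: int) -> None:
        if direction not in (1, -1):
            raise ValueError("direction must be +1 or -1")
        # Steps 1-2: receive the vote and store the durable record.
        previous = self.votes.get((user_id, post_id), 0)
        self.votes[(user_id, post_id)] = direction
        # Step 3: update the net score, undoing any earlier vote by this user.
        post = self.posts[post_id]
        post["net"] += direction - previous
        # Step 4: re-score the post in its subreddit's hot ranking.
        subreddit = post["subreddit"]
        self.hot[subreddit][post_id] = self._hot(post["net"], post["created"])
        # Step 5: invalidate the cached feed page.
        self.page_cache.pop(subreddit, None)

    def hot_page(self, subreddit: str, limit: int = 25) -> list:
        if subreddit not in self.page_cache:
            ranked = sorted(self.hot.get(subreddit, {}).items(),
                            key=lambda item: item[1], reverse=True)
            self.page_cache[subreddit] = [post_id for post_id, _ in ranked]
        return self.page_cache[subreddit][:limit]
```

Keying the vote record by user and post makes a repeated vote idempotent and lets a changed vote adjust the score by the difference rather than double counting.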
Scale Characteristics
- Billions of votes and comments processed daily
- Redis sorted sets used for real-time hot feed ranking
- Posts CDN-cached with short TTLs; comments fetched dynamically
- PostgreSQL sharded by subreddit ID
- Elasticsearch cluster for cross-subreddit search
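Sharding by subreddit ID, as listed above, could be routed with a stable hash. This is a minimal sketch under stated assumptions: the function `shard_for`, the shard count, and the example ID are hypothetical, and the source does not say which routing scheme is used.

```python
import hashlib

NUM_SHARDS = 16  # hypothetical shard count


def shard_for(subreddit_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a subreddit ID to a shard index that is stable across processes."""
    # Python's built-in hash() is salted per process for strings, so a
    # cryptographic digest is used to get the same shard on every host.
    digest = hashlib.sha256(subreddit_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


if __name__ == "__main__":
    print(shard_for("t5_example"))  # hypothetical subreddit ID
```

Keeping all of a subreddit's posts on one shard lets a subreddit feed query hit a single database, at the cost that a very large subreddit can become a hot shard; a plain modulo scheme also forces data movement whenever the shard count changes.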
Key Trade-offs
Decision | Reasoning
Redis sorted sets for ranking | O(log N) score updates; reading the top M posts of a feed costs O(log N + M)
Hybrid fanout | Pure push doesn't scale for viral posts hitting millions of feeds |
Logarithmic hot score | Dampens the effect of very large vote counts, so the age term lets newer posts overtake old high-vote posts instead of letting them dominate the feed forever
PostgreSQL over NoSQL | Complex queries (threaded comments, moderation) need relational model |