
Overview

Source: The System Design Newsletter — Neo Kim
Reddit is a massive community platform hosting millions of subreddits and handling billions of votes, comments, and posts. Its feed system must rank content by score in real time while supporting high concurrency and global read traffic.

Key Concepts

Karma / Vote Score — Posts and comments have an upvote/downvote score. The score (with time decay) determines ranking in hot/top/new feeds.
Hot Ranking Algorithm — Combines the logarithm of the vote score with post age. Because of the logarithm, early votes carry more weight than later ones, and the age term gives new content a "freshness" bias.
Subreddit — A topic-based community with its own moderators, rules, and content feed.
Feed Fanout — When a post is created, it must appear in multiple feeds (user home feed, subreddit feed, search index).
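The source describes the hot formula only in words. A minimal sketch, assuming the version from Reddit's historically open-sourced codebase: the constants 1134028003 and 45000 come from that code, not from the source, and the formula used in production today may differ. The function name `hot_score` is illustrative.

```python
from datetime import datetime, timedelta, timezone
from math import log10

# Assumption: both constants come from Reddit's old open-source code.
REDDIT_EPOCH_OFFSET = 1134028003  # seconds; shifts timestamps to a smaller base
DECAY_SECONDS = 45000             # 12.5 hours of age equals one order of magnitude of votes


def hot_score(ups: int, downs: int, created: datetime) -> float:
    """Log-scaled net votes plus a term that grows with creation time."""
    net = ups - downs
    order = log10(max(abs(net), 1))        # 1 -> 10 votes counts as much as 10 -> 100
    sign = (net > 0) - (net < 0)           # -1, 0, or 1
    seconds = created.timestamp() - REDDIT_EPOCH_OFFSET
    return round(sign * order + seconds / DECAY_SECONDS, 7)


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
older_popular = hot_score(100, 0, t0)
newer_modest = hot_score(10, 0, t0 + timedelta(seconds=DECAY_SECONDS))
# The two scores agree to within rounding: 12.5 hours of freshness
# offsets a tenfold difference in net votes.
```

Note that the score never changes as a post ages. Newer posts simply start from a higher baseline, so the value can be stored once per vote and does not need to be recomputed on a timer.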

Core Components

  • Post Service — CRUD for posts and comments. Writes to PostgreSQL and queues downstream events.
  • Vote Service — Records upvotes/downvotes atomically. Updates post score in Redis.
  • Feed Service — Builds personalized home feeds from subscribed subreddits. Two models: push (fanout on write) and pull (fanout on read).
  • Ranking Service — Applies hot/top/new/controversial ranking algorithms to subreddit post lists.
  • Search Service — Elasticsearch-backed full-text search over posts, comments, subreddits, and users.
  • Media Service — Handles image/video uploads. Transcodes video for streaming. CDN-backed delivery.
  • Moderation Service — Automated spam detection + human moderator tooling (mod queue, reports).
  • Notification Service — Reply notifications, mentions, subreddit activity digests.
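To make the Ranking Service's sort modes concrete, here is a rough sketch of top, new, and controversial over an in-memory post list. The `Post` type and `rank` function are hypothetical names. The controversy formula follows Reddit's historically open-sourced code and is an assumption here, since the source does not give it. Hot is left out because it needs a time-decay formula.

```python
from dataclasses import dataclass


@dataclass
class Post:
    post_id: str
    ups: int
    downs: int
    created_ts: float  # Unix seconds


def controversy(post: Post) -> float:
    """High when a post has many votes that are split close to evenly."""
    if post.ups <= 0 or post.downs <= 0:
        return 0.0
    magnitude = post.ups + post.downs
    balance = min(post.ups, post.downs) / max(post.ups, post.downs)
    return magnitude ** balance


# Each mode is just a different sort key over the same posts.
SORT_KEYS = {
    "top": lambda p: p.ups - p.downs,
    "new": lambda p: p.created_ts,
    "controversial": controversy,
}


def rank(posts: list[Post], mode: str) -> list[str]:
    """Return post IDs ordered best-first for the given mode."""
    return [p.post_id for p in sorted(posts, key=SORT_KEYS[mode], reverse=True)]


posts = [
    Post("a", ups=100, downs=2, created_ts=1.0),
    Post("b", ups=50, downs=48, created_ts=2.0),
    Post("c", ups=5, downs=0, created_ts=3.0),
]
```

With this sample data, post "b" has a low net score but ranks first under "controversial" because its votes are almost evenly split.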

Feed Generation Strategies

Strategy | Approach | Best For
Fanout on Write (Push) | Pre-compute feed on post creation | Users with low follower count, active users
Fanout on Read (Pull) | Compute feed at request time | High-follower accounts (celebrities), inactive users
Hybrid | Push for most users, pull for power accounts | Reddit's actual approach
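The hybrid strategy can be sketched with an in-memory model. This is an illustration under stated assumptions, not Reddit's implementation: the class `HybridFeed`, the `push_limit` threshold, and the use of subreddit subscriber count as the push/pull switch are all hypothetical.

```python
from collections import defaultdict


class HybridFeed:
    """Push posts from small communities, pull posts from large ones."""

    def __init__(self, push_limit: int = 1000) -> None:
        self.push_limit = push_limit            # hypothetical cutoff
        self.subscribers = defaultdict(set)     # subreddit -> user IDs
        self.subscriptions = defaultdict(set)   # user ID -> subreddits
        self.posts = defaultdict(list)          # subreddit -> [(score, post_id)]
        self.inbox = defaultdict(list)          # user ID -> pushed [(score, post_id)]

    def subscribe(self, user: str, subreddit: str) -> None:
        self.subscribers[subreddit].add(user)
        self.subscriptions[user].add(subreddit)

    def _is_large(self, subreddit: str) -> bool:
        return len(self.subscribers[subreddit]) > self.push_limit

    def publish(self, subreddit: str, post_id: str, score: float) -> None:
        self.posts[subreddit].append((score, post_id))
        if not self._is_large(subreddit):
            # Fanout on write: copy the entry into every subscriber's inbox.
            for user in self.subscribers[subreddit]:
                self.inbox[user].append((score, post_id))

    def home_feed(self, user: str, limit: int = 25) -> list[str]:
        # A set removes duplicates if a community grew past the cutoff
        # after some of its posts had already been pushed.
        entries = set(self.inbox[user])
        for subreddit in self.subscriptions[user]:
            if self._is_large(subreddit):
                # Fanout on read: merge large communities at request time.
                entries.update(self.posts[subreddit])
        return [post_id for _, post_id in sorted(entries, reverse=True)[:limit]]


feed = HybridFeed(push_limit=2)
for u in ("u1", "u2"):
    feed.subscribe(u, "small")
for u in ("u1", "u2", "u3"):
    feed.subscribe(u, "big")
feed.publish("small", "s1", 5.0)
feed.publish("big", "b1", 9.0)
```

The write cost of `publish` is bounded by `push_limit`, while the read cost of `home_feed` grows only with the number of large communities a user follows.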

Vote Flow

  1. User votes → Vote Service receives request
  2. Vote stored in PostgreSQL (durable record)
  3. Post score updated in Redis (sorted set, keyed by subreddit) — fast O(log N) update
  4. Ranking job re-scores post in hot feed sorted set
  5. Cache invalidated for affected feed pages
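The steps above can be sketched with in-memory stand-ins for each store. This is a simplified illustration, not the production design: the class `VoteFlow` and its method names are hypothetical, plain dicts replace PostgreSQL and Redis, and the hot re-scoring step is reduced to re-sorting by raw score. In Redis, the score update would correspond to `ZINCRBY` on a per-subreddit sorted set.

```python
from collections import defaultdict


class VoteFlow:
    """In-memory stand-ins for the stores named in the vote flow."""

    def __init__(self) -> None:
        self.votes: dict[tuple[str, str], int] = {}   # stand-in for the PostgreSQL vote table
        self.scores = defaultdict(dict)               # stand-in for one sorted set per subreddit
        self.feed_cache: dict[str, list[str]] = {}    # cached feed page per subreddit

    def vote(self, user_id: str, subreddit: str, post_id: str, direction: int) -> int:
        if direction not in (-1, 0, 1):
            raise ValueError("direction must be -1, 0, or 1")
        # Durable record: one row per (user, post), so a re-vote replaces the old one.
        previous = self.votes.get((user_id, post_id), 0)
        self.votes[(user_id, post_id)] = direction
        # Score update: apply only the delta so changing a vote is not double counted.
        board = self.scores[subreddit]
        board[post_id] = board.get(post_id, 0) + (direction - previous)
        # Cache invalidation: drop the cached page for the affected feed.
        self.feed_cache.pop(subreddit, None)
        return board[post_id]

    def top_posts(self, subreddit: str, limit: int = 25) -> list[str]:
        if subreddit not in self.feed_cache:
            board = self.scores[subreddit]
            self.feed_cache[subreddit] = sorted(board, key=board.get, reverse=True)
        return self.feed_cache[subreddit][:limit]


flow = VoteFlow()
flow.vote("u1", "r/python", "p1", 1)
flow.vote("u2", "r/python", "p1", 1)
flow.vote("u1", "r/python", "p2", 1)
```

Writing the durable record first means the fast store can always be rebuilt from it if the cached scores are lost.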

Scale Characteristics

  • Billions of votes and comments processed daily
  • Redis sorted sets used for real-time hot feed ranking
  • Posts CDN-cached with short TTLs; comments fetched dynamically
  • PostgreSQL sharded by subreddit ID
  • Elasticsearch cluster for cross-subreddit search
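As an illustration of sharding by subreddit ID, the sketch below maps an ID to a shard with a stable hash. The function `shard_for`, the shard count, and the ID format are assumptions for the example. The source does not say how Reddit assigns shards.

```python
import hashlib


def shard_for(subreddit_id: str, shard_count: int) -> int:
    """Map a subreddit ID to a shard index in [0, shard_count)."""
    # A cryptographic hash is used only for its stable, even distribution;
    # Python's built-in hash() is randomized per process for strings.
    digest = hashlib.sha256(subreddit_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical ID; every row for this subreddit lands on the same shard.
shard = shard_for("t5_example", 16)
```

Keeping a subreddit's rows together lets a subreddit feed query hit a single shard. The cost is that a very active subreddit can turn its shard into a hot spot, and plain modulo routing moves most keys whenever the shard count changes.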

Key Trade-offs

Decision | Reasoning
Redis sorted sets for ranking | O(log N) updates; sub-millisecond range queries
Hybrid fanout | Pure push doesn't scale for viral posts hitting millions of feeds
Logarithmic hot score | Log-damped vote counts combined with post age keep old high-vote posts from dominating the feed forever
PostgreSQL over NoSQL | Complex queries (threaded comments, moderation) need relational model
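To show why threaded comments suit a relational model, here is a small sketch that stores replies with a `parent_id` and reads a whole thread with one recursive query. SQLite stands in for PostgreSQL so the example runs on its own. The schema and column names are hypothetical. PostgreSQL also supports `WITH RECURSIVE`, though the padding expression would be written differently there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE comments (
    id        INTEGER PRIMARY KEY,
    post_id   INTEGER NOT NULL,
    parent_id INTEGER REFERENCES comments(id),  -- NULL for top-level comments
    body      TEXT NOT NULL
);
INSERT INTO comments (id, post_id, parent_id, body) VALUES
    (1, 10, NULL, 'top-level'),
    (2, 10, 1,    'reply to 1'),
    (3, 10, 2,    'reply to 2'),
    (4, 10, NULL, 'second top-level');
""")

# Walk the tree from top-level comments down, building a sortable path
# of zero-padded IDs so each reply sorts directly under its parent.
THREAD_QUERY = """
WITH RECURSIVE thread(id, body, depth, path) AS (
    SELECT id, body, 0, printf('%08d', id)
    FROM comments
    WHERE post_id = ? AND parent_id IS NULL
    UNION ALL
    SELECT c.id, c.body, t.depth + 1, t.path || '/' || printf('%08d', c.id)
    FROM comments AS c
    JOIN thread AS t ON c.parent_id = t.id
)
SELECT id, depth, body FROM thread ORDER BY path;
"""

rows = conn.execute(THREAD_QUERY, (10,)).fetchall()
for comment_id, depth, body in rows:
    print("  " * depth + f"[{comment_id}] {body}")
```

A single query returns the thread in display order along with each comment's depth. A key-value store would typically need either multiple round trips or a denormalized copy of the thread.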