
Overview

Source: The System Design Newsletter — Neo Kim
YouTube ingests over 500 hours of uploaded video every minute and serves billions of daily views. It is one of the most demanding video streaming architectures ever built, combining massive storage, adaptive transcoding, and global distribution.

Key Concepts

Transcoding / Encoding — Raw uploaded video is converted into multiple resolutions and formats (H.264, VP9, AV1) so any device can play it. A 1-hour video might produce 10+ variants.
Adaptive Bitrate Streaming (ABR) — The player automatically switches video quality based on available bandwidth. Protocols: HLS (Apple) and MPEG-DASH.
CDN (Content Delivery Network) — Video chunks are cached at edge nodes close to viewers worldwide, reducing latency and origin load.
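
To make the fan-out concrete, the sketch below enumerates the variants a single upload might produce. The resolution ladder and codec list are illustrative assumptions, not YouTube's actual configuration:

```python
# Sketch: one upload fans out into (resolution x codec) variants.
# The ladder and codec set here are assumed for illustration.
from itertools import product

RESOLUTIONS = ["2160p", "1440p", "1080p", "720p", "480p", "360p", "240p"]
CODECS = ["h264", "vp9", "av1"]

def transcode_variants(video_id: str) -> list[str]:
    """Return the output variants a single upload fans out into."""
    return [f"{video_id}_{res}_{codec}" for res, codec in product(RESOLUTIONS, CODECS)]

variants = transcode_variants("abc123")
print(len(variants))  # 7 resolutions x 3 codecs = 21 variants
```

Even this modest ladder yields 21 outputs per video, which is why the pipeline parallelizes aggressively.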

Core Components

  • Upload Service — Accepts raw video from creators. Stores the original file in object storage (e.g., GCS/S3).
  • Transcoding Pipeline — Distributed workers transcode video in parallel. DAG-based job scheduling allows splitting a single video into chunks transcoded concurrently.
  • Metadata Service — Stores video titles, descriptions, thumbnails, view counts, and recommendations.
  • CDN — Serves video segments from the nearest edge node to the viewer.
  • Recommendation Engine — ML-driven system that determines which videos appear on the homepage and sidebar.
  • Search Service — Indexes video metadata for full-text search.

Video Upload & Processing Flow

  1. Creator uploads raw video file
  2. File stored in blob storage, upload event published to message queue
  3. Transcoding workers consume event, split video into chunks
  4. Each chunk transcoded in parallel into multiple formats/resolutions
  5. Transcoded chunks stored in CDN-backed object storage
  6. Metadata updated: video marked as "available"
  7. CDN pre-warms popular content at edge nodes

Video Streaming Flow

  1. Viewer requests video
  2. Player fetches manifest file (list of available quality segments)
  3. Player fetches video segments from nearest CDN edge
  4. ABR algorithm adjusts quality based on network conditions
  5. Buffer maintained 30–60s ahead for smooth playback
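
The ABR step (step 4) can be sketched with a simple throughput-based rule: pick the highest bitrate that fits within a safety fraction of measured bandwidth. The bitrate ladder and safety factor below are assumptions for illustration; real players also weigh buffer level and switching cost:

```python
# Sketch of throughput-based ABR quality selection.
# Bitrates (kbps) per rendition are illustrative, not official figures.
LADDER_KBPS = {"240p": 400, "360p": 750, "480p": 1000,
               "720p": 2500, "1080p": 5000, "2160p": 16000}

def choose_quality(measured_kbps: float, safety: float = 0.8) -> str:
    """Pick the highest rendition whose bitrate fits the bandwidth budget."""
    budget = measured_kbps * safety
    viable = [(rate, name) for name, rate in LADDER_KBPS.items() if rate <= budget]
    return max(viable)[1] if viable else "240p"

print(choose_quality(4000))  # 3200 kbps budget -> "720p"
```

The safety margin absorbs bandwidth fluctuation between segment fetches, trading a step of quality for fewer rebuffering events.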

Scale Characteristics

  • 500+ hours of video uploaded per minute
  • Billions of daily views
  • Videos stored in multiple redundant locations globally
  • Thumbnails served via CDN with aggressive caching (TTL hours to days)

Key Design Decisions

  • Chunk-based parallel transcoding — Reduces time-to-availability from hours to minutes
  • Multiple codec support — Device compatibility (old Android vs Apple Silicon)
  • CDN-first delivery — 99% of traffic served from edge, not origin
  • Separate read/write paths — Upload throughput doesn't degrade streaming