09 · How Spotify Works

Overview

Source: The System Design Newsletter — Neo Kim

Spotify serves 600M+ monthly active users from a catalog of 100M+ tracks. Its architecture handles audio streaming, personalized recommendations, offline sync, and social features at global scale while keeping buffering close to zero.

Key Concepts

Audio Transcoding — Tracks are encoded at multiple bitrates (24 kbps, 96 kbps, 160 kbps, 320 kbps) and formats (Ogg Vorbis, AAC, MP3) to serve different devices and network conditions.

Adaptive Streaming — Client requests audio chunks and adjusts quality based on available bandwidth (similar to video ABR).

Collaborative Filtering — Recommendation algorithm that surfaces music based on listening patterns of users with similar taste. Powers Discover Weekly.

Offline Sync — Premium users can download tracks for offline playback. Downloaded tracks are DRM-encrypted.
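The adaptive streaming idea above can be sketched as a small bitrate picker. This is a minimal illustration, not Spotify's actual client logic: the ladder reuses the bitrates listed under Audio Transcoding, while `pick_bitrate`, `smooth_throughput`, the `safety_factor` headroom, and the smoothing weight `alpha` are assumptions introduced for the example.

```python
# Bitrate ladder in kbps, taken from the Audio Transcoding entry above.
BITRATE_LADDER_KBPS = [24, 96, 160, 320]


def pick_bitrate(measured_kbps: float, safety_factor: float = 0.8) -> int:
    """Return the highest bitrate that fits within a fraction of the measured bandwidth.

    safety_factor is an assumed headroom so that small throughput dips
    do not immediately stall playback.
    """
    budget = measured_kbps * safety_factor
    candidates = [b for b in BITRATE_LADDER_KBPS if b <= budget]
    # Fall back to the lowest variant when even that exceeds the budget.
    return max(candidates) if candidates else BITRATE_LADDER_KBPS[0]


def smooth_throughput(previous_kbps: float, sample_kbps: float, alpha: float = 0.3) -> float:
    """Exponentially weighted moving average of per-chunk throughput samples."""
    return alpha * sample_kbps + (1 - alpha) * previous_kbps
```

A client would typically update the smoothed estimate after each chunk download and call `pick_bitrate` before requesting the next one, so that quality changes happen at chunk boundaries.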

Core Components

Audio Storage — Tracks stored in GCS/object storage. Multiple encoded variants per track.

CDN — Audio chunks cached at edge nodes globally. Most playback served from CDN, not origin.

Streaming Service — Handles audio chunk delivery. Generates time-limited signed URLs for CDN chunks.

Catalog Service — Stores track metadata: title, artist, album, duration, genre, lyrics.

Recommendation Engine — Combines collaborative filtering, ML models over audio features, and NLP over text (blogs, playlists, lyrics). Runs batch ML jobs (on Hadoop/Spark) plus real-time inference.

Search Service — Elasticsearch-backed search over tracks, artists, albums, playlists.

Playlist Service — Manages user-created and algorithmic playlists (Discover Weekly, Daily Mixes).

Social Service — Friend activity feed, shared playlists, collaborative playlists.

Podcast Service — Separate pipeline for podcast ingestion, hosting, and analytics.
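The time-limited signed URLs issued by the Streaming Service can be illustrated with an HMAC over the chunk path and an expiry timestamp. This is a sketch under stated assumptions: the query parameter names (`expires`, `sig`), the secret, the example hostname, and the function names are hypothetical, and real CDNs each define their own signing scheme.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical shared secret between service and CDN


def _signature(path: str, expires: int) -> str:
    """HMAC-SHA256 over the chunk path and its expiry time."""
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def sign_chunk_url(base_url: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Append an expiry timestamp and signature to a chunk URL."""
    issued_at = time.time() if now is None else now
    expires = int(issued_at + ttl_seconds)
    sig = _signature(urlparse(base_url).path, expires)
    return f"{base_url}?{urlencode({'expires': expires, 'sig': sig})}"


def verify_chunk_url(url: str, now: Optional[float] = None) -> bool:
    """Edge-side check: reject expired, malformed, or tampered URLs."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False
    # Constant-time comparison avoids leaking signature bytes via timing.
    return hmac.compare_digest(sig, _signature(parsed.path, expires))
```

Because the edge node can verify the signature on its own, the origin only has to issue the URL; the audio bytes never pass through it.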

Audio Streaming Flow

User presses play

Client requests track metadata from Catalog Service

Streaming Service generates signed CDN URL for audio chunks

Client fetches first chunk from nearest CDN edge node

Client pre-buffers next N chunks in background

Playback continues seamlessly while buffer refills
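The pre-buffering steps above can be sketched as a small look-ahead buffer. This is a simplified, synchronous model: a real client would refill in a background task, and `PrefetchBuffer`, `fetch_chunk`, and `lookahead` (the "N" in the flow) are names assumed for illustration.

```python
from collections import deque
from typing import Callable, Deque, Optional


class PrefetchBuffer:
    """Keeps up to `lookahead` chunks buffered ahead of the playback position."""

    def __init__(self, fetch_chunk: Callable[[int], bytes], total_chunks: int, lookahead: int = 3):
        self.fetch_chunk = fetch_chunk    # e.g. an HTTP GET against the signed CDN URL
        self.total_chunks = total_chunks
        self.lookahead = lookahead
        self.next_to_fetch = 0
        self.buffer: Deque[bytes] = deque()

    def refill(self) -> None:
        """Fetch chunks until `lookahead` are buffered or the track ends."""
        while len(self.buffer) < self.lookahead and self.next_to_fetch < self.total_chunks:
            self.buffer.append(self.fetch_chunk(self.next_to_fetch))
            self.next_to_fetch += 1

    def play_next(self) -> Optional[bytes]:
        """Return the next chunk to play, then top the buffer back up."""
        if not self.buffer:
            self.refill()  # first play, or the buffer ran dry
        if not self.buffer:
            return None    # end of track
        chunk = self.buffer.popleft()
        self.refill()
        return chunk
```

Playback only stalls when `play_next` finds the buffer empty mid-track, which is why a larger `lookahead` trades extra bandwidth for fewer interruptions on unstable networks.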

Recommendation System (Discover Weekly)

Weekly batch job analyzing 600M+ user listening histories

Collaborative filtering: "users who like X also like Y"

Audio feature analysis (tempo, key, danceability) using ML models

Natural Language Processing on blogs, playlists, and lyrics

Results are materialized as a personalized playlist every Monday morning
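The "users who like X also like Y" step can be illustrated with a toy item-based collaborative filter. This is a minimal sketch on in-memory sets, assuming binary "listened or not" signals and cosine similarity; the source does not describe the production algorithm, which at this scale would more plausibly rely on distributed jobs and learned embeddings. The function names and sample data are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt
from typing import Dict, List, Set


def track_similarities(histories: Dict[str, Set[str]]) -> Dict[str, Dict[str, float]]:
    """Cosine similarity between tracks, treating each track as a binary user vector."""
    listeners = defaultdict(int)    # track -> number of users who played it
    co_listens = defaultdict(int)   # (track_a, track_b) -> users who played both
    for tracks in histories.values():
        for track in tracks:
            listeners[track] += 1
        for a, b in combinations(sorted(tracks), 2):
            co_listens[(a, b)] += 1
    sims: Dict[str, Dict[str, float]] = defaultdict(dict)
    for (a, b), both in co_listens.items():
        score = both / sqrt(listeners[a] * listeners[b])
        sims[a][b] = score
        sims[b][a] = score
    return sims


def recommend(user: str, histories: Dict[str, Set[str]], top_n: int = 3) -> List[str]:
    """Rank unheard tracks by summed similarity to the tracks the user has played."""
    sims = track_similarities(histories)
    heard = histories[user]
    scores = defaultdict(float)
    for track in heard:
        for other, score in sims.get(track, {}).items():
            if other not in heard:
                scores[other] += score
    # Highest score first; ties broken alphabetically for deterministic output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:top_n]]


histories = {
    "ana": {"X", "Y", "Z"},
    "ben": {"X", "Y"},
    "cho": {"X", "W"},
    "dee": {"X"},
}
top_pick = recommend("dee", histories, top_n=1)  # "Y" co-occurs with "X" most often
```

Here the user who has only played X is pointed at Y, because two of the other three X listeners also played Y.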

Scale Characteristics

600M+ monthly active users, 100M+ tracks

Audio chunks pre-cached at CDN edge — origin rarely hit for popular tracks

Recommendation batch jobs run on Hadoop/Spark clusters

Event streaming via Kafka for listening event analytics
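The claim that origin is rarely hit for popular tracks can be illustrated with a small LRU cache standing in for one CDN edge node. This is a sketch under assumptions: it fills the cache on demand (pull-through) rather than pre-caching as described above, and `EdgeCache`, `fetch_from_origin`, and the chunk IDs are hypothetical names.

```python
from collections import OrderedDict
from typing import Callable


class EdgeCache:
    """Least-recently-used cache that falls back to origin on a miss."""

    def __init__(self, capacity: int, fetch_from_origin: Callable[[str], bytes]):
        self.capacity = capacity
        self.fetch_from_origin = fetch_from_origin
        self.entries: "OrderedDict[str, bytes]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, chunk_id: str) -> bytes:
        if chunk_id in self.entries:
            # Mark as most recently used so popular chunks stay resident.
            self.entries.move_to_end(chunk_id)
            self.hits += 1
            return self.entries[chunk_id]
        self.misses += 1
        data = self.fetch_from_origin(chunk_id)
        self.entries[chunk_id] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used chunk.
            self.entries.popitem(last=False)
        return data
```

Under a skewed request mix, a frequently requested chunk keeps getting moved to the recent end and survives eviction, so origin traffic is dominated by the long tail of rarely played tracks.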

Key Trade-offs

Decision	Reasoning
Multiple bitrate variants	Adaptive quality across all network conditions
Signed CDN URLs	Security without routing audio through origin servers
Batch recommendations	Weekly freshness is acceptable for Discover Weekly; recomputing every user's playlist in real time would be too expensive
DRM for offline	Protects label agreements while enabling premium feature