Overview

Source: The System Design Newsletter — Neo Kim
Processing payments at scale is one of the hardest distributed systems problems. A single misrouted charge, duplicate transaction, or security breach can cause massive financial and reputational damage. This case study explores how a platform like Uber handles payments securely at massive scale.

Key Concepts

Idempotency — A payment request can be safely retried without risk of duplicate charges. Each request carries a unique idempotency key (UUID). The server tracks processed keys and returns the cached response for duplicates.
Tokenization — Sensitive card data (PAN) is replaced with a non-sensitive token. Raw card numbers never touch application servers — they go directly to the payment processor.
Secure Enclave / HSM — Sensitive cryptographic operations occur in hardware-isolated environments, protecting keys even from the operating system.
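
The idempotency behavior described above can be sketched as follows. This is a minimal in-memory illustration, not how any particular platform implements it: the class name `IdempotentChargeHandler`, the response fields, and the dictionary store are all assumptions made for the example.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class IdempotentChargeHandler:
    """Hypothetical handler; a real service would use a durable, shared store."""

    _responses: dict = field(default_factory=dict)
    charges_executed: int = 0

    def charge(self, idempotency_key: str, amount_cents: int) -> dict:
        # A key we have already processed: return the cached response unchanged.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        # First time this key is seen: perform the charge and remember the result.
        self.charges_executed += 1
        response = {
            "charge_id": f"ch_{self.charges_executed}",
            "amount_cents": amount_cents,
            "status": "succeeded",
        }
        self._responses[idempotency_key] = response
        return response


handler = IdempotentChargeHandler()
key = str(uuid.uuid4())            # one key per logical payment request
first = handler.charge(key, 1250)
retry = handler.charge(key, 1250)  # same key, so no second charge is made
```

In a concurrent service the lookup and the write would need to be a single atomic step (for example a unique constraint on the key), which this single-threaded sketch leaves out.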

Core Components

  • Mobile Client — Captures the payment method. Card data is sent from the device directly to the PSP vault for tokenization, so only the token reaches the platform's own servers.
  • Payment Service — Orchestrates the payment flow. Generates idempotency keys and coordinates with downstream services.
  • Payment Provider (PSP) — Third-party processor (e.g., Stripe, Braintree) that handles card network communication.
  • Card Network — Visa, Mastercard, etc. Routes authorization requests to the issuing bank.
  • Issuing Bank — Approves or declines the charge and holds the user's funds.
  • Ledger / Wallet Service — Records double-entry accounting entries for every transaction.
  • Reconciliation Service — Periodically compares internal records against external bank/PSP statements.
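
To make the double-entry idea behind the Ledger / Wallet Service concrete, here is a small sketch. The `Entry` and `Ledger` classes and the account names (`psp_clearing`, `driver_payable`, `platform_revenue`) are hypothetical and chosen only for illustration; the one rule it demonstrates is that every posted transaction must have equal total debits and credits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    account: str
    debit_cents: int = 0
    credit_cents: int = 0


class Ledger:
    def __init__(self) -> None:
        self.entries: list[Entry] = []

    def post(self, transaction: list[Entry]) -> None:
        # Double-entry rule: total debits must equal total credits.
        debits = sum(e.debit_cents for e in transaction)
        credits = sum(e.credit_cents for e in transaction)
        if debits != credits:
            raise ValueError(f"unbalanced transaction: {debits} != {credits}")
        self.entries.extend(transaction)

    def balance(self, account: str) -> int:
        # Balance expressed as debits minus credits for the given account.
        return sum(
            e.debit_cents - e.credit_cents
            for e in self.entries
            if e.account == account
        )


ledger = Ledger()
# A 20.00 fare: the full amount is owed by the PSP, split between driver and platform.
ledger.post([
    Entry("psp_clearing", debit_cents=2000),
    Entry("driver_payable", credit_cents=1600),
    Entry("platform_revenue", credit_cents=400),
])
```

Because every transaction balances, the balances across all accounts always sum to zero, which gives the Reconciliation Service a simple invariant to check.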

Payment Flow (Happy Path)

  1. User initiates payment in the app
  2. App tokenizes card data with the PSP vault
  3. Payment Service creates a payment record with idempotency key
  4. PSP authorizes charge through the card network
  5. Issuing bank approves/declines
  6. Result propagates back; ledger entries written
  7. Confirmation sent to user
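
The happy path above can be walked through in code. This is a toy sketch under stated assumptions: `FakePSP` stands in for a real provider and always approves known tokens, `run_payment` is a hypothetical orchestrator, and the ledger is reduced to a list of tuples. None of these names come from a real SDK.

```python
import uuid


class FakePSP:
    """Stand-in for a real PSP; names and behavior are hypothetical."""

    def __init__(self) -> None:
        self._vault: dict[str, str] = {}

    def tokenize(self, pan: str) -> str:
        # Only the PSP vault ever stores the raw card number (PAN).
        token = f"tok_{uuid.uuid4().hex}"
        self._vault[token] = pan
        return token

    def authorize(self, token: str, amount_cents: int) -> str:
        # A real PSP would route this through the card network to the issuing bank.
        return "approved" if token in self._vault else "declined"


def run_payment(psp: FakePSP, token: str, amount_cents: int,
                payments: dict, ledger: list) -> dict:
    idempotency_key = str(uuid.uuid4())                      # step 3
    payments[idempotency_key] = {
        "token": token, "amount_cents": amount_cents, "status": "pending",
    }
    decision = psp.authorize(token, amount_cents)            # steps 4 and 5
    payments[idempotency_key]["status"] = decision
    if decision == "approved":                               # step 6
        ledger.append((idempotency_key, "debit", amount_cents))
        ledger.append((idempotency_key, "credit", amount_cents))
    return payments[idempotency_key]                         # step 7


psp = FakePSP()
token = psp.tokenize("4242424242424242")  # step 2, using a well-known test card number
payments, ledger = {}, []
result = run_payment(psp, token, 1800, payments, ledger)
```

Note that the payment record and ledger hold only the token, never the card number itself.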

Failure Handling

Scenario | Solution
Network timeout before server processes | Retry with same idempotency key
Server processes but response lost | Idempotency key returns cached result
Duplicate user tap | Idempotency prevents double charge
Reconciliation mismatch | Async job flags for manual review
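
The "server processes but response lost" case is worth seeing end to end. In this sketch, `FlakyServer` is a hypothetical server that completes the first charge but drops its response, and `charge_with_retry` is an assumed client helper that reuses one idempotency key across attempts with exponential backoff.

```python
import time
import uuid


class FlakyServer:
    """Hypothetical server that loses its first response after processing it."""

    def __init__(self) -> None:
        self.results: dict[str, str] = {}
        self.charges = 0
        self._drop_next_response = True

    def charge(self, key: str, amount_cents: int) -> str:
        if key not in self.results:
            # The charge is applied only the first time a key is seen.
            self.charges += 1
            self.results[key] = f"charged {amount_cents}"
        if self._drop_next_response:
            self._drop_next_response = False
            raise TimeoutError("response lost in transit")
        return self.results[key]


def charge_with_retry(server: FlakyServer, amount_cents: int,
                      max_attempts: int = 3, base_delay: float = 0.01) -> str:
    key = str(uuid.uuid4())  # generated once and reused on every retry
    for attempt in range(max_attempts):
        try:
            return server.charge(key, amount_cents)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)  # exponential backoff
    raise RuntimeError("unreachable")


server = FlakyServer()
outcome = charge_with_retry(server, 900)
```

The client cannot tell whether the first attempt reached the server, and it does not need to: retrying with the same key is safe in both cases.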

Scale & Reliability Patterns

  • Effectively exactly-once processing — requests may be delivered more than once, but idempotency keys stored in Redis ensure each charge is applied only once
  • Async event sourcing — every state change is an immutable event
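  • Outbox pattern — the DB write and an outbox record for the message are committed in one transaction; a separate relay then publishes the message, so the two cannot diverge
  • Dead letter queues — failed payment events are retried with backoff, and events that still fail are parked for inspection and replay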
  • Rate limiting — per-user and per-merchant limits prevent abuse
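
As an illustration of the outbox pattern listed above, here is a minimal sketch using Python's built-in `sqlite3`. The table names, the `record_payment` and `relay` functions, and the `publish` callback are assumptions for the example; a production system would use its own database and message broker.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payments (id TEXT PRIMARY KEY, status TEXT)")
conn.execute(
    "CREATE TABLE outbox ("
    "id INTEGER PRIMARY KEY AUTOINCREMENT, payload TEXT, published INTEGER DEFAULT 0)"
)


def record_payment(payment_id: str, status: str) -> None:
    # `with conn` commits both inserts together, or rolls both back on error.
    with conn:
        conn.execute("INSERT INTO payments VALUES (?, ?)", (payment_id, status))
        conn.execute(
            "INSERT INTO outbox (payload) VALUES (?)",
            (json.dumps({"payment_id": payment_id, "status": status}),),
        )


def relay(publish) -> int:
    """Publish pending outbox rows in order and mark each one as published."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        # A crash after publish but before the update causes a re-send later,
        # so consumers must tolerate duplicates.
        publish(json.loads(payload))
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


published = []
record_payment("pay_1", "approved")
sent = relay(published.append)
```

The relay gives at-least-once delivery, which is why this pattern is usually paired with idempotent consumers.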