logo

Overview

Source: The System Design Newsletter — Neo Kim
Uber's ETA system is one of the most complex real-time computation systems in the industry. It must instantly estimate travel time for billions of trips across hundreds of cities — accounting for live traffic, route complexity, and prediction accuracy that directly impacts rider trust.

Key Concepts

Routing Graph — The road network is modeled as a weighted directed graph. Nodes = road intersections, edges = road segments with travel time weights.
Dijkstra's / A* Algorithm — Classic shortest-path algorithms used to find the optimal route through the road graph.
Contraction Hierarchies (CH) — A preprocessing technique that creates "shortcuts" in the road graph, enabling shortest-path queries in milliseconds even for city-scale graphs.
H3 Geospatial Indexing — Uber's hexagonal grid system (open-sourced). Divides the world into hexagonal cells at multiple resolutions. Used to aggregate traffic patterns and locate drivers.

Core Components

  • Map Service — Maintains the road network graph. Continuously updated from GPS traces and map providers (HERE, OpenStreetMap).
  • Traffic Service — Aggregates real-time speed data from millions of active Uber driver GPS pings. Updates edge weights in the routing graph.
  • Routing Engine — Computes shortest paths using CH + real-time traffic weights. Returns route segments with per-segment ETA.
  • Machine Learning ETA Model — Adjusts the routing engine's raw ETA using historical patterns: time of day, day of week, weather, local events, road closures.
  • Driver Location Service — Tracks all active driver GPS positions in real-time. Powers driver-to-rider matching and ETA to pickup.
  • Supply/Demand Service — Informs ETA predictions with surge pricing and rider wait time estimates.

ETA Computation Flow

  1. Rider requests a trip from location A to B
  1. Routing Engine runs shortest-path query on road graph (with live traffic weights)
  1. Route broken into segments; raw ETA computed (distance / speed per segment)
  1. ML model applies corrections: historical accuracy adjustments, event detection
  1. Final ETA returned to rider (e.g., "8 minutes away")
  1. ETA continuously updated as driver moves (every 5–8 seconds)

Real-Time Traffic Integration

  • Millions of Uber drivers passively report GPS location every few seconds
  • Speed per road segment = distance / time between consecutive GPS pings
  • Speed data aggregated into H3 hexagonal cells
  • Traffic map updated every minute; routing edge weights refreshed continuously

ML ETA Model Features

  • Historical average speed per segment per time-of-day
  • Day of week patterns
  • Weather data
  • Special events (concerts, sports games) detected via anomaly detection
  • Driver behavior (acceleration/deceleration patterns)

Scale Characteristics

  • 19M trips per day ≈ 220 ETA requests/sec (plus continuous updates)
  • Road graphs for 10,000+ cities globally
  • GPS data ingested from millions of drivers simultaneously
  • ETA accuracy target: within 2 minutes of actual arrival time

Key Trade-offs

Decision
Reasoning
Contraction Hierarchies
Query milliseconds on city-scale graphs; worth preprocessing cost
H3 hexagonal indexing
Uniform cell size; better for spatial aggregation than lat/lng grids
ML correction layer
Raw routing ETA misses real-world variability (red lights, construction)
Per-segment granularity
Enables partial re-routing as driver deviates from planned path