High Level Design Series · Real-World Designs · Part 57 of 70

Design: Uber / Ride-Sharing

Uber connects millions of riders with drivers in real time. Behind a deceptively simple "request a ride" button lies one of the most complex distributed systems ever built: a service that must ingest millions of GPS pings per second, match riders with optimal drivers in under two seconds, compute accurate ETAs over live road graphs, adjust pricing dynamically across thousands of city zones, and orchestrate payments — all while handling 20 million rides per day across 70+ countries.

In this post we'll design the complete ride-sharing system from the ground up: the location ingestion pipeline, the matching algorithm, the trip lifecycle state machine, the H3 hexagonal grid for surge pricing, ETA computation with ML, and the payment authorization flow. We'll look at how Uber's real DISCO dispatch system works and the engineering trade-offs behind every decision.

Requirements

Functional Requirements

- Riders request a ride with a pickup and destination and see a fare estimate (including surge) before confirming
- Drivers go online, stream their location continuously, and accept or decline ride offers
- The system matches each request to the best available driver and tracks the trip in real time
- Fares are computed from base rate, distance, time, and surge; payment is collected automatically at trip end
- Riders and drivers rate each other after every trip

Non-Functional Requirements

- Matching completes in under 2 seconds end to end
- Location ingestion sustains ~1.3M writes/sec at peak
- Real-time updates reach riders in under a second
- High availability: an in-progress trip must never be lost
- Regional isolation: an outage in one city must not affect others

Scale Estimation

| Metric | Value | Derived Rate |
|---|---|---|
| Rides per day | 20 million | ~230 rides/sec |
| Active drivers online | ~5 million peak | |
| Location updates per driver | Every 3-4 seconds | |
| Location updates (total) | ~5M drivers × 1 update / 3-4 s | ~1.3M writes/sec peak |
| Average trip duration | 15 minutes | ~210K concurrent trips |
| WebSocket connections | ~10 million concurrent | riders + drivers |

High-Level Architecture

The system decomposes into these core services:

| Service | Responsibility | Key Technology |
|---|---|---|
| API Gateway | Authentication, rate limiting, routing | NGINX / Envoy |
| Location Service | Ingest & store real-time driver positions | Redis + Geospatial Index |
| Matching Service (DISCO) | Find optimal driver for each ride request | Geohash + Custom Algorithm |
| Trip Service | Manage ride lifecycle state machine | PostgreSQL + Kafka |
| ETA Service | Compute ETAs using road graph + ML | Graph DB + TensorFlow |
| Pricing Service | Compute fare + surge multiplier | H3 Grid + Redis |
| Payment Service | Authorize, capture, and settle payments | Stripe / Internal Ledger |
| Notification Service | Push notifications & WebSocket updates | WebSocket + APNS/FCM |
| User Service | Rider/driver profiles, ratings, preferences | PostgreSQL |
┌─────────────┐     ┌─────────────┐
│  Rider App  │     │ Driver App  │
└──────┬──────┘     └──────┬──────┘
       │ HTTPS/WSS         │ HTTPS/WSS
       ▼                   ▼
┌─────────────────────────────────┐
│          API Gateway            │
│   (Auth, Rate Limit, Route)     │
└──┬────┬────┬────┬────┬────┬─────┘
   │    │    │    │    │    │
   ▼    ▼    ▼    ▼    ▼    ▼
┌────┐┌────┐┌────┐┌────┐┌────┐┌────┐
│Loc.││Trip││DISC││ETA ││Pric││Pay │
│Svc ││Svc ││ O  ││Svc ││Svc ││Svc │
└─┬──┘└─┬──┘└─┬──┘└─┬──┘└─┬──┘└─┬──┘
  │     │     │     │     │     │
  ▼     ▼     ▼     ▼     ▼     ▼
┌─────────────────────────────────┐
│     Redis Cluster (Locations)   │
│     PostgreSQL  (Trips, Users)  │
│     Kafka       (Events)        │
│     S3          (Trip Logs)     │
└─────────────────────────────────┘

Rider & Driver Flows

Rider Flow

The rider's journey through the system:

Open App → Enter Pickup & Destination → See Fare Estimate + Surge → Request Ride → Wait for Match → Track Driver Approach → Driver Arrives → In-Trip Tracking → Arrive at Destination → Pay & Rate

POST /v1/rides/request

{
  "rider_id": "usr_abc123",
  "pickup": { "lat": 37.7749, "lng": -122.4194 },
  "dropoff": { "lat": 37.3382, "lng": -121.8863 },
  "ride_type": "UberX",
  "payment_method_id": "pm_visa_4242"
}

Returns a ride_id and initiates the matching process. The rider receives real-time updates via WebSocket.

Driver Flow

Go Online → Send Location Every 3-4s → Receive Ride Request → Accept / Decline (15s timeout) → Navigate to Pickup → Confirm Rider Pickup → Navigate to Destination → Complete Trip → Get Paid

POST /v1/drivers/location

{
  "driver_id": "drv_xyz789",
  "lat": 37.7751,
  "lng": -122.4180,
  "heading": 45,
  "speed_kmh": 32,
  "timestamp": 1714500000000,
  "status": "available"
}

Sent every 3-4 seconds by the driver app. This is the highest-throughput endpoint in the entire system (~1.3M writes/sec peak).

Location Service

The location service is the beating heart of Uber. It must ingest, store, and serve millions of GPS coordinates per second with sub-second latency.

Why Redis with Geospatial Index?

Redis provides the GEOADD / GEOSEARCH commands, backed by a sorted set in which each member's score is a 52-bit geohash. This gives us O(log N) writes, fast radius queries, and everything in memory:

# Driver sends location update → Location Service writes to Redis:

GEOADD drivers:available -122.4180 37.7751 "drv_xyz789"

# When matching needs nearby drivers:
GEOSEARCH drivers:available
  FROMLONLAT -122.4194 37.7749
  BYRADIUS 5 km
  ASC
  COUNT 20
  WITHCOORD WITHDIST

# Returns:
# 1) "drv_xyz789" — 0.15 km — (37.7751, -122.4180)
# 2) "drv_abc456" — 0.82 km — (37.7780, -122.4210)
# 3) "drv_def012" — 1.45 km — (37.7690, -122.4100)
# ...
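
In application code this is two calls. A minimal sketch using the redis-py client (connection details are placeholders; the key name and 5 km radius mirror the commands above):

import redis

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Driver update: GEOADD takes (longitude, latitude, member)
r.geoadd("drivers:available", (-122.4180, 37.7751, "drv_xyz789"))

# Matching query: nearest available drivers within 5 km of the pickup
nearby = r.geosearch(
    "drivers:available",
    longitude=-122.4194, latitude=37.7749,
    radius=5, unit="km",
    sort="ASC", count=20,
    withcoord=True, withdist=True,
)
for member, dist_km, (lng, lat) in nearby:
    print(f"{member}: {dist_km:.2f} km away at ({lat}, {lng})")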

How Geohash Works

A geohash encodes a (latitude, longitude) pair into a single integer by interleaving the bits of the binary representations of latitude and longitude. Points that are geographically close share a common prefix in their geohash.

Geohash encoding for (37.7749, -122.4194):

Step 1: Encode latitude 37.7749 into binary
  Range [-90, 90] → midpoint 0 → 37.7749 > 0 → bit 1
  Range [0, 90]   → midpoint 45 → 37.7749 < 45 → bit 0
  Range [0, 45]   → midpoint 22.5 → 37.7749 > 22.5 → bit 1
  ... continue for desired precision

Step 2: Encode longitude -122.4194 into binary
  Range [-180, 180] → midpoint 0 → -122.4194 < 0 → bit 0
  Range [-180, 0]   → midpoint -90 → -122.4194 < -90 → bit 0
  Range [-180, -90] → midpoint -135 → -122.4194 > -135 → bit 1
  ... continue

Step 3: Interleave bits (lon, lat, lon, lat, ...)
  = 0, 1, 0, 0, 1, 1, ... → "9q8yyk" (base32)
Geohash precision: A 6-character geohash covers ~1.2 km × 0.6 km. A 7-character geohash covers ~153 m × 153 m. Uber uses geohash prefixes of length 6-7 for driver lookups, which balances query efficiency with spatial precision.
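
The bit-interleaving above is straightforward to implement directly. A minimal Python sketch of the encoder (illustrative only, not a production implementation):

BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)

def geohash_encode(lat: float, lng: float, length: int = 6) -> str:
    """Encode (lat, lng) by interleaving longitude and latitude bits."""
    lat_range, lng_range = [-90.0, 90.0], [-180.0, 180.0]
    bits, even = [], True            # even positions carry longitude bits
    while len(bits) < length * 5:    # 5 bits per base32 character
        rng, val = (lng_range, lng) if even else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if val >= mid:
            bits.append(1)
            rng[0] = mid             # keep the upper half of the range
        else:
            bits.append(0)
            rng[1] = mid             # keep the lower half of the range
        even = not even
    chars = []
    for i in range(0, len(bits), 5): # pack each 5-bit group into base32
        idx = 0
        for b in bits[i:i + 5]:
            idx = (idx << 1) | b
        chars.append(BASE32[idx])
    return "".join(chars)

print(geohash_encode(37.7749, -122.4194))  # → '9q8yyk'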

Location Ingestion Pipeline

With 1.3 million location writes per second at peak, we need a robust ingestion pipeline:

Driver App
    │
    ▼ (every 3-4 seconds)
┌───────────────┐
│  API Gateway  │  ← rate limit per driver: 1 update/sec
└───────┬───────┘
        ▼
┌───────────────┐
│ Kafka Topic   │  ← "driver-locations" — partitioned by driver_id
│ (Buffer)      │     ~100 partitions, 3 replicas
└───────┬───────┘
        ▼
┌───────────────┐
│ Location      │  ← Kafka consumers (100+ instances)
│ Consumers     │     Batch writes to Redis
└───────┬───────┘
        ▼
┌───────────────┐
│ Redis Cluster │  ← 50+ shards, each handling ~26K writes/sec
│ (Geospatial)  │     Shard key: geohash prefix (colocate nearby drivers)
└───────────────┘

Multi-Region Sharding

Uber shards the location store by city or region. A ride request in San Francisco only queries the SF Redis shard — it never needs to scan drivers in New York. This partitioning is natural for a ride-sharing system:

Redis sharding strategy:
  Shard key = city_id (e.g., "sf", "nyc", "london")

  drivers:sf:available      → Redis shard 1-5  (SF drivers)
  drivers:nyc:available     → Redis shard 6-10 (NYC drivers)
  drivers:london:available  → Redis shard 11-15 (London drivers)

Benefits:
  ✓ Queries never cross city boundaries
  ✓ Hot cities get more shards
  ✓ Independent scaling per region
  ✓ Failure isolation — NYC outage doesn't affect SF

Matching Algorithm & DISCO

The matching service — internally called DISCO (Dispatch Optimization) at Uber — is the brain of the system. It takes a ride request and finds the best available driver. This is not simply "find the nearest driver." DISCO optimizes for the global system: minimizing total wait time across all riders, not just the current one.

Basic Matching: Radius Search + Ranking

The foundational algorithm works in stages:

function matchDriver(rideRequest):
    pickup = rideRequest.pickup       // (lat, lng)
    rideType = rideRequest.rideType   // "UberX", "Pool", etc.

    // Stage 1: Find candidate drivers within radius
    radius = 5.0  // km, start with 5km
    candidates = redis.GEOSEARCH(
        key = "drivers:{city}:available",
        center = pickup,
        radius = radius,
        count = 20,
        sort = "ASC"  // nearest first
    )

    // Stage 2: Filter by eligibility
    candidates = candidates.filter(driver =>
        driver.rideType.includes(rideType) &&
        driver.rating >= MIN_RATING &&
        driver.vehicleCapacity >= rideRequest.passengers &&
        !driver.hasActiveOffer  // not already offered another ride
    )

    // Stage 3: If too few candidates, expand radius
    if candidates.length < 3:
        radius = 10.0
        candidates = redis.GEOSEARCH(..., radius=10.0, count=50)
        candidates = applyFilters(candidates)

    // Stage 4: Rank by composite score
    for each driver in candidates:
        driver.eta = etaService.compute(driver.location, pickup)
        driver.score = rankDriver(driver, rideRequest)

    // Stage 5: Sort by score, offer to best
    candidates.sortBy(d => d.score, DESC)
    return offerToDriver(candidates[0], rideRequest)

Ranking Function

The ranking score considers multiple factors:

function rankDriver(driver, request):
    // Lower ETA is better (normalize to 0-1)
    etaScore = 1.0 - (driver.eta / MAX_ETA)

    // Closer distance is better
    distScore = 1.0 - (driver.distance / MAX_DISTANCE)

    // Driver rating (4.5-5.0 → 0.9-1.0)
    ratingScore = driver.rating / 5.0

    // Driver acceptance rate (higher is better)
    acceptScore = driver.acceptanceRate

    // Trip completion heading — is the driver heading toward pickup?
    headingScore = cos(angleBetween(driver.heading, bearingTo(driver.loc, request.pickup)))
    headingScore = max(0, headingScore)  // only reward driving toward

    return (
        0.40 * etaScore +
        0.25 * distScore +
        0.15 * ratingScore +
        0.10 * acceptScore +
        0.10 * headingScore
    )
Why ETA > Distance? A driver 2 km away on a highway may arrive in 3 minutes, while a driver 1 km away in gridlock may take 10 minutes. ETA-based ranking produces dramatically better outcomes than pure distance ranking.

The Offer Cascade

When a driver is selected, they receive a push notification with 15 seconds to accept. If they don't respond or decline:

function offerToDriver(driver, ride):
    // Mark driver as "offered" (prevent double-dispatch)
    redis.SET("offer:{driver.id}", ride.id, EX=20)

    // Send push notification
    notify(driver, {
        type: "RIDE_OFFER",
        ride_id: ride.id,
        pickup: ride.pickup,
        dropoff: ride.dropoff,
        fare_estimate: ride.fare,
        timeout: 15  // seconds
    })

    // Wait for response
    response = await withTimeout(15_seconds):
        driver.respondToOffer(ride.id)

    if response == ACCEPTED:
        tripService.createTrip(ride, driver)
        notifyRider(ride.rider, "Driver matched!")
        return SUCCESS

    if response == DECLINED or response == TIMEOUT:
        redis.DEL("offer:{driver.id}")
        // Move to next candidate
        nextDriver = getNextCandidate(ride)
        if nextDriver:
            return offerToDriver(nextDriver, ride)
        else:
            notifyRider(ride.rider, "No drivers available")
            return NO_MATCH

DISCO: Global Optimization

Uber's real DISCO system goes beyond greedy nearest-driver matching. It uses batched matching — collecting ride requests over a short window (2-3 seconds) and solving the assignment problem globally:

DISCO Batched Matching:

Every 2 seconds:
  1. Collect all pending ride requests → R = {r1, r2, ..., rN}
  2. Collect all available drivers     → D = {d1, d2, ..., dM}
  3. Build cost matrix C[i][j] = ETA(driver_j → rider_i)
  4. Solve assignment: minimize total ΣC[i][j]
     subject to: each driver assigned to at most 1 rider
                 each rider assigned to exactly 1 driver (if possible)
  5. This is the Hungarian Algorithm (or auction algorithm for speed)

Why batching beats greedy:
  Rider A at (0,0), Rider B at (2,0)
  Driver X at (1.1,0), Driver Y at (-1.5,0)

  Greedy (process A first, nearest driver wins):
    A → X (dist=1.1), leaving B → Y (dist=3.5) → total = 4.6

  Batched optimal:
    A → Y (dist=1.5), B → X (dist=0.9) → total = 2.4 ✓

  Batching minimizes total wait across the whole batch, regardless of
  which request happened to arrive first.
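
At moderate batch sizes the assignment step can be solved exactly with an off-the-shelf Hungarian-algorithm implementation. A sketch using SciPy (the eta cost function here is a stand-in for the real ETA service):

import numpy as np
from scipy.optimize import linear_sum_assignment

def batch_match(riders, drivers, eta):
    """Assign drivers to riders, minimizing total pickup ETA."""
    # Cost matrix: cost[i][j] = ETA of driver j to rider i
    cost = np.array([[eta(d, r) for d in drivers] for r in riders])
    rider_idx, driver_idx = linear_sum_assignment(cost)
    return list(zip(rider_idx, driver_idx))

# The example above: greedy gives A its nearest driver X and strands B on Y;
# the batched solve pairs A → Y and B → X for a lower total.
riders = [(0.0, 0.0), (2.0, 0.0)]    # A, B
drivers = [(1.1, 0.0), (-1.5, 0.0)]  # X, Y
dist = lambda d, r: ((d[0] - r[0])**2 + (d[1] - r[1])**2) ** 0.5
print(batch_match(riders, drivers, dist))  # → [(0, 1), (1, 0)]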


Trip Service & State Machine

The Trip Service manages the complete lifecycle of every ride through a strict state machine. Each state transition is recorded as an event in Kafka, making the entire trip history auditable and replayable.

Trip State Machine

                    ┌──────────────┐
                    │  REQUESTED   │
                    └──────┬───────┘
                           │ driver accepts
                           ▼
                    ┌──────────────┐
          timeout   │   ACCEPTED   │──────────────────┐
         ────────── └──────┬───────┘                  │
                           │ driver starts driving    │
                           ▼                          │
                    ┌──────────────┐                  │
                    │   EN_ROUTE   │                  │
                    └──────┬───────┘                  │
                           │ driver arrives at pickup │
                           ▼                          │
                    ┌──────────────┐                  │
         rider      │   ARRIVED    │  5-min wait      │
         no-show    └──────┬───────┘  timeout ────────┤
         ──────────────────│                          │
                           │ rider gets in            │
                           ▼                          │
                    ┌──────────────┐                  │
                    │ IN_PROGRESS  │                  │
                    └──────┬───────┘                  │
                           │ arrive at destination    │
                           ▼                          ▼
                    ┌──────────────┐          ┌──────────────┐
                    │  COMPLETED   │          │  CANCELLED   │
                    └──────────────┘          └──────────────┘
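
In code, the diagram reduces to a guard table: a status update is applied only if it is an edge in the state machine. A minimal sketch (function names are illustrative):

# Allowed transitions from the state machine above
TRANSITIONS = {
    "REQUESTED":   {"ACCEPTED", "CANCELLED"},     # CANCELLED on timeout
    "ACCEPTED":    {"EN_ROUTE", "CANCELLED"},
    "EN_ROUTE":    {"ARRIVED", "CANCELLED"},
    "ARRIVED":     {"IN_PROGRESS", "CANCELLED"},  # CANCELLED on rider no-show
    "IN_PROGRESS": {"COMPLETED"},
    "COMPLETED":   set(),                         # terminal
    "CANCELLED":   set(),                         # terminal
}

def transition(trip, new_status: str) -> None:
    """Validate the state change, apply it, and emit the trip event."""
    if new_status not in TRANSITIONS[trip.status]:
        raise ValueError(f"Illegal transition {trip.status} -> {new_status}")
    trip.status = new_status
    emit_trip_event(trip, new_status)  # placeholder: publish to Kafka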

Trip Data Model

CREATE TABLE trips (
    id              UUID PRIMARY KEY,
    rider_id        UUID NOT NULL REFERENCES users(id),
    driver_id       UUID REFERENCES users(id),  -- NULL until matched
    status          VARCHAR(20) NOT NULL DEFAULT 'REQUESTED',

    -- Locations
    pickup_lat      DECIMAL(9,6) NOT NULL,
    pickup_lng      DECIMAL(9,6) NOT NULL,
    dropoff_lat     DECIMAL(9,6) NOT NULL,
    dropoff_lng     DECIMAL(9,6) NOT NULL,

    -- Timing
    requested_at    TIMESTAMPTZ NOT NULL DEFAULT NOW(),
    accepted_at     TIMESTAMPTZ,
    pickup_at       TIMESTAMPTZ,
    dropoff_at      TIMESTAMPTZ,
    cancelled_at    TIMESTAMPTZ,

    -- Pricing
    ride_type       VARCHAR(20) NOT NULL,
    base_fare       DECIMAL(10,2),
    surge_multiplier DECIMAL(4,2) DEFAULT 1.00,
    final_fare      DECIMAL(10,2),
    distance_km     DECIMAL(8,2),
    duration_min    DECIMAL(8,2),

    -- Payment
    payment_method_id VARCHAR(64),
    payment_intent_id VARCHAR(64),    -- Stripe hold
    payment_status    VARCHAR(20) DEFAULT 'PENDING',

    -- Ratings
    rider_rating    SMALLINT CHECK (rider_rating BETWEEN 1 AND 5),
    driver_rating   SMALLINT CHECK (driver_rating BETWEEN 1 AND 5),

    created_at      TIMESTAMPTZ DEFAULT NOW(),
    updated_at      TIMESTAMPTZ DEFAULT NOW()
);

CREATE INDEX idx_trips_rider ON trips(rider_id, created_at DESC);
CREATE INDEX idx_trips_driver ON trips(driver_id, created_at DESC);
CREATE INDEX idx_trips_status ON trips(status) WHERE status NOT IN ('COMPLETED', 'CANCELLED');

State Transition Events

Every state change emits a Kafka event consumed by downstream services:

// Kafka topic: "trip-events"
{
  "event_type": "TRIP_ACCEPTED",
  "trip_id": "trip_abc123",
  "driver_id": "drv_xyz789",
  "rider_id": "usr_abc123",
  "timestamp": "2026-04-15T10:23:45Z",
  "metadata": {
    "driver_location": { "lat": 37.7751, "lng": -122.4180 },
    "eta_to_pickup": 240  // seconds
  }
}

Consumers:
  → Notification Service: push "Driver en route, 4 min away" to rider
  → Analytics Service: log matching latency metric
  → Location Service: start tracking driver→pickup route
  → Payment Service: authorize hold on rider's payment method

ETA Computation

Accurate ETAs are critical — they determine matching decisions, rider expectations, and driver earnings. Uber computes ETAs using a layered approach combining graph algorithms with machine learning.

Layer 1: Road Graph + Dijkstra

The foundation is a weighted directed graph of the road network:

Road graph structure:
  Nodes: intersections (~50M globally)
  Edges: road segments between intersections
  Edge weight: travel time = distance / speed_limit

  Shortest path: Dijkstra's algorithm (or A* with haversine heuristic)

  Optimization: Contraction Hierarchies (CH)
    - Precompute shortcuts between important nodes
    - Reduces Dijkstra from O(E log V) to O(k log k) where k << V
    - Typical: 50M nodes → query in ~0.5ms with CH
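
Stripped of the CH machinery, the baseline is just shortest-path over time-weighted edges. A toy sketch with networkx (the three-segment graph is hypothetical):

import networkx as nx

G = nx.DiGraph()
# (from, to, length_km, speed_limit_kmh); edge weight = travel time in seconds
segments = [
    ("A", "B", 1.2, 50),
    ("B", "C", 0.8, 30),
    ("A", "C", 3.0, 100),  # longer but faster (highway)
]
for u, v, km, kmh in segments:
    G.add_edge(u, v, time=km / kmh * 3600)

path = nx.dijkstra_path(G, "A", "C", weight="time")
eta = nx.dijkstra_path_length(G, "A", "C", weight="time")
print(path, f"{eta:.0f}s")  # ['A', 'C'] 108s: the longer highway route wins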

Layer 2: Live Traffic Overlay

Static road speeds are wildly inaccurate during rush hour. Uber overlays live traffic data from its own drivers:

Every road segment has a real-time speed:

  segment_speeds = {
    "Market_St_1_2":    { speed: 15 kmh, updated: "10:23:42" },
    "Market_St_2_3":    { speed: 8 kmh,  updated: "10:23:41" },  // congestion!
    "Highway_101_45_46":{ speed: 95 kmh, updated: "10:23:43" },
  }

Source: GPS traces from all Uber drivers on the road
Update frequency: every 60 seconds per segment
Storage: Redis (fast reads for routing engine)

ETA with live traffic:
  For each edge in the shortest path:
    travel_time = segment_distance / live_speed[segment]
  total_eta = Σ travel_times + Σ turn_penalties + pickup_overhead
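
Applying the overlay is a per-edge lookup at query time. A sketch (the segment structure, Redis key scheme, and penalty constants are assumptions):

def eta_with_live_traffic(path, r, turn_penalty_s=8, pickup_overhead_s=60):
    """Sum per-segment travel times, preferring live speeds over limits."""
    total = 0.0
    for seg in path:                        # seg: {id, length_km, speed_limit_kmh}
        live = r.get(f"speed:{seg['id']}")  # live speed in km/h, if present
        speed_kmh = float(live) if live else seg["speed_limit_kmh"]
        total += seg["length_km"] / speed_kmh * 3600
    # Rough per-turn penalties plus fixed pickup overhead
    return total + turn_penalty_s * (len(path) - 1) + pickup_overhead_s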

Layer 3: ML-Based Correction

Even with live traffic, graph-based ETAs have systematic errors (traffic lights, school zones, construction). Uber uses a gradient-boosted decision tree model to correct:

Features:
  - Graph ETA (from Layer 1+2)
  - Time of day, day of week
  - Weather (rain → +15% ETA)
  - Historical trip times on this route
  - Current surge level (proxy for congestion)
  - Number of traffic signals on route
  - Special events (concerts, sports games)

Model: XGBoost / LightGBM
  Input:  feature vector → Output: corrected_eta
  Training data: millions of completed trips (actual duration vs. predicted)

Result:
  Graph ETA alone:     ±25% error
  Graph + live traffic: ±15% error
  Graph + traffic + ML: ±8% error  ← production accuracy
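
The correction layer is ordinary supervised regression over completed trips. A minimal training sketch with XGBoost (feature layout, hyperparameters, and data files are placeholders; the production feature set is far richer):

import numpy as np
import xgboost as xgb

# One row per completed trip:
# [graph_eta_s, hour_of_day, day_of_week, raining, surge, n_signals]
X = np.load("trip_features.npy")      # placeholder training data
y = np.load("actual_duration_s.npy")  # ground truth: observed trip time

model = xgb.XGBRegressor(
    n_estimators=500, max_depth=8, learning_rate=0.05,
    objective="reg:squarederror",
)
model.fit(X, y)

# Serving: the model corrects the graph ETA using learned residual patterns
features = np.array([[412.0, 17, 4, 1, 1.5, 9]])  # graph says 412s, rush hour
corrected_eta_s = model.predict(features)[0]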
ETA at scale: Uber's ETA service handles ~200K requests/second globally. Each request computes routes using Contraction Hierarchies (sub-millisecond), overlays live traffic (Redis lookup), and applies the ML correction model — all in under 50ms end-to-end.

Surge Pricing & H3 Hexagonal Grid

Surge pricing is Uber's mechanism to balance supply and demand in real time. When rider demand exceeds driver supply in an area, prices increase to incentivize more drivers to that area and to reduce demand from price-sensitive riders.

Why Hexagons? The H3 Grid System

Uber developed the H3 hierarchical hexagonal grid system to divide the entire planet into hexagonal cells. Why hexagons instead of squares or other shapes?

| Property | Square Grid | Hexagonal Grid (H3) |
|---|---|---|
| Neighbor distance | Unequal (√2 for diagonal) | Equal for all 6 neighbors |
| Edge effects | 8 neighbors, inconsistent adjacency | 6 neighbors, uniform adjacency |
| Coverage | Tessellates the plane perfectly | Tiles the plane; on the sphere H3 needs 12 pentagon cells |
| Movement modeling | Diagonal bias | Natural for movement in any direction |
| Visual representation | Familiar | Closer to circles (better for radii) |
H3 Resolution Levels (selected):
┌────────────┬───────────────────────┬────────────────────────────┐
│ Resolution │ Hex Edge Length       │ Hex Area                   │
├────────────┼───────────────────────┼────────────────────────────┤
│ 0          │ ~1107.71 km           │ ~4,357,449.42 km²          │
│ 4          │ ~22.61 km             │ ~1,770.35 km²              │
│ 7          │ ~1.22 km              │ ~5.16 km²                  │  ← city zones
│ 8          │ ~0.46 km              │ ~0.74 km²                  │  ← surge pricing
│ 9          │ ~0.17 km              │ ~0.105 km²                 │  ← fine-grained
│ 15         │ ~0.51 m               │ ~0.90 m²                   │
└────────────┴───────────────────────┴────────────────────────────┘

Uber uses Resolution 8 for surge pricing:
  Each cell ≈ 0.74 km² (about 6 city blocks)
  San Francisco ≈ 160 cells at resolution 8
  Perfect granularity for supply/demand zones

H3 Indexing

Every GPS coordinate maps to a unique H3 cell index at each resolution:

import h3

# Convert (lat, lng) to H3 cell at resolution 8
cell = h3.latlng_to_cell(37.7749, -122.4194, 8)
# → '8828308281fffff'

# Get center of a cell
center = h3.cell_to_latlng('8828308281fffff')
# → (37.77488, -122.41941)

# Get all 6 neighboring cells (ring distance 1)
neighbors = h3.grid_ring('8828308281fffff', 1)
# → ['8828308283fffff', '8828308285fffff', ...]

# Get all cells within k rings (disk)
disk = h3.grid_disk('8828308281fffff', 2)
# → 19 cells (center + ring 1 + ring 2)

# Hierarchical: get parent (coarser) and children (finer)
parent = h3.cell_to_parent('8828308281fffff', 7)
children = h3.cell_to_children('8828308281fffff', 9)  # 7 children

Supply/Demand Ratio Computation

Every 30-60 seconds, the pricing service computes the surge multiplier per H3 cell:

function computeSurge(city):
    cells = h3.getCellsForCity(city, resolution=8)

    for each cell in cells:
        // Count supply: available drivers in this cell
        supply = countDriversInCell(cell, status="available")

        // Count demand: ride requests in last 5 minutes
        demand = countRecentRequests(cell, window=5_minutes)

        // Compute raw ratio
        if supply == 0:
            ratio = MAX_SURGE  // no supply → max surge
        else:
            ratio = demand / supply

        // Map ratio to surge multiplier using a curve
        surge = surgeCurve(ratio)

        // Smooth with neighbors to avoid sharp boundaries
        neighborSurges = getNeighborSurges(cell)
        smoothedSurge = 0.6 * surge + 0.4 * avg(neighborSurges)

        // Clamp to allowed range
        smoothedSurge = clamp(smoothedSurge, 1.0, MAX_SURGE)

        redis.SET("surge:{cell}", smoothedSurge, EX=120)

    return surgeMap

// Surge curve: piecewise linear
function surgeCurve(ratio):
    if ratio <= 1.0:  return 1.0     // balanced or oversupply
    if ratio <= 1.5:  return 1.2     // slight demand increase
    if ratio <= 2.0:  return 1.5     // moderate surge
    if ratio <= 3.0:  return 2.0     // high demand
    if ratio <= 5.0:  return 3.0     // very high demand
    return 5.0                        // extreme (concerts, NYE, etc.)

Fare Calculation with Surge

function calculateFare(trip):
    baseRate = getRateCard(trip.city, trip.rideType)
    // e.g., { baseFare: $2.50, perKm: $1.20, perMin: $0.35, bookingFee: $2.00 }

    distanceFare = trip.distance_km * baseRate.perKm
    timeFare     = trip.duration_min * baseRate.perMin

    subtotal = baseRate.baseFare + distanceFare + timeFare

    // Apply surge
    surgeCell = h3.latlng_to_cell(trip.pickup_lat, trip.pickup_lng, 8)
    surge = redis.GET("surge:{surgeCell}") || 1.0

    surgedFare = subtotal * surge

    // Add fees
    totalFare = surgedFare + baseRate.bookingFee

    // Apply minimum fare
    totalFare = max(totalFare, baseRate.minimumFare)

    return {
        baseFare: subtotal,
        surgeMultiplier: surge,
        surgeAmount: surgedFare - subtotal,
        bookingFee: baseRate.bookingFee,
        totalFare: totalFare
    }


Payment Service

The payment flow must handle the unique challenge of ride-sharing: the final amount is unknown when the trip starts. Uber uses an authorize-then-capture pattern.

Payment Flow

Timeline:
─────────────────────────────────────────────────────────────────
│ Request Ride      │ Trip Starts     │ Trip Ends        │ Settle
│                   │                 │                  │
│ 1. Validate       │ 3. Trip in      │ 4. Calculate     │ 6. Pay
│    payment method │    progress     │    final fare    │    driver
│                   │                 │                  │
│ 2. Auth hold      │                 │ 5. Capture       │ 7. Reconcile
│    (estimated     │                 │    actual amount │
│     fare + 20%)   │                 │    from hold     │
─────────────────────────────────────────────────────────────────

Step-by-Step Payment Flow

// Step 1: When rider requests a ride
function onRideRequested(ride):
    // Validate payment method is active
    paymentMethod = paymentService.getMethod(ride.payment_method_id)
    if !paymentMethod.isValid():
        return error("Invalid payment method")

    // Estimate fare (route distance + current surge)
    estimate = pricingService.estimateFare(ride)
    // e.g., $24.50

    // Step 2: Place authorization hold (estimate + 20% buffer)
    holdAmount = estimate.totalFare * 1.20  // $29.40
    authorization = stripe.paymentIntents.create({
        amount: holdAmount,
        currency: "usd",
        payment_method: paymentMethod.stripeId,
        capture_method: "manual",  // ← don't charge yet!
        metadata: { ride_id: ride.id }
    })

    ride.payment_intent_id = authorization.id
    ride.save()

// Step 5: When trip completes
function onTripCompleted(trip):
    // Calculate actual fare
    actualFare = pricingService.calculateFare(trip)
    // e.g., $22.80 (shorter route than estimated)

    // Capture the actual amount from the held funds
    stripe.paymentIntents.capture(
        trip.payment_intent_id,
        { amount_to_capture: actualFare.totalFare }
    )
    // The remaining hold ($29.40 - $22.80 = $6.60) is released

    trip.final_fare = actualFare.totalFare
    trip.payment_status = "CAPTURED"
    trip.save()

// Step 6: Pay the driver (weekly settlement)
function weeklyDriverPayout(driver):
    trips = getCompletedTrips(driver, thisWeek)
    totalEarnings = sum(trip.final_fare * 0.75 for trip in trips)
    // Driver gets 75%, Uber keeps 25% commission

    uberCommission = sum(trip.final_fare * 0.25 for trip in trips)

    stripe.transfers.create({
        amount: totalEarnings,
        destination: driver.stripeConnectAccount,
        metadata: { week: currentWeek, trip_count: trips.length }
    })
Why authorize-then-capture? If a rider's card is declined after the trip, Uber eats the cost. By placing a hold upfront, the funds are guaranteed. The hold expires after 7 days if not captured, so Uber captures within minutes of trip completion.

Edge Cases

| Scenario | Handling |
|---|---|
| Fare exceeds authorization hold | Capture the hold amount, charge the remainder as a separate transaction |
| Rider cancels after match | Capture cancellation fee ($5-10) from hold, release the rest |
| Driver cancels | Release full hold, no charge to rider |
| Trip route deviates significantly | Flag for manual review; use route-based fare, not meter fare |
| Card expires between auth and capture | Authorization is still valid (holds survive expiry during the hold period) |
| Dispute / chargeback | Automated evidence submission with GPS trail + trip receipt |
| Split fare (Uber feature) | Multiple auth holds on multiple cards; capture proportionally |

Idempotency & Double-Charge Prevention

// Every payment operation uses an idempotency key
function capturePayment(trip):
    idempotencyKey = "capture:{trip.id}:{trip.final_fare}"

    result = stripe.paymentIntents.capture(
        trip.payment_intent_id,
        { amount_to_capture: trip.final_fare },
        { idempotencyKey: idempotencyKey }
    )

    // If this function is retried (network timeout, crash, etc.),
    // Stripe returns the same result without charging again.
    return result

DISCO: Dispatch Optimization Deep Dive

Uber's DISCO system is one of the most sophisticated real-time optimization engines in production. Let's look at the advanced features beyond basic matching.

Forward Dispatch

When a driver is completing a trip and is 2-3 minutes from drop-off, DISCO can pre-match them with a new rider near the drop-off location:

function forwardDispatch():
    // Find drivers completing trips in the next 3 minutes
    completingSoon = tripService.getTripsCompletingIn(minutes=3)

    for each trip in completingSoon:
        // Predicted drop-off location
        dropoff = trip.dropoff

        // Find pending ride requests near that drop-off
        nearbyRequests = matchingService.getPendingRequests(
            near = dropoff,
            radius = 2.0  // km
        )

        if nearbyRequests.length > 0:
            // Pre-assign (driver hasn't finished current trip yet)
            bestMatch = rankAndSelect(trip.driver, nearbyRequests)
            bestMatch.status = "PRE_MATCHED"
            bestMatch.assigned_driver = trip.driver_id
            bestMatch.eta = trip.timeToCompletion + etaFromDropoffToPickup

Benefits:
  ✓ Reduces rider wait time by 2-3 minutes
  ✓ Reduces driver idle time (higher utilization)
  ✓ Particularly effective in high-demand areas

Supply Positioning

DISCO doesn't just match — it also repositions idle drivers to where demand is predicted to appear:

function repositionDrivers(city):
    // Predict demand for next 15 minutes per H3 cell
    demandForecast = mlModel.predictDemand(city, horizon=15_min)

    // Current supply per cell
    currentSupply = locationService.getSupplyByCell(city)

    // Find undersupplied cells
    for each cell in city.cells:
        deficit = demandForecast[cell] - currentSupply[cell]
        if deficit > THRESHOLD:
            // Find nearest idle drivers in oversupplied neighbor cells
            idleDrivers = findIdleDriversNear(cell, radius=3_km)
            // Send gentle nudge: "Head to downtown for more requests"
            for driver in idleDrivers[:deficit]:
                notify(driver, {
                    type: "REPOSITION_SUGGESTION",
                    destination: cell.center,
                    reason: "High demand expected in this area",
                    incentive: "$3 bonus for next trip from this zone"
                })

Pool Matching (UberPool / UberX Share)

Pool rides add another dimension of complexity — matching multiple riders heading in the same direction into a single vehicle:

function poolMatch(newRequest):
    // Find active pool trips with available seats
    activePoolTrips = tripService.getActivePoolTrips(
        near = newRequest.pickup,
        radius = 1.5  // km
    )

    for each poolTrip in activePoolTrips:
        // Check if adding this rider makes the route efficient
        currentRoute = poolTrip.optimizedRoute
        newRoute = routeOptimizer.addStop(
            currentRoute,
            pickup = newRequest.pickup,
            dropoff = newRequest.dropoff
        )

        detour = newRoute.totalTime - currentRoute.totalTime
        maxDetour = 0.25 * currentRoute.totalTime  // max 25% longer

        if detour <= maxDetour && poolTrip.seats > 0:
            candidates.add({
                trip: poolTrip,
                detour: detour,
                savings: computeFareSavings(newRequest, poolTrip)
            })

    if candidates.length > 0:
        // Pick the pool trip with minimum detour
        best = candidates.sortBy(c => c.detour).first()
        addRiderToPool(best.trip, newRequest)
    else:
        // Start a new pool trip (match with a fresh driver)
        startNewPoolTrip(newRequest)

Real-Time Communication

Uber maintains persistent WebSocket connections with every active rider and driver app. This enables sub-second location updates, instant match notifications, and live trip tracking.

WebSocket Architecture

┌─────────────┐     ┌─────────────┐
│  Rider App  │     │ Driver App  │
│  (WSS)      │     │  (WSS)      │
└──────┬──────┘     └──────┬──────┘
       │                   │
       ▼                   ▼
┌─────────────────────────────────┐
│     WebSocket Gateway Cluster   │  ← 1000+ servers
│     (Sticky sessions via LB)    │     ~10K connections/server
└──────┬──────────────────┬───────┘
       │                  │
       ▼                  ▼
┌───────────────┐  ┌───────────────┐
│ Redis Pub/Sub │  │ Connection    │
│ (Channel per  │  │ Registry      │
│  trip_id)     │  │ (which server │
└───────────────┘  │  has which    │
                   │  user?)       │
                   └───────────────┘

Message flow for location update:
  1. Driver app sends GPS update via WebSocket
  2. Gateway publishes to Redis channel "trip:{trip_id}"
  3. Gateway serving the rider subscribes to that channel
  4. Rider receives driver location in < 500ms

Connection Management

// When driver goes online:
ws.onConnect(driver_id):
    // Register in connection registry
    redis.HSET("ws:connections", driver_id, this_server_id)
    // Subscribe driver to their personal channel
    subscribe("driver:{driver_id}")

// When ride is matched:
function onRideMatched(trip):
    // Create a trip channel
    tripChannel = "trip:{trip.id}"
    // Subscribe both rider and driver to it
    routeToUser(trip.rider_id, { subscribe: tripChannel })
    routeToUser(trip.driver_id, { subscribe: tripChannel })

// When driver location updates during trip:
function onDriverLocation(driver_id, lat, lng, heading):
    trip = getActiveTrip(driver_id)
    redis.PUBLISH("trip:{trip.id}", JSON.stringify({
        type: "DRIVER_LOCATION",
        lat: lat, lng: lng,
        heading: heading,
        eta_seconds: computeETA(lat, lng, trip)
    }))

Data Pipeline & Analytics

Uber generates enormous amounts of data that feed back into improving every aspect of the system.

Event Streaming Architecture

All services emit events to Kafka:

┌──────────┐  ┌──────────┐  ┌──────────┐
│ Location │  │  Trip    │  │ Payment  │
│ Service  │  │ Service  │  │ Service  │
└────┬─────┘  └────┬─────┘  └────┬─────┘
     │             │             │
     ▼             ▼             ▼
┌──────────────────────────────────────┐
│           Apache Kafka               │
│  Topics:                             │
│   driver-locations  (1.3M msg/sec)   │
│   trip-events       (500 msg/sec)    │
│   payment-events    (300 msg/sec)    │
│   surge-updates     (50 msg/sec)     │
└──────┬───────────┬───────────┬───────┘
       │           │           │
       ▼           ▼           ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ Real-Time│ │   Data   │ │    ML    │
│ Dashbrd  │ │   Lake   │ │ Training │
│ (Flink)  │ │ (S3/HDFS)│ │ Pipeline │
└──────────┘ └──────────┘ └──────────┘

Key Metrics Tracked

| Metric | Use Case | Update Frequency |
|---|---|---|
| ETA accuracy | Retrain ETA ML model | Real-time |
| Match accept rate | Tune matching radius & ranking | Per minute |
| Surge effectiveness | Calibrate surge curves | Per 5 minutes |
| Driver utilization | Reposition idle drivers | Per minute |
| Cancellation rate | Identify UX friction points | Hourly |
| Payment failure rate | Trigger retry or fallback | Real-time |
| P99 matching latency | SLA monitoring | Real-time |

Fault Tolerance & Reliability

What Happens When Things Fail?

| Failure | Impact | Mitigation |
|---|---|---|
| Redis shard down | Location queries fail for that city | Replica promotion in <5s; fallback to stale cache |
| Matching service crash | New rides can't be matched | Multiple replicas; Kafka buffers unmatched requests |
| Kafka broker down | Event delivery delayed | 3x replication; producer retries with idempotency |
| Payment provider outage | Can't authorize holds | Fallback to secondary provider; allow trip with post-charge |
| ETA service slow | Matching uses stale ETAs | Circuit breaker; fall back to distance-based matching |
| WebSocket gateway crash | Users lose real-time updates | Client auto-reconnects; server-sent events as fallback |
| Database primary down | Trip writes fail | Automatic failover to replica; write-ahead log preserves data |

Graceful Degradation Hierarchy

Level 0: Everything healthy
  → Full functionality

Level 1: ETA service degraded
  → Use cached ETAs (stale by up to 60s)
  → Match by distance instead of ETA
  → Show "ETA approximate" in rider app

Level 2: Surge pricing unavailable
  → Default to 1.0x (no surge)
  → Accept lower revenue rather than show errors

Level 3: Payment service down
  → Allow trips to proceed (collect payment later)
  → Flag trips as "payment_pending"
  → Process when service recovers

Level 4: Matching service overloaded
  → Shed load: only process UberX (highest volume type)
  → Queue premium rides (Black, SUV) for processing next
  → Increase matching radius to reduce computation

Level 5: Catastrophic multi-service failure
  → Static pricing, basic nearest-driver match
  → "We're experiencing issues" banner in app
  → Preserve trip safety features (911 button, trip sharing)
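
The fallbacks above are typically wired through circuit breakers. A minimal sketch for the ETA case from the failure table, falling back to distance-based matching (thresholds are illustrative):

import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; retry after `cooldown_s`."""
    def __init__(self, threshold=5, cooldown_s=30):
        self.threshold, self.cooldown_s = threshold, cooldown_s
        self.failures, self.opened_at = 0, None

    def call(self, primary, fallback, *args):
        if self.opened_at and time.time() - self.opened_at < self.cooldown_s:
            return fallback(*args)            # open: skip the failing service
        try:
            result = primary(*args)           # closed (or half-open): try it
            self.failures, self.opened_at = 0, None
            return result
        except Exception:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.time()  # trip the breaker
            return fallback(*args)

eta_breaker = CircuitBreaker()
# eta = eta_breaker.call(eta_service.compute, haversine_eta, driver_loc, pickup)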

Summary & Key Takeaways

| Component | Technology | Key Design Decision |
|---|---|---|
| Location Store | Redis Geospatial | In-memory for sub-ms reads; sharded by city |
| Matching | DISCO (Custom) | Batched Hungarian algorithm over greedy nearest |
| Trip Lifecycle | PostgreSQL + Kafka | State machine with event sourcing for auditability |
| ETA | Graph + ML | Contraction Hierarchies + live traffic + XGBoost correction |
| Surge Pricing | H3 Grid + Redis | Hexagonal cells for uniform neighbor distances |
| Payment | Stripe / Internal | Authorize-then-capture for unknown final amounts |
| Real-Time | WebSocket + Redis Pub/Sub | Per-trip channels for efficient fan-out |
| Data Pipeline | Kafka → Flink → S3 | All events streamed for real-time and batch analytics |
The fundamental insight of Uber's architecture: Ride-sharing is a real-time marketplace matching problem. Every design decision — from H3 hexagons to DISCO's batched matching to forward dispatch — serves one goal: minimize the time between a rider pressing "Request" and a car arriving at their location, while keeping the system economically sustainable for drivers.