Design: Google Maps
Requirements & Scale
Google Maps is one of the most complex consumer products ever built. It stitches together satellite imagery, street-level photography, real-time traffic telemetry, routing algorithms, and geospatial search into a seamless experience used by over 1 billion monthly active users. Let's dissect its architecture.
Functional Requirements
- Map Rendering — Display the world as an interactive, zoomable, pannable map with 22 zoom levels (0–21)
- Routing & Navigation — Compute optimal routes between two or more points across driving, walking, cycling, and transit modes
- ETA Prediction — Estimate arrival time factoring in live traffic, historical patterns, and road conditions
- Live Traffic — Aggregate anonymous location data from billions of mobile devices to compute real-time road segment speeds
- Geocoding — Convert addresses to geographic coordinates (and vice versa)
- POI Search — Search for restaurants, gas stations, landmarks, and other points of interest near a location
- Offline Maps — Download map regions for use without network connectivity
Non-Functional Requirements
- Latency — Map tiles must load in <100 ms; routes computed in <1 s
- Availability — 99.99% uptime (navigation is safety-critical)
- Scale — 1B MAU, ~5 billion map tile requests/day, ~1 billion route requests/day
- Storage — Petabytes of imagery, vector data, and road graphs
- Freshness — Traffic data refreshed every 30–60 seconds; road closures reflected within minutes
Back-of-the-Envelope Estimates
| Metric | Estimate |
|---|---|
| Monthly active users | 1 billion |
| Peak tile requests/sec | ~200K QPS |
| Route requests/sec | ~30K QPS |
| Road graph nodes (global) | ~500 million intersections |
| Road graph edges (global) | ~1 billion road segments |
| Total tile storage (all zoom levels) | ~100 PB (raster); ~5 PB (vector) |
| Location pings (traffic data) | ~100 million devices reporting every 1–3 seconds |
Map Tile Serving
The core visual experience of Google Maps is powered by map tiles — small, square image or vector data chunks that are assembled client-side into a seamless map. Understanding the tile system is key to understanding how Maps scales.
The Tile Pyramid
The entire world map is organized as a tile pyramid (also called a quadtree tiling scheme). At zoom level 0, the entire world fits in a single 256×256 pixel tile. Each subsequent zoom level subdivides every tile into 4 children:
| Zoom Level | Total Tiles | Formula | Ground Resolution (equator) |
|---|---|---|---|
| 0 | 1 | 4⁰ | 156,543 m/px |
| 1 | 4 | 4¹ | 78,271 m/px |
| 2 | 16 | 4² | 39,136 m/px |
| 5 | 1,024 | 4⁵ | 4,891 m/px |
| 10 | 1,048,576 | 4¹⁰ | 153 m/px |
| 15 | ~1.07 billion | 4¹⁵ | 4.77 m/px |
| 18 | ~68.7 billion | 4¹⁸ | 0.60 m/px |
| 21 | ~4.4 trillion | 4²¹ | 0.075 m/px |
The total number of tiles at zoom level z is 4^z, which equals a grid of 2^z × 2^z tiles — 2^z columns and 2^z rows at each level. The cumulative total across all levels 0 through z is:
Total tiles = Σ(i=0 to z) 4^i = (4^(z+1) - 1) / 3
For z=21: (4^22 - 1) / 3 ≈ 5.87 trillion tiles
In practice, not all tiles are pre-rendered. The ocean covers ~71% of Earth's surface, and vast uninhabited areas (deserts, forests) don't need high-zoom tiles. Google selectively renders tiles only where there's meaningful content, reducing the actual count by 80–90% at high zoom levels.
Tile Coordinates: z/x/y
Every tile is uniquely identified by three integers: (z, x, y) where:
- z — zoom level (0–21)
- x — column index (0 to 2^z − 1, left to right)
- y — row index (0 to 2^z − 1, top to bottom)
To convert geographic coordinates (latitude, longitude) to a tile index at zoom level z:
x_tile = floor((lon + 180) / 360 × 2^z)
y_tile = floor((1 - ln(tan(lat_rad) + sec(lat_rad)) / π) / 2 × 2^z)
where lat_rad = lat × π / 180
This uses the Web Mercator projection (EPSG:3857), which projects the spherical Earth onto a flat square. The Mercator projection preserves angles (making it ideal for navigation) but distorts areas near the poles — which is why Greenland looks as large as Africa on most web maps.
Each tile is fetched through a simple, stable URL: GET /tiles/{z}/{x}/{y}.png (raster) or GET /tiles/{z}/{x}/{y}.pbf (vector). This makes tiles trivially cacheable at every layer of the CDN.
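A minimal sketch of this conversion in Python, assuming 256-pixel Web Mercator tiles and the formulas above (the function name is illustrative):

```python
import math

def latlon_to_tile(lat: float, lon: float, zoom: int) -> tuple[int, int]:
    """Convert WGS84 lat/lon in degrees to Web Mercator tile indices (x, y) at a zoom level."""
    n = 2 ** zoom                                    # 2^z tiles per axis at this zoom level
    lat_rad = math.radians(lat)
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.log(math.tan(lat_rad) + 1.0 / math.cos(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp to the valid range to guard against |lat| > 85.0511° or lon = 180°
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)

# Tile URL for the tile containing the Googleplex at zoom 15
x, y = latlon_to_tile(37.4220, -122.0841, 15)
print(f"/tiles/15/{x}/{y}.pbf")
```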
[Interactive demo: the tile pyramid, showing how each zoom level subdivides the world into 4× more tiles.]
Vector Tiles vs Raster Tiles
Traditional map tiles are raster tiles — pre-rendered PNG/JPEG images at each zoom level. Modern systems increasingly use vector tiles, which transmit geometric data (points, lines, polygons) and let the client render them:
| Property | Raster Tiles | Vector Tiles |
|---|---|---|
| Data format | PNG/JPEG images (256×256 px) | Protobuf-encoded geometry (MVT) |
| File size (typical) | 20–80 KB per tile | 5–30 KB per tile |
| Zoom behavior | Pixelates between discrete levels | Smooth continuous zoom |
| Client rendering | Trivial (just display image) | Requires GPU / WebGL |
| Styling | Baked in at render time | Dynamic (dark mode, custom styles) |
| Rotation/tilt | Labels rotate awkwardly | Labels stay upright |
| Total storage (all zoom levels) | ~100 PB | ~5 PB |
| Offline download size | Larger (~150 MB per city) | Smaller (~50 MB per city) |
Google Maps switched from raster to vector tiles on the web in 2017, following the mobile apps which had used vector tiles since 2013. The switch reduced bandwidth usage by ~70% while enabling smooth zooming, 3D tilting, and dynamic styling (dark mode, color-coded terrain).
CDN Architecture for Tiles
Tile serving is essentially a static content delivery problem — each tile is an immutable blob identified by (z, x, y). This makes it perfect for a multi-tier caching strategy:
- Client cache — Browser/app caches recently viewed tiles in memory and on disk. Tile URLs include version hashes, enabling aggressive Cache-Control: immutable headers.
- Edge CDN (PoPs) — Google's global edge network (200+ PoPs) caches hot tiles. Popular tiles (major cities at common zoom levels) stay warm. The CDN intercepts ~95% of tile requests.
- Regional tile servers — For cache misses, regional servers either serve from their local SSD cache or fetch from the tile storage backend.
- Tile rendering pipeline — Cold tiles are rendered on-demand from the canonical map data (vector geometry, labels, styling rules). The rendered tile is then pushed into all cache layers.
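A toy sketch of that lookup order; the tier objects, their names, and the render_tile fallback are illustrative stand-ins, not Google's actual services:

```python
from typing import Callable

def get_tile(z: int, x: int, y: int,
             tiers: list[dict],                       # ordered warm → cold: client, edge CDN, regional
             render_tile: Callable[[int, int, int], bytes]) -> bytes:
    """Walk the cache tiers for an immutable tile; on a full miss, render it and back-fill every tier."""
    key = (z, x, y)
    for i, tier in enumerate(tiers):
        tile = tier.get(key)
        if tile is not None:
            for warmer in tiers[:i]:                  # back-fill the warmer tiers that missed
                warmer[key] = tile
            return tile
    tile = render_tile(z, x, y)                       # cold path: render from canonical vector data
    for tier in tiers:                                # push the fresh tile into all cache layers
        tier[key] = tile
    return tile

# Usage: client, edge, and regional caches modeled as plain dicts
caches = [{}, {}, {}]
data = get_tile(10, 163, 395, caches, lambda z, x, y: b"<rendered tile bytes>")
```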
Road Network as a Graph
For routing and navigation, the road network is modeled as a weighted directed graph:
- Nodes (vertices) — Intersections, road endpoints, and decision points. The global graph has ~500 million nodes.
- Edges — Road segments connecting two adjacent nodes. ~1 billion edges globally. Each edge is directed (to model one-way streets) and carries multiple weights.
Edge Weights
Each edge stores multiple weight attributes used by different routing modes:
| Weight | Description | Used For |
|---|---|---|
| Distance (meters) | Physical length of road segment | Walking, cycling |
| Free-flow time | Traversal time with zero traffic (distance / speed limit) | Baseline driving |
| Historical time | Average traversal time by day-of-week and time-of-day | ETA prediction |
| Live traffic time | Current traversal time from real-time telemetry | Navigation |
| Road class | Highway, arterial, residential, etc. | Hierarchy-based routing |
| Turn costs | Penalty for left turns, U-turns, traffic signals | Realistic routing |
Graph Storage
The global road graph is too large to serve comfortably from a single machine once per-edge attributes and indexes are included (the topology alone is ~500M nodes × ~40 bytes + ~1B edges × ~60 bytes ≈ 80 GB raw). In practice, the graph is:
- Partitioned by geographic region — Using spatial partitioning (e.g., rectangular bounding boxes or S2 cell hierarchies). Each partition fits in a single server's memory.
- Compressed using adjacency arrays — Instead of adjacency lists with pointers, nodes and edges are stored in contiguous arrays sorted by node ID. This enables cache-friendly traversal and reduces memory overhead by 50%+ (see the sketch after this list).
- Replicated for availability — Each partition is replicated across multiple servers and data centers.
- Augmented with preprocessed shortcuts — Contraction Hierarchies add ~2× more edges as "shortcut" edges (see routing section).
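A small illustration of the adjacency-array idea; the field names, widths, and sample numbers are made up for the sketch:

```python
import numpy as np

# CSR-style layout for one graph partition: the outgoing edges of node i occupy the
# contiguous slice edge_offsets[i] : edge_offsets[i+1], so traversal is cache-friendly
# and there are no per-node pointer structures to chase.
edge_offsets = np.array([0, 2, 3, 5], dtype=np.int64)             # length = num_nodes + 1
edge_targets = np.array([1, 2, 2, 0, 1], dtype=np.int32)          # head node of each edge
edge_time_ms = np.array([40, 95, 30, 120, 55], dtype=np.int32)    # free-flow traversal time

def neighbors(node: int):
    """Yield (target_node, free_flow_time_ms) for every outgoing edge of `node`."""
    start, end = edge_offsets[node], edge_offsets[node + 1]
    for e in range(start, end):
        yield int(edge_targets[e]), int(edge_time_ms[e])

print(list(neighbors(0)))   # [(1, 40), (2, 95)]
```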
Routing Algorithms
Finding the shortest path between two points on a graph with 500M+ nodes is one of the most studied problems in computer science. Let's walk through the evolution from naive to production-grade algorithms.
Dijkstra's Algorithm
The classic shortest-path algorithm. Starting from the source node, it explores outward in order of increasing distance, guaranteeing the shortest path to every node visited:
function dijkstra(graph, source, target):
dist[v] = INFINITY for every node v; dist[source] = 0
pq = MinPriorityQueue()
pq.insert(source, 0)
while pq is not empty:
u = pq.extractMin()
if u == target:
return dist[target] // found shortest path
for each edge (u, v, weight) in graph.neighbors(u):
alt = dist[u] + weight
if alt < dist[v]:
dist[v] = alt
prev[v] = u
pq.insertOrDecreaseKey(v, alt) // insert v if not yet queued, else lower its key
return INFINITY // no path exists
Time complexity: O((V + E) log V) with a binary heap. For the global graph, V = 500M, E = 1B. Dijkstra explores nodes in expanding circles from the source — it doesn't "know" where the target is. For a cross-country route, this could explore tens of millions of nodes, taking 10+ seconds on modern hardware. Far too slow for production.
A* Algorithm
A* improves Dijkstra by adding a heuristic function h(v) that estimates the remaining distance from node v to the target. The priority queue is ordered by f(v) = g(v) + h(v) where g(v) is the known distance from source to v:
function aStar(graph, source, target):
gScore[v] = INFINITY for every node v; gScore[source] = 0
fScore[source] = h(source, target)
openSet = MinPriorityQueue()
openSet.insert(source, fScore[source])
while openSet is not empty:
u = openSet.extractMin()
if u == target:
return reconstructPath(prev, target)
for each edge (u, v, weight) in graph.neighbors(u):
tentative_g = gScore[u] + weight
if tentative_g < gScore[v]:
prev[v] = u
gScore[v] = tentative_g
fScore[v] = tentative_g + h(v, target)
openSet.insertOrUpdate(v, fScore[v])
return NO_PATH
For road networks, the haversine distance (great-circle distance) divided by the maximum road speed makes an excellent admissible heuristic. A* typically explores 10–50× fewer nodes than Dijkstra for the same query, because it focuses the search toward the target instead of expanding uniformly.
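A sketch of that heuristic; the mean Earth radius and the 130 km/h speed cap are assumed values, and the cap must be at least the fastest attainable road speed for the heuristic to stay admissible:

```python
import math

EARTH_RADIUS_M = 6_371_000          # mean Earth radius (assumption for this sketch)
MAX_SPEED_MPS = 130 / 3.6            # assumed fastest legal driving speed, in m/s

def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in meters."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def h(node: tuple[float, float], target: tuple[float, float]) -> float:
    """Admissible A* heuristic: straight-line distance / fastest possible speed = optimistic travel time."""
    return haversine_m(*node, *target) / MAX_SPEED_MPS
```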
However, even A* is too slow for global-scale routing. A cross-continent query (New York → Los Angeles) still explores millions of nodes. We need something fundamentally faster.
Contraction Hierarchies (CH)
Contraction Hierarchies is the breakthrough that makes real-time routing on planetary-scale graphs possible. The core insight: preprocess the graph once; answer queries in microseconds.
Preprocessing Phase
The algorithm assigns an "importance" level to each node and contracts nodes from least to most important:
- Order nodes by importance — Highway intersections are more important than residential street endpoints. The ordering heuristic considers node degree, edge difference, spatial distribution, and road class.
- Contract each node — Remove node v from the graph. For every pair of neighbors (u, w) of v, check if the shortest path u→w goes through v. If so, add a shortcut edge u→w with weight = dist(u,v) + dist(v,w).
- Repeat — Continue contracting nodes in order of increasing importance. Each contraction may add shortcut edges that "jump over" contracted nodes.
// Preprocessing: contract nodes from least to most important
for each node v in order of increasing importance:
for each incoming neighbor u of v:
for each outgoing neighbor w of v:
if shortestPath(u, w) goes through v:
add shortcut edge u → w
weight(u→w) = weight(u→v) + weight(v→w)
remove v from active graph
After preprocessing, the graph is augmented with shortcut edges. A shortcut edge represents a shortest path between its endpoints, potentially spanning many original edges (e.g., a 200 km highway segment compressed into one edge).
Query Phase
The query algorithm runs a bidirectional search: one forward search from the source (only following edges to higher-importance nodes) and one backward search from the target (only following reverse edges to higher-importance nodes):
function chQuery(source, target):
// Forward search: only go UP the hierarchy
forwardDist[source] = 0
forwardPQ.insert(source, 0)
// Backward search: only go UP the hierarchy
backwardDist[target] = 0
backwardPQ.insert(target, 0)
while forwardPQ or backwardPQ not empty:
// Expand forward
if forwardPQ not empty:
u = forwardPQ.extractMin()
for each edge (u, v) where importance(v) > importance(u):
relax(u, v)
// Expand backward
if backwardPQ not empty:
u = backwardPQ.extractMin()
for each reverse edge (u, v) where importance(v) > importance(u):
relax(u, v)
// Check for meeting point: any node v reached by both searches
bestPath = min(bestPath, forwardDist[v] + backwardDist[v]) for each such v
return bestPath
Because both searches only go "upward" in the hierarchy, they meet at some high-importance node (typically on a major highway). The search space is tiny — typically only 500–2000 nodes even for cross-continent queries.
Performance Comparison
| Algorithm | Preprocessing | Query Time | Nodes Explored (US) | Space Overhead |
|---|---|---|---|---|
| Dijkstra | None | ~3–10 s | ~10–50 million | None |
| A* | None | ~0.5–3 s | ~0.5–5 million | None |
| Contraction Hierarchies | ~30 min (US) | ~0.1–1 ms | ~500–2000 | ~2× edges |
| Highway Hierarchies | ~15 min (US) | ~0.5–2 ms | ~1000–5000 | ~1.5× edges |
Highway Hierarchies
An alternative to Contraction Hierarchies developed by Sanders & Schultes. The idea: recursively identify "highway" edges that appear on shortest paths between distant nodes, then restrict search to highway edges early in the query. It provides similar speedups but with different preprocessing trade-offs. In practice, CH has won broader adoption due to simpler implementation and better worst-case performance.
Handling Live Traffic in Routing
Contraction Hierarchies assume static edge weights, but traffic changes constantly. Google's approach combines:
- Customizable CH (CCH) — Preprocess the graph topology once, but allow edge weights to be swapped in real-time. The hierarchy structure (which shortcuts exist) stays fixed, but the shortcut weights are updated every 30–60 seconds based on current traffic conditions.
- Time-dependent routing — Edge weights are functions of time, not constants. The travel time on a highway at 8 AM is different from midnight. The routing algorithm evaluates edge weights at the estimated arrival time at each node (a sketch follows the list).
- Rerouting — During active navigation, the server continuously re-evaluates the route with updated traffic data. If a faster alternative is found, the driver is prompted to reroute.
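A toy sketch of the time-dependent edge evaluation referenced above; the 15-minute profile buckets match the ETA section below, while the 70/30 live-vs-historical blend is an assumption:

```python
from datetime import datetime, timedelta

def edge_travel_time_s(length_m: float,
                       profile_mps: list[float],       # 672 speeds: 7 days × 96 fifteen-minute slots
                       live_speed_mps: float | None,
                       enter_at: datetime) -> float:
    """Traversal time for one edge, evaluated at the time the driver is expected to enter it."""
    slot = enter_at.weekday() * 96 + enter_at.hour * 4 + enter_at.minute // 15
    speed = profile_mps[slot]
    if live_speed_mps is not None:
        speed = 0.7 * live_speed_mps + 0.3 * speed      # assumed blend of live telemetry and history
    return length_m / speed

def route_eta(edges, depart: datetime) -> datetime:
    """edges = [(length_m, profile_mps, live_speed_mps)]; accumulate time-dependent edge times."""
    t = depart
    for length_m, profile, live in edges:
        t += timedelta(seconds=edge_travel_time_s(length_m, profile, live, t))
    return t
```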
[Interactive demo: route computation, contrasting the A* search frontier with the much smaller Contraction Hierarchies search space.]
ETA Prediction
Estimating arrival time is far more than summing up edge traversal times. Google Maps' ETA system combines three data sources through a machine learning model:
Data Sources
- Historical traffic data — Years of aggregated driving data, segmented by road, day-of-week, time-of-day, and even weather conditions. Stored as speed profiles: for each road segment, a 2D lookup table mapping (day, time) → expected speed. At 15-minute granularity, that's 7 days × 96 intervals = 672 speed values per road segment; across ~1B segments at ~4 bytes per value, that's ~2.7 TB of historical data.
- Real-time traffic — Current speeds on each road segment, computed from live device telemetry (see next section). This captures events not in the historical data: accidents, construction, special events.
- ML prediction model — A deep learning model (likely a Graph Neural Network or spatio-temporal transformer) that takes historical patterns + current traffic state and predicts traffic conditions for the next 30–60 minutes along the route. This is critical because conditions at the time the driver reaches a future road segment matter, not conditions right now.
ETA Computation Pipeline
1. Compute route segments: S1 → S2 → ... → Sn
2. For each segment Si:
a. t_arrive_i = cumulative travel time from start
b. predicted_speed_i = ML_model(
segment=Si,
current_time + t_arrive_i,
current_traffic_state,
historical_profile[Si][day][time],
weather, events
)
c. segment_time_i = length(Si) / predicted_speed_i
3. ETA = Σ segment_time_i + Σ turn_delays + Σ signal_delays
4. Confidence interval: ETA ± uncertainty
(wider for longer trips and unpredictable routes)
ETA Accuracy
Google reports that ETA predictions are within ±10% of actual travel time for most trips. The model is continuously retrained on fresh data. Key challenges include:
- Incidents — Accidents are unpredictable but have massive impact. The model learns to add uncertainty buffers for accident-prone segments.
- Weather — Rain reduces average speeds by 10–25%, snow by 25–50%. Weather forecasts are integrated as model features.
- Events — Stadium traffic, concert dispersal, holiday patterns. Some are predictable (Super Bowl) and pre-modeled; others require real-time detection.
- Construction zones — Speed reductions and lane closures. Detected through repeated slowdowns at the same location and confirmed through map reports.
Live Traffic System
The real-time traffic overlay is perhaps the most remarkable data engineering achievement in Google Maps. Here's how anonymous location data from hundreds of millions of devices becomes the green/yellow/red traffic overlay you see on the map.
Data Collection
When users opt in to location sharing (via Google Maps or Android location services), their devices periodically report:
- GPS coordinates — Latitude/longitude, accurate to ~3–5 meters
- Timestamp — When the reading was taken
- Speed — From GPS Doppler or accelerometer
- Heading — Direction of travel
- Transport mode — Driving, walking, cycling (inferred by activity recognition)
With ~100 million devices reporting every 1–3 seconds while in active navigation, this generates roughly 50–100 million location events per second — one of the highest-throughput data pipelines in existence.
Map Matching
Raw GPS points don't fall neatly on roads. Map matching is the process of snapping GPS traces to the road network:
- Candidate generation — For each GPS point, find all road segments within a radius (typically 50 m). Use a spatial index (R-tree or S2 cells) for fast lookups.
- Hidden Markov Model (HMM) — Model the sequence of GPS points as observations from a hidden state sequence of road segments. The emission probability depends on GPS-to-road distance; the transition probability depends on route feasibility between consecutive road segments.
- Viterbi algorithm — Decode the most likely sequence of road segments the device traveled on. This handles tunnels (GPS dropout), urban canyons (multipath errors), and parallel roads elegantly.
Speed Aggregation Pipeline
GPS Traces (100M events/sec)
↓
Map Matching (HMM + Viterbi)
↓
Per-segment speed observations
↓
Stream Processing (Flink / Dataflow)
- Group by road segment ID
- Window: 30–60 seconds
- Aggregate: median speed, percentile speeds
- Outlier removal: discard speeds > 200 km/h or < 1 km/h
↓
Segment Speed Table
{segment_id → {speed, sample_count, confidence, timestamp}}
↓
Traffic Color Assignment:
- Green: speed ≥ 80% of free-flow speed
- Yellow: speed 40–80% of free-flow speed
- Red: speed 20–40% of free-flow speed
- Dark Red: speed < 20% of free-flow speed
↓
Tile Overlay Rendering → CDN
Privacy Protections
Traffic data collection raises significant privacy concerns. Google implements several safeguards:
- Anonymization — Location data is stripped of user identifiers before entering the speed aggregation pipeline. Individual traces cannot be reconstructed.
- Differential privacy — Noise is added to aggregate statistics to prevent deanonymization. A road segment's speed is only published if ≥5 distinct devices contributed observations.
- k-anonymity — Locations are generalized to prevent identification of individuals in sparse areas.
- Data retention — Raw location events are deleted within hours; only aggregate speeds are retained.
Geocoding & Reverse Geocoding
Geocoding converts a human-readable address ("1600 Amphitheatre Parkway, Mountain View, CA") into geographic coordinates (37.4220, -122.0841). Reverse geocoding does the opposite: given coordinates, return the nearest address.
Geocoding Pipeline
- Address parsing — Decompose the input string into structured components: house number, street, city, state, country. NLP-based parsers handle variations like "St" vs "Street", abbreviations, and international formats.
- Candidate generation — Query the address database for matching street names in the likely city/region. Use fuzzy matching (edit distance, phonetic matching) to handle typos.
- Interpolation — If the exact house number isn't in the database, interpolate its position between known addresses on the same block (illustrated after this list). Most countries don't have per-building coordinate databases, so interpolation is essential.
- Scoring & ranking — Rank candidates by match quality, considering exact match bonuses, partial match penalties, and geographic relevance (prefer results near the user).
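A minimal sketch of the interpolation step; the anchor house numbers and coordinates are invented for the example:

```python
def interpolate_house_number(target_no: int,
                             low_no: int, low_latlng: tuple[float, float],
                             high_no: int, high_latlng: tuple[float, float]) -> tuple[float, float]:
    """Place an unknown house number linearly between two known address points on the same block."""
    if not (low_no <= target_no <= high_no):
        raise ValueError("house number outside the known range for this block")
    frac = (target_no - low_no) / max(high_no - low_no, 1)
    lat = low_latlng[0] + frac * (high_latlng[0] - low_latlng[0])
    lng = low_latlng[1] + frac * (high_latlng[1] - low_latlng[1])
    return lat, lng

# 1640 lands 40% of the way between the known anchors at 1600 and 1700
print(interpolate_house_number(1640, 1600, (37.4219, -122.0850), 1700, (37.4224, -122.0790)))
```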
Reverse Geocoding
Given coordinates (lat, lng), find the nearest address:
- Use a spatial index (R-tree / S2 cells) to find nearby road segments
- Project the point onto the nearest road segment to find the exact position along the road
- Interpolate the address number based on the position between known address points
- Compose the full address from the road name, interpolated number, and administrative boundary (city, state, zip)
POI Search
When a user searches "coffee shops near me," Maps must find relevant points of interest within a geographic region, ranked by relevance, distance, and quality.
Geospatial Index
The POI database (hundreds of millions of businesses, landmarks, and addresses) is indexed using geospatial data structures:
- S2 cell index — Each POI is assigned to the finest-level S2 cell containing its coordinates. Nearby POI queries become range scans over contiguous cell ID ranges.
- Geohash prefix matching — Alternative approach where locations are encoded as hierarchical strings. "9q8yy" represents a ~4.9 km × 4.9 km area in San Francisco. Prefix matching finds all POIs in a region (see the encoding sketch below).
- R-tree — A balanced tree of bounding boxes used for range and nearest-neighbor queries. Each internal node stores the bounding box enclosing all children. Queries traverse only branches whose bounding boxes intersect the search area.
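A self-contained sketch of geohash encoding, which reproduces the "9q8yy" example from the list above:

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"          # geohash alphabet (a, i, l, o are omitted)

def geohash_encode(lat: float, lon: float, length: int = 5) -> str:
    """Encode a point by alternately bisecting longitude and latitude, 5 bits per character."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars, bits, bit_count, use_lon = [], 0, 0, True
    while len(chars) < length:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = int(lon >= mid)
            lon_lo, lon_hi = (mid, lon_hi) if bit else (lon_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = int(lat >= mid)
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        bits = (bits << 1) | bit
        bit_count += 1
        use_lon = not use_lon
        if bit_count == 5:                            # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

print(geohash_encode(37.7749, -122.4194))             # San Francisco → "9q8yy"
```

Each additional character narrows the cell by a factor of 32, so prefix matching over stored geohashes acts as a coarse but effective range query.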
Ranking
POI search results are ranked by a combination of:
- Text relevance — How well the POI name/category matches the query (TF-IDF, BM25, or neural embeddings)
- Distance — Closer is better, with exponential decay
- Prominence — Based on review count, average rating, business size, and web presence
- Personalization — User's past searches, saved places, and preferences
- Recency — Newer, actively updated businesses rank higher
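A toy scoring function that blends the factors above; every weight and the 1 km distance-decay constant are made-up values for illustration:

```python
import math

def poi_score(text_relevance: float,        # query/POI match, normalized to [0, 1] (e.g. BM25 or embeddings)
              distance_m: float,
              rating: float,                # 1–5 stars
              review_count: int,
              personalization: float = 0.0) -> float:
    """Combine relevance, proximity, prominence, and personalization into one ranking score."""
    proximity = math.exp(-distance_m / 1_000)                     # exponential decay with distance
    prominence = (rating / 5.0) * min(math.log1p(review_count) / math.log1p(10_000), 1.0)
    return 0.45 * text_relevance + 0.30 * proximity + 0.20 * prominence + 0.05 * personalization

# A close, well-reviewed cafe vs. a farther, more famous one
print(poi_score(0.9, 250, 4.6, 800), poi_score(0.9, 2_300, 4.8, 12_000))
```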
Offline Maps
Users can download map regions for offline use — critical for areas with poor connectivity and for international travelers avoiding roaming charges.
What Gets Downloaded
- Vector tiles — All tiles for the selected region at all zoom levels. For a city-sized region, this is typically 50–200 MB of vector tile data.
- Road graph — The routing graph (nodes, edges, weights) for the region, including precomputed Contraction Hierarchies shortcuts. ~20–50 MB for a metro area.
- POI data — Business names, addresses, categories, and review summaries. ~10–30 MB.
- Geocoding index — Address-to-coordinate mapping for the region. ~5–15 MB.
Download Packaging
The total download for a metro area (e.g., San Francisco) is typically 100–300 MB. Google uses aggressive compression:
- Delta encoding — Only download tiles that have changed since the last sync
- Protobuf encoding — Vector tiles use Mapbox Vector Tile (MVT) format with protocol buffer serialization
- Graph compression — The routing graph uses variable-length integer encoding and removes attributes unnecessary for offline routing
Offline maps auto-expire after ~30 days and prompt the user to re-download, ensuring data freshness. When a network connection becomes available, the app seamlessly switches back to online data, which includes live traffic and real-time search results.
System Architecture Overview
Putting it all together, the major components and their interactions:
┌──────────────────────────────────────────────────────────────┐
│ CLIENT (Mobile/Web) │
│ ┌─────────┐ ┌─────────┐ ┌──────────┐ ┌──────────────────┐ │
│ │ Map │ │ Route │ │ Search │ │ Location │ │
│ │ Renderer│ │ Display │ │ UI │ │ Reporter (opt-in)│ │
│ └────┬────┘ └────┬────┘ └────┬─────┘ └────────┬─────────┘ │
└───────┼──────────┼─────────┼──────────────────┼──────────────┘
│ │ │ │
────┼──────────┼─────────┼──────────────────┼──── CDN Edge
│ │ │ │
┌───────▼──────────▼─────────▼──────────────────▼──────────────┐
│ BACKEND SERVICES │
│ │
│ ┌────────────┐ ┌────────────┐ ┌──────────────────────┐ │
│ │ Tile │ │ Routing │ │ Traffic Aggregation │ │
│ │ Server │ │ Engine │ │ Pipeline │ │
│ │ (CDN- │ │ (CH + │ │ (Map Matching → │ │
│ │ backed) │ │ CCH + │ │ Speed Calc → │ │
│ │ │ │ live │ │ Stream Agg) │ │
│ │ │ │ traffic) │ │ │ │
│ └─────┬─────┘ └─────┬─────┘ └──────────┬───────────┘ │
│ │ │ │ │
│ ┌─────▼─────┐ ┌─────▼─────┐ ┌──────────▼───────────┐ │
│ │ Tile │ │ Road │ │ Speed Table │ │
│ │ Storage │ │ Graph │ │ (per-segment live) │ │
│ │ (Bigtable)│ │ Store │ │ + Historical Profiles│ │
│ └───────────┘ └───────────┘ └──────────────────────┘ │
│ │
│ ┌────────────┐ ┌────────────┐ ┌──────────────────────┐ │
│ │ Geocoding │ │ POI Search │ │ ETA ML Model │ │
│ │ Service │ │ (Geo Index │ │ (GNN / Transformer) │ │
│ │ │ │ + Ranking)│ │ │ │
│ └────────────┘ └────────────┘ └──────────────────────┘ │
└───────────────────────────────────────────────────────────────┘
Deep Dive: Tile Math
Let's derive the exact formulas behind the tile coordinate system, which underpins all of map rendering.
Web Mercator Projection
The Mercator projection maps latitude φ and longitude λ to Cartesian coordinates (x, y) on a unit square [0, 1] × [0, 1]:
x = (λ + 180°) / 360°
y = (1 - ln(tan(φ) + sec(φ)) / π) / 2        (φ in radians)
Inverse:
λ = x × 360° - 180°
φ = arctan(sinh(π(1 - 2y))) × 180° / π
The Mercator projection clips latitude at ±85.0511° (where y would reach ±∞) to keep the map square. This excludes the polar regions, which is acceptable for a navigation-focused map.
Pixel ↔ Tile Conversion
At zoom level z, the map is 256 × 2^z pixels wide and tall. The pixel coordinates of a geographic point are:
pixelX = floor(x × 256 × 2^z)
pixelY = floor(y × 256 × 2^z)
tileX = floor(pixelX / 256) = floor(x × 2^z)
tileY = floor(pixelY / 256) = floor(y × 2^z)
// Position within tile (0-255):
inTileX = pixelX mod 256
inTileY = pixelY mod 256
Ground Resolution
The ground resolution (meters per pixel) varies by latitude due to the Mercator projection:
resolution(lat, z) = cos(lat × π/180) × 2π × 6378137 / (256 × 2^z)
At the equator (lat=0):
z=0: 156,543 m/px (entire world in 256 px)
z=10: 152.9 m/px (city-level)
z=15: 4.78 m/px (street-level)
z=21: 0.075 m/px (individual features)
At London (lat=51.5):
z=15: 2.98 m/px (37% less than equator)
Tile Budget Per View
A typical phone screen (412×892 pixels) at zoom level 15 needs approximately:
horizontal tiles = ceil(412 / 256) + 1 = 3
vertical tiles = ceil(892 / 256) + 1 = 5
total tiles = 3 × 5 = 15 tiles per view
With 256×256 px tiles, a pan/zoom session loads:
~15 tiles for initial view
~5-10 tiles per pan gesture
~15 new tiles per zoom level change
~50-100 tiles for a typical browsing session
Deep Dive: Traffic Data Processing
Processing 50–100 million GPS events per second into accurate per-road-segment speeds is a formidable stream processing challenge.
Map Matching: Hidden Markov Model
The HMM-based map matching algorithm works as follows:
States: Road segments within 50m of each GPS point
Observations: GPS points (lat, lng, timestamp)
Emission probability:
P(obs | state) = (1/√(2πσ²)) × exp(-d² / (2σ²))
where d = distance from GPS point to road segment
and σ ≈ 10m (GPS noise standard deviation)
Transition probability:
P(state_t | state_{t-1}) ∝ exp(-|d_route - d_great_circle| / β)
where d_route = shortest path distance between road segments
and d_great_circle = great-circle distance between GPS points
and β ≈ 5m (tuning parameter)
The Viterbi algorithm finds the most likely road segment sequence
in O(T × K²) time, where T = number of GPS points, K = avg candidates
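A compact Viterbi sketch over those emission and transition scores; candidate generation and the two probability functions are passed in as stubs (assumptions), since they come from the spatial index and the routing engine:

```python
def viterbi_map_match(candidates, emission_logp, transition_logp):
    """
    candidates[t]            : candidate road-segment IDs for GPS point t
    emission_logp(t, s)      : log P(gps_t | segment s), from the Gaussian GPS-noise model above
    transition_logp(t, p, s) : log P(s | p), from the route-vs-great-circle distance model above
    Returns the most likely segment sequence; runtime is O(T × K²) as stated above.
    """
    # best[s] = (log-probability of the best path ending in segment s, that path)
    best = {s: (emission_logp(0, s), [s]) for s in candidates[0]}
    for t in range(1, len(candidates)):
        nxt = {}
        for s in candidates[t]:
            # Pick the best predecessor for s under the transition model
            p = max(best, key=lambda q: best[q][0] + transition_logp(t, q, s))
            score = best[p][0] + transition_logp(t, p, s) + emission_logp(t, s)
            nxt[s] = (score, best[p][1] + [s])
        best = nxt
    return max(best.values(), key=lambda v: v[0])[1]
```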
Speed Computation per Segment
For each road segment in a 60-second window:
1. Collect all device traversals:
[{device_hash, entry_time, exit_time, distance}]
2. Compute per-device speeds:
speed_i = distance / (exit_time - entry_time)
3. Outlier removal:
- Remove if speed > speed_limit × 1.5 (GPS error)
- Remove if speed < 2 km/h and segment length > 100m (stopped/parking)
4. Aggregate:
- If sample_count ≥ 5: segment_speed = median(speeds)
- If sample_count < 5: blend with historical:
segment_speed = α × median(speeds) + (1-α) × historical_speed
where α = min(sample_count / 5, 1.0)
5. Confidence score:
confidence = min(sample_count / 20, 1.0)
(used to determine whether to show traffic color)
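A direct translation of steps 2–4 into Python for one segment and one 60-second window (the empty-window fallback to the historical speed is an added assumption):

```python
from statistics import median

def segment_speed_kmh(traversals,                 # [(entry_time_s, exit_time_s, distance_m), ...]
                      speed_limit_kmh: float,
                      segment_length_m: float,
                      historical_kmh: float) -> tuple[float, float]:
    """Return (speed_estimate_kmh, confidence) for one road segment in one aggregation window."""
    speeds = []
    for t_in, t_out, dist_m in traversals:
        kmh = dist_m / (t_out - t_in) * 3.6
        if kmh > speed_limit_kmh * 1.5:                          # GPS error
            continue
        if kmh < 2 and segment_length_m > 100:                   # stopped / parking
            continue
        speeds.append(kmh)
    if not speeds:
        return historical_kmh, 0.0                               # no usable samples: fall back to history
    alpha = min(len(speeds) / 5, 1.0)                            # pure median once ≥5 samples
    speed = alpha * median(speeds) + (1 - alpha) * historical_kmh
    return speed, min(len(speeds) / 20, 1.0)
```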
Incident Detection
The traffic pipeline also detects incidents (accidents, road closures) through anomaly detection:
- Sudden speed drops — If median speed drops below 20% of historical average for ≥2 consecutive windows → flag as incident
- Zero throughput — If a normally busy segment has zero observations for ≥5 minutes → potential road closure
- Corroboration — User reports ("accident ahead" button), Waze reports, and emergency service feeds are cross-referenced with telemetry-detected anomalies
- Propagation modeling — When an incident is confirmed, the system predicts downstream congestion propagation and pre-emptively adjusts ETAs for routes passing through the area
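A minimal sketch of the first two detection rules above; the window bookkeeping and return labels are illustrative:

```python
def detect_incident(window_speeds_kmh: list[float | None],   # oldest → newest; None = no observations
                    historical_kmh: float,
                    normally_busy: bool) -> str | None:
    """Flag a possible incident from recent per-window speeds on one road segment."""
    recent = window_speeds_kmh[-2:]
    # Sudden speed drop: below 20% of the historical average for ≥2 consecutive windows
    if len(recent) == 2 and all(s is not None and s < 0.2 * historical_kmh for s in recent):
        return "possible_incident"
    # Zero throughput: a normally busy segment with no observations for ≥5 one-minute windows
    if normally_busy and len(window_speeds_kmh) >= 5 and all(s is None for s in window_speeds_kmh[-5:]):
        return "possible_closure"
    return None
```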
Summary & Key Takeaways
| Component | Key Technology | Scale Challenge |
|---|---|---|
| Map Tiles | Tile pyramid (z/x/y), vector tiles, CDN | ~200K QPS, petabytes of storage |
| Routing | Contraction Hierarchies + live traffic | 500M nodes, sub-millisecond queries |
| ETA | GNN/Transformer ML + historical + live | ±10% accuracy, 30s freshness |
| Live Traffic | HMM map matching + stream aggregation | 100M events/sec, 1B road segments |
| Geocoding | NLP parsing + address interpolation | Global address coverage |
| POI Search | S2 geospatial index + ML ranking | Hundreds of millions of POIs |
| Offline Maps | Compressed tiles + graph + POI bundle | 100–300 MB per metro area |