CDN & Edge Computing
What Is a CDN?
A Content Delivery Network (CDN) is a geographically distributed system of servers that delivers web content to users from the server nearest to them. The core insight is simple: the speed of light is fast, but the internet is slow. A packet travelling from New York to Singapore (~15,000 km) needs at least ~75 ms one way at the speed of light in fiber (~200,000 km/s) — and real-world latency is typically 3-4x worse due to routing hops, congestion, and protocol overhead.
Consider a user in Tokyo requesting an image from an origin server in Virginia:
- Without CDN: Request travels ~11,000 km each way. Round-trip time ≈ 180-250 ms. With TCP handshake (1 RTT), TLS (1-2 RTTs), and HTTP request/response (1 RTT), total time before first byte: 540-1000 ms.
- With CDN: Request goes to a Tokyo PoP (~20 km away). Round-trip time ≈ 5-15 ms. Total time to first byte: 15-45 ms — a 10-60x improvement.
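The arithmetic behind these numbers is worth internalizing. A rough model in JavaScript (assumptions: ~200,000 km/s for light in fiber, a 3x inflation for routing and congestion, and 3 round trips before the first byte on a fresh connection):

```javascript
// Rough time-to-first-byte model: propagation delay times the number of
// round trips needed before the first response byte arrives.
const FIBER_KM_PER_MS = 200; // light in fiber ≈ 200,000 km/s

function rttMs(distanceKm, overheadFactor = 3) {
  // One-way propagation * 2, inflated for routing hops and congestion
  return (distanceKm / FIBER_KM_PER_MS) * 2 * overheadFactor;
}

function timeToFirstByteMs(distanceKm, roundTrips = 3) {
  // TCP handshake (1 RTT) + TLS (1 RTT on TLS 1.3, 2 on TLS 1.2) + HTTP (1 RTT)
  return rttMs(distanceKm) * roundTrips;
}

console.log(timeToFirstByteMs(11000)); // Tokyo → Virginia origin
console.log(timeToFirstByteMs(20));    // Tokyo → Tokyo PoP
```

Note that for very short distances the model under-predicts real latency: at PoP range, last-mile access networks and server processing dominate, which is why real edge TTFB is 15-45 ms rather than ~2 ms.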
CDNs solve three fundamental problems:
- Latency: Serve content from geographically close servers, reducing round-trip time from hundreds of milliseconds to single digits.
- Bandwidth: Distribute load across thousands of edge servers instead of concentrating it on a single origin. A viral video that gets 10M views in an hour would crush most origin servers — but CDN edge servers absorb the traffic.
- Reliability: If one PoP goes down, DNS automatically routes users to the next closest PoP. The origin can go offline temporarily and cached content continues to be served.
Modern CDNs serve a staggering volume of traffic. Cloudflare handles over 57 million HTTP requests per second on average. Akamai delivers 15-30% of all web traffic globally. Netflix's Open Connect CDN serves over 100 Tbps during peak hours.
How CDNs Work
The CDN request lifecycle involves several key components working together:
Core Architecture
- Origin Server: Your actual web server (e.g., an EC2 instance, a Kubernetes cluster) that holds the authoritative copy of all content. The CDN treats this as the "source of truth."
- PoP (Point of Presence): A physical data center location containing CDN edge servers. Major CDNs operate anywhere from a few hundred to several thousand PoPs worldwide. Each PoP typically contains 10-1000+ servers depending on the provider and traffic demands.
- Edge Server: An individual server within a PoP that caches and serves content. These are optimized for high throughput (often 10-40 Gbps per server) with large SSD caches (1-8 TB typical).
Request Flow: Step by Step
- DNS Resolution: User requests cdn.example.com. The authoritative DNS for this domain returns a CNAME pointing to the CDN's domain (e.g., d1234.cloudfront.net). The CDN's DNS then uses Anycast routing or geo-DNS to resolve to the IP of the nearest PoP.
- TCP/TLS Connection: The user's browser establishes a TCP connection to the edge server. Since the PoP is nearby, the TCP handshake completes in ~5 ms instead of ~150 ms. TLS session resumption further reduces overhead on repeat connections.
- Cache Lookup: The edge server hashes the request URL + relevant headers (e.g., Accept-Encoding, Accept-Language) and checks its local cache. This is typically an in-memory lookup (microseconds) backed by SSD storage.
- Cache HIT: Content is found and fresh → served immediately. Response time: 1-10 ms.
- Cache MISS: Content not cached or stale → edge server fetches from origin (or a mid-tier cache). The content is then stored locally for future requests.
Anycast Routing
Most modern CDNs use BGP Anycast — the same IP address is announced from multiple PoPs worldwide. When a user connects to that IP, BGP routing naturally directs the connection to the nearest PoP (in terms of network hops, not necessarily geographic distance). This is elegant because:
- No special DNS infrastructure needed for routing decisions
- Automatic failover — if a PoP goes offline, BGP withdraws the route and traffic shifts to the next closest PoP within seconds
- DDoS resilience — attack traffic is naturally distributed across all PoPs
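The "network hops, not geographic distance" point can be made concrete with a toy selection function (the cities, distances, and hop counts below are invented for illustration):

```javascript
// Anycast-style selection sketch: the network converges on the PoP with the
// shortest BGP path (fewest AS hops), not the smallest distance in km.
function nearestPop(pops) {
  return pops.reduce((best, p) => (p.asHops < best.asHops ? p : best));
}

const pops = [
  { city: 'Singapore', km: 300, asHops: 4 },  // geographically closest
  { city: 'Tokyo', km: 5300, asHops: 2 },     // but Tokyo is better peered
];

console.log(nearestPop(pops).city);
```

Here BGP would deliver the user's packets to Tokyo even though Singapore is physically closer, because the Tokyo route crosses fewer networks.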
# Simplified DNS resolution flow
$ dig cdn.example.com
;; ANSWER SECTION:
cdn.example.com. 300 IN CNAME d1234.cloudfront.net.
d1234.cloudfront.net. 60 IN A 13.224.64.12 ← Anycast IP
d1234.cloudfront.net. 60 IN A 13.224.64.44 ← Anycast IP
# The returned IPs resolve to the nearest PoP
# A user in London gets routed to London PoP
# A user in Tokyo gets routed to Tokyo PoP
# Same IPs, different physical destinations
Push vs Pull CDN
CDNs use two fundamental strategies for populating their edge caches:
Pull CDN (Lazy / On-Demand)
The edge server fetches content from the origin only when a user requests it. This is by far the most common approach.
- First request: Cache MISS → fetch from origin → cache locally → serve
- Subsequent requests: Cache HIT → serve directly from edge
- Expiration: Content is evicted based on TTL (Time To Live) or LRU (Least Recently Used) when cache is full
# Pull CDN: Origin server sends cache headers
HTTP/1.1 200 OK
Cache-Control: public, max-age=86400, s-maxage=604800
Content-Type: image/jpeg
ETag: "a1b2c3d4"
Content-Length: 245832
# max-age=86400 → Browser caches for 1 day
# s-maxage=604800 → CDN caches for 7 days
# ETag → Enables conditional revalidation
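The pull model plus these headers can be sketched end to end: on a MISS the edge fetches from origin and caches for the s-maxage window, falling back to max-age when s-maxage is absent. A simplified sketch (fetchOrigin is a stand-in for the real origin fetch):

```javascript
// Pull-through sketch: fetch from origin on MISS, honoring s-maxage for the
// CDN-side TTL (falls back to max-age when s-maxage is absent).
const edgeCache = new Map();

function cdnTtlSeconds(cacheControl) {
  const s = cacheControl.match(/s-maxage=(\d+)/);
  if (s) return Number(s[1]);
  const m = cacheControl.match(/max-age=(\d+)/);
  return m ? Number(m[1]) : 0;
}

async function serveFromEdge(url, fetchOrigin, nowMs = Date.now()) {
  const hit = edgeCache.get(url);
  if (hit && nowMs < hit.expiresAtMs) return { source: 'edge', body: hit.body };
  const { body, cacheControl } = await fetchOrigin(url); // cache MISS
  const ttl = cdnTtlSeconds(cacheControl);
  if (ttl > 0) edgeCache.set(url, { body, expiresAtMs: nowMs + ttl * 1000 });
  return { source: 'origin', body };
}
```

The first request pays the origin round trip; every subsequent request within the TTL is served from the edge map.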
Push CDN (Proactive / Pre-populated)
You explicitly upload content to the CDN before any user requests it. Think of it as a distributed file system you populate ahead of time.
- Upload: You push files to the CDN via API or CLI
- Every request: Served from edge (no origin fetch needed)
- Updates: You explicitly push new versions or invalidate old ones
# Push CDN: Upload via AWS CLI to S3 (CloudFront origin)
aws s3 sync ./dist s3://my-cdn-bucket/ \
  --cache-control "public, max-age=31536000, immutable" \
  --delete
# Or via Cloudflare API
curl -X PUT "https://api.cloudflare.com/client/v4/accounts/{id}/storage/kv/namespaces/{ns}/values/{key}" \
  -H "Authorization: Bearer {token}" \
  --data-binary @./asset.js
Comparison
| Aspect | Pull CDN | Push CDN |
|---|---|---|
| Cache population | Automatic on first request | Manual upload required |
| First request latency | Higher (cache MISS → origin fetch) | Low (already cached) |
| Storage efficiency | Only caches requested content | All content stored (may waste space) |
| Origin dependency | Origin must be available for MISSes | Origin can be offline after push |
| Best for | Large catalogs, dynamic content, APIs | Static assets, known file sets, SPAs |
| Update complexity | Automatic via TTL/revalidation | Requires explicit push + invalidation |
| Real-world examples | Cloudflare, Fastly, most CDNs by default | S3 + CloudFront, Netlify, Vercel |
In practice, most systems combine both: push-style deployment for static assets with versioned filenames (e.g., app.a1b2c3.js). Dynamic content (API responses, HTML pages) uses pull CDN with shorter TTLs.
Cache Hierarchy
Modern CDNs use a multi-tier cache hierarchy to maximize cache hit rates while minimizing origin load:
Three-Tier Architecture
- L1 — Edge Cache (PoP): Closest to the user. Each of the provider's hundreds of PoPs has its own cache. Small, fast, high turnover. Cache hit ratio: 80-90% for popular content.
- L2 — Shield / Mid-Tier Cache: A designated "super PoP" that aggregates requests from multiple edge PoPs in a region. If an edge cache misses, it queries the shield before going to origin. This dramatically reduces origin load: instead of 200 PoPs hitting origin, only 5-10 shield nodes do. Cache hit ratio at shield: 95-99%.
- L3 — Origin Server: The authoritative source. Only receives requests that miss both edge and shield caches — typically 1-5% of total traffic.
# Request flow through cache hierarchy:
User → Edge PoP (Tokyo)
        ├── HIT  → Serve (5 ms)
        └── MISS → Shield (Singapore)
                    ├── HIT  → Serve + cache at Tokyo edge (25 ms)
                    └── MISS → Origin (Virginia)
                                └── Serve + cache at Shield + cache at Edge (180 ms)
# Without shield: 200 PoPs each send cache-miss requests to origin
# With shield: Only 5-10 shield nodes send requests to origin
# Origin load reduction: ~95%
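The fall-through-and-populate behavior can be sketched as a chain of tiers (the tier names and API here are invented for illustration, not any vendor's interface):

```javascript
// Tiered lookup sketch: each miss falls through to the next tier, and the
// response populates this tier's cache on the way back down.
function makeTier(name, upstream) {
  const cache = new Map();
  return {
    name,
    async get(key) {
      if (cache.has(key)) return { tier: name, body: cache.get(key) };
      const result = await upstream.get(key);   // fall through on MISS
      cache.set(key, result.body);              // populate this tier
      return result;
    },
  };
}

const origin = { name: 'origin', get: async (key) => ({ tier: 'origin', body: `content:${key}` }) };
const shield = makeTier('shield', origin);
const edgeTokyo = makeTier('edge-tokyo', shield);
const edgeOsaka = makeTier('edge-osaka', shield);
```

A second edge PoP behind the same shield gets a shield HIT even if it has never seen the object, which is exactly how the shield absorbs origin load.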
Cache-Control Headers
HTTP cache headers are the primary mechanism for controlling CDN caching behavior:
| Header / Directive | Meaning | Example |
|---|---|---|
| max-age=N | Browser can cache for N seconds | max-age=3600 (1 hour) |
| s-maxage=N | CDN/shared cache TTL (overrides max-age) | s-maxage=86400 (1 day) |
| no-cache | Must revalidate with origin before serving | Forces ETag/Last-Modified check |
| no-store | Never cache (sensitive data) | Banking, personal data |
| stale-while-revalidate=N | Serve stale content while fetching a fresh copy in the background | stale-while-revalidate=60 |
| stale-if-error=N | Serve stale if origin is down | stale-if-error=86400 |
| immutable | Content will never change (skip revalidation) | Versioned assets: app.a3f2.js |
| private | Only the browser may cache (not the CDN) | User-specific responses |
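How a CDN combines these directives when deciding how to serve a cached object can be sketched as a single decision function (simplified; sMaxage, swr, and sie stand in for s-maxage, stale-while-revalidate, and stale-if-error, all in seconds):

```javascript
// Decide how to serve a cached response given its age and the directives
// from the table above (a sketch; real CDNs handle many more edge cases).
function serveDecision(ageSeconds, { sMaxage = 0, swr = 0, sie = 0 }, originUp = true) {
  if (ageSeconds <= sMaxage) return 'fresh';                  // within TTL
  if (ageSeconds <= sMaxage + swr) return 'stale+revalidate'; // serve stale, refresh in background
  if (!originUp && ageSeconds <= sMaxage + sie) return 'stale (origin down)';
  return 'revalidate';                                        // blocking origin fetch
}
```

For example, with s-maxage=60 and stale-while-revalidate=30, a 80-second-old object is still served instantly while a background refresh runs.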
Conditional Requests & Revalidation
# Step 1: First request — origin includes ETag
HTTP/1.1 200 OK
Cache-Control: public, max-age=300, s-maxage=3600
ETag: "v42-a1b2c3"
Last-Modified: Wed, 15 Apr 2026 10:30:00 GMT
Content-Type: application/json
{"products": [...]}
# Step 2: After TTL expires, CDN revalidates
GET /api/products HTTP/1.1
If-None-Match: "v42-a1b2c3"
If-Modified-Since: Wed, 15 Apr 2026 10:30:00 GMT
# Step 3a: Content unchanged → 304 (no body transferred!)
HTTP/1.1 304 Not Modified
ETag: "v42-a1b2c3"
Cache-Control: public, max-age=300, s-maxage=3600
# Step 3b: Content changed → 200 with new body
HTTP/1.1 200 OK
ETag: "v43-d4e5f6"
Cache-Control: public, max-age=300, s-maxage=3600
{"products": [... updated data ...]}
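The 304 decision above amounts to a string compare on the validator. A minimal origin-side sketch (real servers also honor If-Modified-Since and weak validators):

```javascript
// Minimal origin-side revalidation: answer 304 when the client's cached
// validator still matches, otherwise send the full new body.
function revalidate(currentEtag, currentBody, requestHeaders) {
  if (requestHeaders['if-none-match'] === currentEtag) {
    return { status: 304, headers: { etag: currentEtag } }; // no body transferred
  }
  return { status: 200, headers: { etag: currentEtag }, body: currentBody };
}
```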
The Vary Header
The Vary header tells the CDN to cache separate versions based on request header values:
# Cache different versions based on encoding and language
Vary: Accept-Encoding, Accept-Language
# This means the CDN stores separate cached copies for:
# Accept-Encoding: gzip + Accept-Language: en-US → Version A
# Accept-Encoding: br + Accept-Language: en-US → Version B
# Accept-Encoding: gzip + Accept-Language: ja → Version C
# WARNING: Vary on too many headers = cache fragmentation = low hit ratio
# NEVER: Vary: Cookie (each user gets their own cached copy = no caching)
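The fragmentation warning can be made concrete: the cache key is the URL plus one component per Vary'd header, so every distinct combination of values becomes a separate cache entry. A toy key builder:

```javascript
// Cache key sketch: the number of entries per URL is the product of the
// distinct values observed for each header listed in Vary.
function cacheKeyForVary(url, varyHeaders, requestHeaders) {
  const parts = varyHeaders.map(h => `${h}:${requestHeaders[h.toLowerCase()] ?? ''}`);
  return `${url}|${parts.join('|')}`;
}
```

With Vary: Cookie, every user's session cookie produces its own key, which is why that pattern effectively disables caching.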
CDN Invalidation
Phil Karlton famously said, "There are only two hard things in Computer Science: cache invalidation and naming things." CDN invalidation at global scale is especially hard because you're invalidating caches across hundreds of PoPs worldwide simultaneously.
Invalidation Strategies
1. TTL-Based Expiration (Passive)
Content naturally expires after its TTL. Simple but imprecise — content remains stale until the TTL expires.
# Short TTL for frequently changing content
Cache-Control: public, s-maxage=60 # CDN caches for 1 minute
# Long TTL for stable content
Cache-Control: public, s-maxage=604800 # CDN caches for 7 days
2. Purge (Hard Invalidation)
Immediately remove content from all edge caches. The CDN sends purge commands to every PoP.
# Cloudflare: Purge specific URLs
curl -X POST "https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache" \
  -H "Authorization: Bearer {token}" \
  -d '{"files": ["https://example.com/api/products", "https://example.com/images/hero.jpg"]}'
# Fastly: Instant purge via surrogate key
curl -X POST "https://api.fastly.com/service/{id}/purge/{surrogate-key}" \
  -H "Fastly-Key: {token}"
# AWS CloudFront: Create invalidation
aws cloudfront create-invalidation \
  --distribution-id E1234 \
  --paths "/api/products" "/images/*"
3. Soft Purge (Mark as Stale)
Instead of removing content, mark it as stale. The next request triggers a background revalidation while serving the stale content. This avoids the "thundering herd" problem where a hard purge causes all users to simultaneously hit origin.
# Fastly soft purge — marks content stale, serves stale while revalidating
curl -X POST "https://api.fastly.com/service/{id}/purge/{key}" \
  -H "Fastly-Key: {token}" \
  -H "Fastly-Soft-Purge: 1"
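Soft purge plus request coalescing is what prevents the herd: stale requests are served immediately while at most one background refresh runs. A sketch of the idea (not any vendor's actual implementation):

```javascript
// Serve-stale-while-revalidating with request coalescing: many concurrent
// requests for a soft-purged key trigger at most one origin fetch.
const entries = new Map();   // key → { body, stale }
const inflight = new Map();  // key → Promise for the single background fetch

function softPurge(key) {
  const e = entries.get(key);
  if (e) e.stale = true; // mark stale instead of deleting
}

async function get(key, fetchOrigin) {
  const e = entries.get(key);
  if (e && !e.stale) return e.body;
  if (e && e.stale) {
    if (!inflight.has(key)) { // first request kicks off the refresh
      inflight.set(key, fetchOrigin(key).then(body => {
        entries.set(key, { body, stale: false });
        inflight.delete(key);
      }));
    }
    return e.body; // everyone is served stale immediately
  }
  const body = await fetchOrigin(key); // cold miss: must block on origin
  entries.set(key, { body, stale: false });
  return body;
}
```

A hard purge would delete the entry instead, forcing every concurrent request down the blocking cold-miss path at once.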
4. Versioned URLs (Best Practice)
The most reliable invalidation strategy: never invalidate — just change the URL. Content-hash filenames ensure that new deployments automatically bypass old cached versions.
# Build tool generates content-hashed filenames:
app.a1b2c3d4.js ← Old version (cached for 1 year)
app.e5f6g7h8.js ← New version (different URL = no cache conflict)
# index.html (short TTL) references the new filename:
<script src="/assets/app.e5f6g7h8.js"></script>
# Cache headers for versioned assets:
Cache-Control: public, max-age=31536000, immutable
# ↑ Cache for 1 year; content will never change at this URL
A practical invalidation playbook: (1) Use versioned URLs for static assets, so invalidation is never needed. (2) Use stale-while-revalidate for API responses — users get instant responses while background refresh happens. (3) Use soft purge for urgent updates — avoids thundering herd. (4) Use hard purge as last resort — for security-critical content removal.
Propagation Delay
Even "instant" purges take time to propagate across all PoPs:
- Cloudflare: < 30 ms globally (extremely fast due to single-tier architecture)
- Fastly: ~150 ms globally (Instant Purge via streaming miss)
- AWS CloudFront: Up to 60 seconds (invalidation propagates asynchronously to 400+ PoPs)
- Akamai: 5-7 seconds typical (Fast Purge API)
Edge Computing
Edge computing extends CDNs beyond static content caching — it runs custom application logic at the edge, closer to users. Instead of every request travelling to a central origin, computation happens at the nearest PoP.
Edge Compute Platforms
| Platform | Runtime | Cold Start | Max Execution | Locations |
|---|---|---|---|---|
| Cloudflare Workers | V8 Isolates (JS/WASM) | 0 ms (no cold start) | 30s CPU / 30s wall | 310+ |
| Lambda@Edge | Node.js, Python | 50-200 ms | 5-30s | 220+ (CloudFront PoPs) |
| CloudFront Functions | JavaScript (lightweight) | <1 ms | 1ms CPU | 220+ |
| Deno Deploy | V8 Isolates (JS/TS/WASM) | 0 ms | 50ms CPU / 2min wall | 35+ |
| Vercel Edge Functions | V8 Isolates (JS/TS) | 0 ms | 30s | ~30+ |
Use Cases
A/B Testing at the Edge
// Cloudflare Worker: A/B testing without origin involvement
export default {
  async fetch(request) {
    // Sticky assignment: reuse an existing variant cookie if present
    const cookie = request.headers.get('Cookie') || '';
    let variant = cookie.match(/ab_variant=(\w+)/)?.[1];
    if (!variant) {
      variant = Math.random() < 0.5 ? 'control' : 'experiment';
    }
    // Rewrite the path to the variant's version of the page
    const url = new URL(request.url);
    url.pathname = `/${variant}${url.pathname}`;
    // Preserve the original method, headers, and body
    const response = await fetch(new Request(url.toString(), request));
    // Responses from fetch() have immutable headers — clone before mutating
    const newResponse = new Response(response.body, response);
    newResponse.headers.set('Set-Cookie',
      `ab_variant=${variant}; Path=/; Max-Age=86400`);
    return newResponse;
  }
};
Geolocation-Based Routing
// Route users to region-specific content
export default {
  async fetch(request) {
    // CF-IPCountry is set by Cloudflare from the client IP
    const country = request.headers.get('CF-IPCountry');
    const lang = { 'JP': 'ja', 'DE': 'de', 'FR': 'fr', 'BR': 'pt' }[country] || 'en';
    // Redirect to the localized version
    if (lang !== 'en') {
      return Response.redirect(`https://example.com/${lang}${new URL(request.url).pathname}`, 302);
    }
    return fetch(request);
  }
};
Authentication & Token Validation
// Validate JWT at the edge — reject unauthorized requests before they hit origin
// (verifyJWT and JWT_SECRET are placeholders for your JWT library and secret binding)
export default {
  async fetch(request) {
    const token = request.headers.get('Authorization')?.replace('Bearer ', '');
    if (!token) return new Response('Unauthorized', { status: 401 });
    try {
      const payload = await verifyJWT(token, JWT_SECRET);
      // Forward the verified identity to origin as a trusted header
      const newHeaders = new Headers(request.headers);
      newHeaders.set('X-User-Id', payload.sub);
      return fetch(new Request(request, { headers: newHeaders }));
    } catch (e) {
      return new Response('Invalid token', { status: 403 });
    }
  }
};
Image Optimization
// Dynamically resize and convert images at the edge
// (the cf.image options are Cloudflare Image Resizing parameters)
export default {
  async fetch(request) {
    const url = new URL(request.url);
    const width = parseInt(url.searchParams.get('w') || '800');
    const format = request.headers.get('Accept')?.includes('webp') ? 'webp' : 'jpeg';
    return fetch(request, {
      cf: {
        image: {
          width: Math.min(width, 2000), // cap requested width to prevent abuse
          format: format,
          quality: 85,
          fit: 'cover'
        }
      }
    });
  }
};
CDN for Dynamic Content
CDNs are no longer just for static files. Modern CDNs accelerate dynamic content through several techniques:
Edge Side Includes (ESI)
ESI lets you cache page fragments independently. A product page might be 90% static (layout, images, description) and 10% dynamic (price, stock, personalized recommendations). ESI caches the static parts and fetches only the dynamic fragments from origin.
<!-- Cached page template (TTL: 1 hour) -->
<html>
<body>
<header>...static nav...</header>
<main>
<h1>Product XYZ</h1>
<img src="/img/xyz.jpg" />
<!-- Dynamic fragment: fetched from origin on each request -->
<esi:include src="/api/price?sku=xyz" ttl="60" />
<!-- Personalized fragment -->
<esi:include src="/api/recommendations?user=${user_id}" />
</main>
</body>
</html>
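Conceptually, the edge holds the cached template and swaps each include tag for a fragment fetch. A much-simplified, non-streaming sketch of that assembly step:

```javascript
// Minimal (non-streaming) ESI assembly: replace each <esi:include src="..."/>
// tag with the result of fetching that fragment.
async function assemble(template, fetchFragment) {
  const pattern = /<esi:include\s+src="([^"]+)"[^>]*\/>/g;
  const parts = [];
  let last = 0;
  for (const m of template.matchAll(pattern)) {
    parts.push(template.slice(last, m.index)); // cached static chunk
    parts.push(await fetchFragment(m[1]));     // dynamic fragment from origin
    last = m.index + m[0].length;
  }
  parts.push(template.slice(last));
  return parts.join('');
}
```

Real ESI processors stream the template and fetch fragments in parallel; this sketch only shows the substitution semantics.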
API Response Caching
Short-TTL caching of API responses at the edge can handle massive read loads:
# API endpoint with aggressive edge caching
# Even 5 seconds of caching absorbs massive traffic spikes
GET /api/products/trending HTTP/1.1
HTTP/1.1 200 OK
Cache-Control: public, s-maxage=5, stale-while-revalidate=30
Surrogate-Key: products trending
Vary: Accept-Encoding
# At 10,000 requests/second:
# Without CDN: 10,000 req/s hit origin
# With 5s TTL: ~1 req/5s hits origin (0.002% of traffic)
# Origin load reduction: 99.998%
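The comment's arithmetic generalizes to a one-liner (steady state, a single PoP, ignoring request coalescing across PoPs):

```javascript
// With a TTL of T seconds, roughly one request per T seconds reaches origin,
// no matter how high the incoming request rate is.
function originLoadReduction(requestsPerSecond, ttlSeconds) {
  const originReqPerSecond = 1 / ttlSeconds; // one refresh per TTL window
  return 1 - originReqPerSecond / requestsPerSecond;
}
```

At 10,000 req/s with a 5-second TTL this gives the 99.998% reduction quoted above.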
Dynamic Site Acceleration (DSA)
Even for truly uncacheable dynamic requests, CDNs improve performance through:
- Persistent connections: CDN maintains keep-alive connections to origin (eliminates TCP/TLS handshake per request)
- Route optimization: CDN uses private backbone networks between PoPs and origin (avoids congested public internet)
- Protocol optimization: HTTP/2 multiplexing between edge and origin, TCP congestion window tuning
- Connection coalescing: Multiple client requests at the same PoP are multiplexed over a single connection to origin
WebSocket & Real-Time Content
Modern CDNs support WebSocket proxying and real-time protocols:
# Cloudflare automatically proxies WebSocket connections
# User ↔ Edge PoP ↔ Origin (persistent WebSocket connection)
# Benefits:
# - TLS termination at the edge (faster handshake)
# - DDoS protection for WebSocket connections
# - Geographic load balancing across multiple origin servers
# - Connection pooling between PoP and origin
# Limitations:
# - Each WebSocket connection still routes to origin (not cached)
# - Edge can inspect/route but not cache real-time data
# - Added hop may increase latency by 1-5 ms
CDN Providers
The CDN landscape is diverse, with providers optimized for different use cases:
| Provider | PoPs | Edge Compute | Purge Speed | Best For |
|---|---|---|---|---|
| Cloudflare | 310+ | Workers (V8) | <30 ms | All-in-one (CDN + WAF + DDoS + edge compute). Generous free tier. |
| Akamai | 4,100+ | EdgeWorkers | 5-7s | Enterprise, largest network. Media streaming at scale. |
| AWS CloudFront | 400+ | Lambda@Edge / CF Functions | ~60s | AWS-native apps. Deep integration with S3, ALB, API Gateway. |
| Fastly | 90+ | Compute@Edge (WASM) | ~150 ms | Real-time purging, VCL configurability. API-heavy workloads. |
| Google Cloud CDN | 180+ | Cloud Run (regional) | ~seconds | GCP-native apps. Tight integration with Cloud Load Balancing. |
CDN in System Design Interviews
CDN is one of the most frequently mentioned components in system design interviews. Here's how to incorporate it effectively:
When to Mention CDN
- Always mention CDN when the system serves static assets (images, JS, CSS, videos) to geographically distributed users.
- Mention for read-heavy APIs — even short-TTL caching at the edge can absorb 99%+ of read traffic.
- Mention for media streaming — video, audio, and large file downloads benefit enormously from edge caching.
- Mention for global services — any system with users across multiple continents needs a CDN for acceptable latency.
How to Draw CDN in Architecture Diagrams
              ┌─────────────┐
Users ───────▶│     CDN     │
 (Global)     │ (Edge PoPs) │
              └──────┬──────┘
                     │ Cache MISS only
              ┌──────▼──────┐
              │Load Balancer│
              └──────┬──────┘
                     │
        ┌────────────┼────────────┐
        ▼            ▼            ▼
  ┌──────────┐ ┌──────────┐ ┌──────────┐
  │ App Srv 1│ │ App Srv 2│ │ App Srv 3│
  └──────────┘ └──────────┘ └──────────┘
Place CDN as the FIRST layer between users and your infrastructure.
Show that only cache MISSes reach your backend.
Common Interview Follow-Up Questions
| Question | Key Points to Cover |
|---|---|
| "How do you handle cache invalidation?" | Versioned URLs for assets, TTL + stale-while-revalidate for APIs, surrogate keys for targeted purging. |
| "What if the CDN serves stale data?" | Acceptable for most read traffic (eventual consistency). Critical data (payments, auth) bypasses CDN. |
| "How does CDN handle personalized content?" | ESI for partial caching, edge compute for personalization logic, Vary header for limited variants. |
| "CDN for write-heavy workloads?" | CDN primarily helps reads. For writes, consider edge compute for validation, then forward to origin. |
| "What about CDN costs?" | CDN bandwidth is cheaper than origin bandwidth. Typical: $0.01-0.08/GB at CDN vs $0.09-0.12/GB at cloud origin. |
Sample Answer: "Design a News Feed"
# CDN-relevant portion of a news feed design:
1. Static assets (JS, CSS, images, video thumbnails)
→ Push CDN with immutable versioned URLs
→ Cache-Control: public, max-age=31536000, immutable
2. Feed API responses (per-user feed)
→ Cannot cache personalized feeds at CDN
→ BUT: Cache trending/popular stories at edge (s-maxage=30)
→ Cache user profile pictures at edge (s-maxage=3600)
→ Use edge compute for A/B testing feed algorithms
3. Real-time notifications
→ WebSocket through CDN for TLS termination + DDoS protection
→ No caching, but reduced handshake latency
4. Architecture:
Users → CDN → [static assets served from edge, 95% of bandwidth]
Users → CDN → API Gateway → Feed Service [only feed requests reach origin]