Design: URL Shortener (TinyURL)
Problem Statement
A URL shortener is a service that takes a long, unwieldy URL and maps it to a short, easy-to-share alias. When a user visits the short URL, the service redirects them to the original long URL. You've seen these everywhere — bit.ly, tinyurl.com, t.co (Twitter/X), and goo.gl (now discontinued) are all URL shorteners.
For example:
Long URL: https://www.example.com/articles/2026/04/a-very-long-title-about-system-design?ref=blog&utm_source=twitter
Short URL: https://short.ly/aB3x7Kp
Why do companies build and use URL shorteners?
- Shareability — Short links are easier to paste in tweets, SMS, presentations, and print media where space is limited.
- Analytics — Every redirect can be logged: who clicked, when, from where, what device. This data is enormously valuable for marketing.
- Branding — Custom short domains (e.g., amzn.to, youtu.be) reinforce brand identity.
- Link management — The underlying long URL can be changed without breaking the short link (useful for A/B testing, campaign rotation).
- Obfuscation — Affiliate or tracking parameters are hidden from the user behind a clean short URL.
Despite appearing simple on the surface, a URL shortener at scale is a rich system design problem. It touches key generation, database design, caching, load balancing, analytics pipelines, and abuse prevention — all topics that interviewers love to dig into.
Requirements
Functional Requirements
FR-1 · Shorten: Given a long URL, the service generates a unique short URL for it.
FR-2 · Redirect: When a user accesses the short URL, the service redirects them to the original long URL.
FR-3 · Custom Aliases (optional): Users can optionally provide a custom alias (e.g., short.ly/my-brand).
FR-4 · Analytics: Track click counts, geographic data, referrer, device type per short URL.
FR-5 · Expiration: URLs can have an optional expiry time, after which the short link becomes inactive.
Non-Functional Requirements
NFR-1 · High Availability: The redirect path must stay up; a downed service breaks every short link in circulation at once.
NFR-2 · Low Latency: Redirects must complete in <100 ms. Users expect instant navigation — any delay and they'll assume the link is broken.
NFR-3 · Scalability: Handle 100 million new URLs per day and billions of redirects. The system must scale horizontally.
NFR-4 · Durability: Once a URL mapping is stored, it must never be lost. This is effectively a permanent record.
NFR-5 · Uniqueness: Every generated short code must be unique — no two long URLs should ever map to the same short code (no collisions).
Out of Scope
- User authentication/authorization (assume API key–based access)
- UI/frontend implementation
- Payment/billing for premium plans
Back-of-Envelope Estimation
Before designing anything, we need to understand the scale. Let's work through the numbers carefully.
Traffic Estimates
| Metric | Calculation | Result |
|---|---|---|
| New URLs / day | (given) | 100 M |
| Write QPS | 100M / 86,400 sec | ~1,160 writes/sec |
| Read:Write ratio | Assume 10:1 (read-heavy) | 10:1 |
| Redirects / day | 100M × 10 | 1 Billion |
| Read QPS | 1B / 86,400 sec | ~11,600 reads/sec |
At peak (assuming 3× average): ~3,500 writes/sec and ~35,000 reads/sec. This is well within the range of a horizontally-scaled web service.
Storage Estimates
| Metric | Calculation | Result |
|---|---|---|
| Average URL record size | shortCode (7B) + longUrl (avg 200B) + metadata (~293B) | ~500 bytes |
| Storage / day | 100M × 500B | ~50 GB / day |
| Storage / year | 50 GB × 365 | ~18 TB / year |
| Storage / 5 years | 18 TB × 5 | ~90 TB total |
Cache Estimates
Following the 80/20 rule (Pareto principle): 20% of URLs generate 80% of traffic. We should cache the hottest 20% of URLs.
| Metric | Calculation | Result |
|---|---|---|
| Requests served from cache / day | 1B redirects × 80% (hot URLs receive most traffic) | ~800 M requests |
| Unique URLs to cache (daily) | 100M × 20% | 20 M URLs |
| Cache memory | 20M × 500B | ~10 GB |
10 GB fits easily in a single Redis instance (typical production Redis has 64–256 GB). We can use a small Redis cluster for redundancy.
Short Code Length
We need enough unique codes for 5 years of URLs:
Total URLs in 5 years = 100M × 365 × 5 = 182.5 billion
Base62 characters: a-z (26) + A-Z (26) + 0-9 (10) = 62
6 characters: 62⁶ ≈ 56.8 billion — tight, might run out
7 characters: 62⁷ ≈ 3.5 trillion — ~19× headroom ✓
→ Use 7 characters in Base62 encoding
API Design
We expose a RESTful API. All endpoints require an API key for rate limiting and abuse prevention.
1. Create Short URL
POST /api/v1/shorten
Headers:
Authorization: Bearer <api_key>
Content-Type: application/json
Request Body:
{
"longUrl": "https://example.com/very/long/path?q=system+design",
"customAlias": "my-link", // optional
"expireAt": "2027-01-01T00:00Z" // optional, ISO 8601
}
Response: 201 Created
{
"shortUrl": "https://short.ly/aB3x7Kp",
"shortCode": "aB3x7Kp",
"longUrl": "https://example.com/very/long/path?q=system+design",
"createdAt": "2026-04-15T10:30:00Z",
"expireAt": "2027-01-01T00:00:00Z"
}
Rate Limit Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 97
X-RateLimit-Reset: 1713178800
2. Redirect (the core operation)
GET /{shortCode}
Example: GET /aB3x7Kp
Response: 302 Found (or 301 Moved Permanently — see the 301 vs 302 discussion below)
Location: https://example.com/very/long/path?q=system+design
Error: 404 Not Found (if shortCode doesn't exist or is expired)
3. Get Analytics
GET /api/v1/stats/{shortCode}
Headers:
Authorization: Bearer <api_key>
Response: 200 OK
{
"shortCode": "aB3x7Kp",
"longUrl": "https://example.com/very/long/path?q=system+design",
"createdAt": "2026-04-15T10:30:00Z",
"totalClicks": 15482,
"uniqueClicks": 9273,
"clicksByCountry": { "US": 8201, "IN": 3104, "UK": 1422, ... },
"clicksByDevice": { "mobile": 9812, "desktop": 5100, "tablet": 570 },
"clicksByDay": [
{ "date": "2026-04-15", "clicks": 3201 },
{ "date": "2026-04-16", "clicks": 2847 },
...
]
}
4. Delete Short URL
DELETE /api/v1/urls/{shortCode}
Headers:
Authorization: Bearer <api_key>
Response: 204 No Content
Database Design
Core Tables
┌─────────────────────────────────────────┐
│ urls (primary table) │
├─────────────────────────────────────────┤
│ id BIGINT PK, auto-incr │
│ short_code VARCHAR(10) UNIQUE INDEX │
│ long_url VARCHAR(2048) NOT NULL │
│ user_id BIGINT FK → users.id │
│ created_at DATETIME NOT NULL │
│ expire_at DATETIME NULLABLE │
│ is_active BOOLEAN DEFAULT true │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ users │
├─────────────────────────────────────────┤
│ id BIGINT PK, auto-incr │
│ email VARCHAR(255) UNIQUE │
│ api_key VARCHAR(64) UNIQUE INDEX │
│ tier ENUM('free','pro','ent') │
│ created_at DATETIME NOT NULL │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ click_events (analytics) │
├─────────────────────────────────────────┤
│ id BIGINT PK, auto-incr │
│ short_code VARCHAR(10) INDEX │
│ timestamp DATETIME NOT NULL │
│ ip_address VARCHAR(45) │
│ user_agent VARCHAR(512) │
│ referrer VARCHAR(2048) │
│ country VARCHAR(2) │
│ device_type VARCHAR(20) │
└─────────────────────────────────────────┘
SQL vs NoSQL Decision
The core URL lookup is a simple key-value operation: given a short code, return the long URL. This makes NoSQL databases an excellent fit:
| Factor | SQL (PostgreSQL, MySQL) | NoSQL (DynamoDB, Cassandra) |
|---|---|---|
| Access pattern | Works, but B-tree overhead | Optimized for key-value lookup |
| Scale | Vertical + sharding (complex) | Horizontal scaling built-in |
| Consistency | Strong ACID guarantees | Eventual (tunable) |
| Schema flexibility | Rigid schema | Schema-less, easy evolution |
| Analytics queries | Rich SQL joins and aggregations | Limited (needs separate analytics DB) |
Verdict: keep the core shortCode → longUrl mapping in DynamoDB or Cassandra, and ship click events to a separate columnar store (ClickHouse / BigQuery) for analytics. The architecture below does exactly this.
High-Level Architecture
Here is the top-level architecture for our URL shortener:
┌───────────────┐
│ CDN / Edge │
│ (CloudFront) │
└───────┬───────┘
│
┌───────▼───────┐
┌───────────────│ API Gateway │───────────────┐
│ │ (Rate Limiting)│ │
│ └───────┬───────┘ │
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ App Server │ │ App Server │ │ App Server │
│ (Node) │ │ (Node) │ │ (Node) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
       └───────────┬───────────┴───────────┬───────────┘
│ │
┌───────▼───────┐ ┌──────▼──────┐
│ Redis Cache │ │ Message │
│ (Cluster) │ │ Queue │
└───────┬───────┘ │ (Kafka) │
│ └──────┬──────┘
┌───────▼───────┐ │
│ Database │ ┌──────▼──────┐
│ (DynamoDB / │ │ Analytics │
│ Cassandra) │ │ Service │
└───────────────┘ └──────┬──────┘
│
┌──────▼──────┐
│ ClickHouse / │
│ BigQuery │
└─────────────┘
Component responsibilities:
- CDN / Edge: Cache popular redirects at edge locations for sub-10ms response. Also serves static assets (docs, landing page).
- API Gateway: Rate limiting, authentication (API key validation), request routing, SSL termination.
- Application Servers: Stateless — handle URL shortening, redirect logic, and analytics event production. Horizontally scalable.
- Redis Cache: Stores hot URL mappings (shortCode → longUrl) for fast redirects. LRU eviction.
- Database: Persistent storage for all URL mappings. Cassandra or DynamoDB for horizontal scaling.
- Message Queue (Kafka): Decouples the redirect path from analytics. Click events are published to Kafka asynchronously so that redirect latency isn't affected.
- Analytics Service: Consumes click events from Kafka, enriches them (geo-IP lookup, device parsing), and writes to the analytics data warehouse.
Short Code Generation
This is the most critical design decision in a URL shortener. We need to generate a unique, short, URL-safe string for every new URL. There are three main approaches:
Approach 1: Base62 Encoding of Auto-Increment ID
Use a globally unique auto-incrementing counter. Convert the numeric ID to a Base62 string (characters: a-z A-Z 0-9).
ID = 123456789
Alphabet: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
Base62 encoding:
123456789 ÷ 62 = 1991238 remainder 33 → 'H'
  1991238 ÷ 62 =   32116 remainder 46 → 'U'
    32116 ÷ 62 =     518 remainder  0 → 'a'
      518 ÷ 62 =       8 remainder 22 → 'w'
        8 ÷ 62 =       0 remainder  8 → 'i'
Result: "iwaUH" (read remainders bottom-up)
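A minimal sketch of this encoding in Python (the alphabet ordering follows the example above; any fixed ordering works as long as it never changes):

ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def base62_encode(num: int) -> str:
    """Convert a non-negative integer ID to its Base62 string."""
    if num == 0:
        return ALPHABET[0]
    chars = []
    while num > 0:
        num, rem = divmod(num, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))  # remainders come out least-significant first

print(base62_encode(123456789))  # -> "iwaUH", matching the worked example above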
Pros: Guaranteed unique (IDs never repeat), no collision checking needed, deterministic. Cons: Sequential codes are predictable (users can enumerate), requires a centralized ID generator (single point of failure), codes grow in length as IDs increase.
Approach 2: Hash + Truncate
Apply a cryptographic hash (MD5 or SHA-256) to the long URL and take the first 7 characters of the Base62-encoded hash.
longUrl = "https://example.com/long/path" hash = MD5(longUrl) = "e4d909c290d0fb1ca068ffaddf22cbd0" shortCode = base62(hash[0:43bits]) = "aB3x7Kp" Collision? → Append user ID or timestamp, re-hash Still collision? → Append counter, re-hash
Pros: No centralized counter, same URL always generates the same short code (idempotent — useful for deduplication). Cons: Collision risk (birthday problem: with 7-character codes the first collision is expected after only ~√(62⁷) ≈ 1.9 million URLs, so at 182B URLs collisions are constant), and collision resolution adds complexity and latency.
Approach 3: Pre-Generated Key Generation Service (KGS) ✓
This is the recommended approach for production systems at scale. A dedicated Key Generation Service pre-generates random unique 7-character keys and stores them in a database.
- KGS pre-generates millions of random 7-character Base62 keys and stores them in a keys table with two columns: key and used (boolean).
- When an application server needs keys, it requests a batch (e.g., 1,000 keys) from KGS. KGS marks those keys as used = true and hands them over.
- The app server stores the batch in memory and allocates keys from it for incoming URL shortening requests (see the sketch after this list).
- When the in-memory batch is exhausted, the app server requests a new batch from KGS.
- If an app server crashes, those unused in-memory keys are simply lost — an acceptable waste given the 3.5 trillion total possible keys.
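A minimal sketch of the app-server side of this protocol, assuming a hypothetical kgs_client exposing a fetch_batch(n) RPC:

import threading

class KeyAllocator:
    """Hands out pre-generated keys from an in-memory batch,
    refilling from KGS when the batch runs dry."""

    def __init__(self, kgs_client, batch_size: int = 1000):
        self.kgs = kgs_client          # hypothetical KGS client with fetch_batch(n)
        self.batch_size = batch_size
        self.keys: list[str] = []
        self.lock = threading.Lock()   # many request threads share one allocator

    def next_key(self) -> str:
        with self.lock:
            if not self.keys:
                # KGS marks these keys used=true before returning them,
                # so a crash here only wastes keys, never reuses them.
                self.keys = self.kgs.fetch_batch(self.batch_size)
            return self.keys.pop()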
Pros: Zero collisions (all keys are pre-generated and unique), no real-time hashing or collision checking, very fast (keys are pre-allocated in memory), horizontally scalable (each app server has its own batch). Cons: Slight key wastage on server crashes, requires maintaining the KGS service, initial key generation takes time.
Approach Comparison
| Criteria | Base62 + Auto-ID | Hash + Truncate | KGS (Recommended) |
|---|---|---|---|
| Collision risk | None ✓ | Medium ✗ | None ✓ |
| Predictability | Sequential ✗ | Random ✓ | Random ✓ |
| Single point of failure | ID generator ✗ | None ✓ | KGS (mitigated) ✓ |
| Latency | Low ✓ | Medium (hash + check) | Very low ✓ |
| Deduplication | No (same URL = different code) | Yes (same URL = same code) ✓ | No (same URL = different code) |
| Scalability | Limited by counter | Good | Excellent ✓ |
301 vs 302 Redirect
When a user accesses a short URL, the server responds with a redirect. The choice between HTTP 301 and 302 is more important than it appears:
| Aspect | 301 — Permanent Redirect | 302 — Temporary Redirect |
|---|---|---|
| Browser caching | Browser caches the redirect. Subsequent visits go directly to long URL. | Browser does NOT cache. Every visit hits your server. |
| Server load | Lower (browsers bypass server after first visit) | Higher (every click hits the server) |
| Analytics accuracy | Lower — cached redirects are invisible to the server | Higher — every click is logged ✓ |
| SEO | Passes link juice to the destination URL | Link juice stays with the short URL |
| URL updates | Dangerous — browsers won't check for updates | Safe — changes take effect immediately ✓ |
We choose 302: analytics (FR-4) and link management (updating destinations) both require that every click reach our servers, and the extra load is exactly what the cache layer is built to absorb.
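As a sketch, here is a redirect handler with the 302 choice baked in (Flask is an arbitrary choice for illustration; get_mapping, is_servable, and emit_click_event are hypothetical helpers sketched in later sections):

from flask import Flask, abort, redirect, request

app = Flask(__name__)

@app.route("/<short_code>")
def follow(short_code: str):
    record = get_mapping(short_code)  # hypothetical cache-aside lookup, sketched below
    if record is None or not is_servable(short_code, record):  # expiry check, sketched below
        abort(404)
    emit_click_event(short_code, request.remote_addr,  # async Kafka produce, sketched below
                     request.headers.get("User-Agent", ""),
                     request.headers.get("Referer", ""))
    # 302, not 301: the browser must return on every visit so each click is observable
    return redirect(record["longUrl"], code=302)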
Caching Strategy
Caching is essential for a read-heavy service (10:1 read-write ratio). Our goal: serve as many redirects from cache as possible to keep p99 latency under 10ms.
Cache-Aside Pattern (Lazy Loading)
- Read path: App server checks Redis first. If cache hit, return immediately. If cache miss, query the database, store the result in Redis with a TTL, then return (sketched after this list).
- Write path: On URL creation, write to DB first, then populate the cache proactively (write-through for newly created URLs since they are likely to be clicked soon).
- Delete/Update path: Invalidate the cache entry, then update the database. Next read will repopulate the cache from DB.
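A minimal sketch of the read path with redis-py (db_lookup is a hypothetical database query; the key format and TTL match the cache configuration below):

import json
import redis

r = redis.Redis(host="localhost", port=6379)  # assumption: connection details elided
CACHE_TTL = 24 * 3600                         # 24 hours, matching the config below

def get_mapping(short_code: str):
    """Cache-aside read path: Redis first, database only on a miss.
    Values follow the cache format {longUrl, expireAt, isActive}."""
    key = f"url:{short_code}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)                    # cache hit: single round trip
    record = db_lookup(short_code)                   # hypothetical DB query on miss
    if record is not None:
        r.setex(key, CACHE_TTL, json.dumps(record))  # repopulate with a TTL
    return record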
Eviction Policy: LRU
Least Recently Used (LRU) is the ideal eviction policy. URLs that haven't been accessed recently are evicted first, naturally keeping the hottest URLs in cache. Redis supports LRU natively via maxmemory-policy allkeys-lru.
Cache Configuration
Redis Configuration:
maxmemory: 10GB (stores ~20M URL records)
maxmemory-policy: allkeys-lru
TTL per entry: 24 hours (re-fetched from DB on next access)
Cache key format: url:{shortCode}
Cache value format: {longUrl, expireAt, isActive}
Cluster:
3 Redis primaries, each with a replica, for high availability
Consistent hashing across the primaries for even key distribution
Cache Warming
On a cold start (e.g., after Redis restart), the cache is empty and all requests hit the database. To mitigate this, we can implement cache warming: pre-load the top 1M most frequently accessed URLs from the analytics data into Redis during startup.
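A sketch of that warming step, assuming the top-N hot records have already been pulled from the analytics store:

import json
import redis

def warm_cache(r: redis.Redis, hot_records: dict, ttl: int = 24 * 3600) -> None:
    """Bulk-load hot shortCode -> record mappings into Redis after a cold start.
    hot_records would come from the analytics store's top-N query (an assumption)."""
    pipe = r.pipeline()
    pending = 0
    for code, record in hot_records.items():
        pipe.setex(f"url:{code}", ttl, json.dumps(record))
        pending += 1
        if pending >= 1000:  # flush in batches so each pipeline stays small
            pipe.execute()
            pipe = r.pipeline()
            pending = 0
    if pending:
        pipe.execute()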
Scaling
Read Path Scaling
The redirect path is read-heavy (~12,000 QPS average, ~35,000 QPS peak). We scale reads through multiple layers:
- CDN / Edge caching: Popular short URLs can be cached at CDN edge locations for sub-5ms response globally.
- Redis cache cluster: 3+ Redis nodes with consistent hashing. Handles ~100K+ reads/sec.
- Database read replicas: For cache misses, read replicas distribute the database load. Eventual consistency is acceptable for URL lookups (a few-hundred-millisecond replication lag is fine).
- Horizontal app server scaling: Stateless app servers behind a load balancer. Auto-scale based on CPU/request count.
Database Sharding
With 90 TB over 5 years, a single database instance won't suffice. We shard by shortCode hash:
shard_id = hash(shortCode) % num_shards

Example with 16 shards:
shortCode "aB3x7Kp" → hash → 0x3A7F → 0x3A7F % 16 = 15 → Shard 15
shortCode "Xk9mLnQ" → hash → 0x1C42 → 0x1C42 % 16 = 2  → Shard 2

Each shard holds ~5.6 TB (90 TB / 16 shards)
Each shard handles ~725 reads/sec (11,600 / 16) and ~73 writes/sec (1,160 / 16)
We use consistent hashing for shard assignment so that adding/removing shards only requires redistributing a fraction of the data (1/N) rather than reshuffling everything.
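A minimal consistent-hash ring in Python (the virtual-node count and shard names are illustrative):

import bisect
import hashlib

class ConsistentHashRing:
    """Maps short codes to shards so that adding a shard moves only ~1/N of keys."""

    def __init__(self, shards: list[str], vnodes: int = 100):
        self.ring: list[tuple[int, str]] = []
        for shard in shards:
            for i in range(vnodes):  # virtual nodes smooth out the distribution
                self.ring.append((self._hash(f"{shard}#{i}"), shard))
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def shard_for(self, short_code: str) -> str:
        point = self._hash(short_code)
        idx = bisect.bisect(self.ring, (point, "")) % len(self.ring)
        return self.ring[idx][1]  # first ring point clockwise of the key

ring = ConsistentHashRing([f"shard-{i}" for i in range(16)])
print(ring.shard_for("aB3x7Kp"))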
Rate Limiting
To prevent abuse (spamming millions of URLs, scraping short codes), we implement rate limiting at multiple levels:
- Per API key: 100 URL creations per minute for free tier, 10,000 for enterprise.
- Per IP address: 50 URL creations per minute for unauthenticated requests.
- Global: Circuit breaker if total QPS exceeds 5× average (DDoS protection).
Implementation: Token bucket or sliding window counter in Redis. Each API key has a counter with a TTL.
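A sketch of the counter-with-TTL variant (the fixed-window form of the counters described above; key naming is illustrative, and a token bucket adds smoother burst handling at the cost of a little more state):

import redis

r = redis.Redis()

def allow_request(api_key: str, limit: int = 100, window_sec: int = 60) -> bool:
    """Fixed-window counter: each API key gets `limit` requests per window.
    The first INCR in a window creates the key; EXPIRE starts the window clock."""
    key = f"ratelimit:{api_key}:{window_sec}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_sec)
    return count <= limit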
Analytics Pipeline Scaling
Click events are the highest-volume data. We decouple analytics from the critical redirect path:
Redirect request
│
├──→ [1] Return 302 to user (fast path, <10ms)
│
└──→ [2] Produce click event to Kafka (async, fire-and-forget)
│
▼
Kafka Topic: "click-events"
(partitioned by shortCode for ordering)
│
▼
Spark Streaming / Flink Consumer
│
├──→ Real-time aggregation → Redis counters (total clicks)
└──→ Batch writes → ClickHouse (detailed analytics)
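A sketch of the producer side with kafka-python (broker address and topic wiring are assumptions; acks=0 makes the send fire-and-forget, as the fast path above requires):

import json
import time

from kafka import KafkaProducer  # kafka-python; broker address below is an assumption

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
    acks=0,  # fire-and-forget: never block the redirect path on the broker
)

def emit_click_event(short_code: str, ip: str, user_agent: str, referrer: str):
    """Publish the raw click; enrichment (geo-IP, UA parsing) happens downstream."""
    event = {"shortCode": short_code, "ts": time.time(), "ip": ip,
             "ua": user_agent, "referrer": referrer}
    # key=shortCode keeps each code's events in one partition, preserving order
    producer.send("click-events", key=short_code.encode(), value=event)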
Deep Dive Topics
Custom Aliases
Users may want vanity URLs like short.ly/my-brand. Handling custom aliases requires:
- Uniqueness check: Before accepting a custom alias, query the database to ensure it doesn't already exist. Use an atomic putIfAbsent operation (DynamoDB conditional write or Cassandra lightweight transaction) — see the sketch after this list.
- Validation: Custom aliases must match /^[a-zA-Z0-9\-_]{3,30}$/ — URL-safe, 3-30 characters, no special symbols.
- Reserved words: Block aliases like "api", "admin", "health", "stats", "login" that conflict with system routes.
- Premium feature: Custom aliases can be gated behind paid tiers to prevent squatting.
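A sketch of the atomic uniqueness check as a DynamoDB conditional write (table and attribute names are assumptions):

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("urls")  # table name is an assumption

def claim_alias(alias: str, long_url: str, user_id: str) -> bool:
    """Atomically claim a custom alias; fails if another write got there first."""
    try:
        table.put_item(
            Item={"short_code": alias, "long_url": long_url, "user_id": user_id},
            ConditionExpression="attribute_not_exists(short_code)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # alias already taken
        raise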
URL Expiration
Expired URLs should return 404 instead of redirecting. Two strategies:
- Lazy expiration: Check expire_at during the redirect lookup. If expired, return 404 and delete asynchronously. Simple but leaves expired records in the database.
- Active cleanup: A background cron job runs every hour, querying for URLs where expire_at < NOW() and deleting them in batches. This reclaims storage and allows short codes to be recycled.
Best practice: use both. Lazy expiration ensures correctness on every request; active cleanup keeps the database tidy.
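The lazy half is just a guard during the redirect lookup. A sketch, with the record following the cache value format above and a hypothetical async-delete hook:

from datetime import datetime, timezone

def is_servable(short_code: str, record: dict) -> bool:
    """Lazy expiration: decide at read time whether a mapping may redirect.
    `record` follows the cache value format {longUrl, expireAt, isActive}."""
    if not record.get("isActive", True):
        return False
    expire_at = record.get("expireAt")  # ISO 8601 string or None
    if expire_at is not None:
        expires = datetime.fromisoformat(expire_at.replace("Z", "+00:00"))
        if expires < datetime.now(timezone.utc):
            schedule_async_delete(short_code)  # hypothetical cleanup enqueue
            return False
    return True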
Analytics Pipeline
The analytics pipeline processes billions of click events per day:
Click Event → Kafka → Stream Processing → Analytics Store

Enrichment steps:
1. Geo-IP lookup → country, city, region (MaxMind GeoIP)
2. User-Agent parse → browser, OS, device type (ua-parser)
3. Referrer classify → search, social, direct, email

Aggregation windows:
- Per-minute → real-time dashboard (Redis Sorted Sets)
- Per-hour → time-series DB (InfluxDB / TimescaleDB)
- Per-day → data warehouse (ClickHouse / BigQuery)
Abuse Prevention
URL shorteners are frequent targets for abuse:
- Spam/phishing: Check submitted URLs against Google Safe Browsing API and internal blocklists before shortening.
- Malware distribution: Scan destination URLs periodically. Flag and disable links that start serving malware after initial creation.
- Link enumeration: Random short codes (from KGS) prevent sequential guessing. Rate-limit redirect requests per IP.
- DDoS via redirect: Rate-limit at API Gateway. Use CDN-level DDoS protection (Cloudflare, AWS Shield).
- Click fraud: Deduplicate clicks by IP + user-agent + time window. Flag anomalous patterns (1000 clicks/sec from one IP).
Complete Architecture Diagram
The full system design brings together every component discussed above: CDN edge caching, API gateway rate limiting, stateless app servers, KGS, the Redis cache cluster, sharded key-value storage, and the Kafka analytics pipeline.
Summary of Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Short code generation | KGS | Zero collisions, fast allocation, scalable |
| Code length | 7 characters (Base62) | 3.5T possible codes, 19× headroom for 5 years |
| Redirect type | 302 (Temporary) | Enables analytics on every click |
| Database | DynamoDB / Cassandra | Key-value optimized, horizontal scaling |
| Cache | Redis (LRU, 10 GB) | Sub-5ms reads, 80/20 rule |
| Analytics | Kafka → Spark → ClickHouse | Async, doesn't affect redirect latency |
| Sharding | Consistent hashing on shortCode | Even distribution, minimal reshuffling |
Follow-Up Questions You Might Get
Q: What happens if two users shorten the same long URL?
With KGS, each submission gets a new short code. If deduplication is important, maintain a secondary index (longUrl → shortCode) and check it before generating a new key. Trade-off: extra lookup on every write.
Q: What if KGS goes down?
Each app server has a pre-fetched batch of keys in memory (e.g., 10K keys). Even if KGS is unavailable, servers can continue shortening URLs until their batch is exhausted. KGS itself should be replicated across availability zones.
Q: How do you prevent a user from creating billions of URLs?
Per-user rate limiting (token bucket in Redis, as in the Rate Limiting section). Free tier: 100 URL creations per minute; enterprise: 10,000 per minute. API key tied to billing tier.
Q: What about international / Unicode URLs?
Percent-encode the long URL per RFC 3986 before storing. The short code remains ASCII-only (Base62). Display the decoded URL in the analytics dashboard.
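A sketch of that storage-side normalization (note that non-ASCII hostnames are a separate concern, handled by IDNA/punycode rather than percent-encoding):

from urllib.parse import quote

long_url = "https://example.com/articles/système/设计?q=müller"
# Encode non-ASCII bytes but leave RFC 3986 reserved characters intact
stored = quote(long_url, safe=":/?#[]@!$&'()*+,;=%")
print(stored)  # https://example.com/articles/syst%C3%A8me/%E8%AE%BE%E8%AE%A1?q=m%C3%BCller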
Q: How do you monitor the system?
Metrics: redirect p50/p95/p99 latency, cache hit ratio, KGS key pool size, Kafka consumer lag, error rates. Alerts on: cache hit ratio < 70%, redirect p99 > 200ms, KGS pool < 1M keys remaining.