Design: URL Shortener (TinyURL)
Problem Statement
A URL shortener is a service that takes a long, unwieldy URL and maps it to a short, easy-to-share alias. When a user visits the short URL, the service redirects them to the original long URL. You've seen these everywhere — bit.ly, tinyurl.com, t.co (Twitter/X), and goo.gl (now discontinued) are all URL shorteners.
For example:
Long URL: https://www.example.com/articles/2026/04/a-very-long-title-about-system-design?ref=blog&utm_source=twitter
Short URL: https://short.ly/aB3x7Kp
Why do companies build and use URL shorteners?
- Shareability — Short links are easier to paste in tweets, SMS, presentations, and print media where space is limited.
- Analytics — Every redirect can be logged: who clicked, when, from where, what device. This data is enormously valuable for marketing.
- Branding — Custom short domains (e.g., amzn.to, youtu.be) reinforce brand identity.
- Link management — The underlying long URL can be changed without breaking the short link (useful for A/B testing, campaign rotation).
- Obfuscation — Affiliate or tracking parameters are hidden from the user behind a clean short URL.
Despite appearing simple on the surface, a URL shortener at scale is a rich system design problem. It touches key generation, database design, caching, load balancing, analytics pipelines, and abuse prevention — all topics that interviewers love to dig into.
Requirements
Functional Requirements
FR-1 · Shorten: Given a long URL, the service generates a unique short URL for it.
FR-2 · Redirect: When a user accesses the short URL, the service redirects them to the original long URL.
FR-3 · Custom Aliases (optional): Users can optionally provide a custom alias (e.g., short.ly/my-brand).
FR-4 · Analytics: Track click counts, geographic data, referrer, device type per short URL.
FR-5 · Expiration: URLs can have an optional expiry time, after which the short link becomes inactive.
Non-Functional Requirements
NFR-1 · High Availability: The redirect path must stay up; a downed service breaks every short link in circulation at once.
NFR-2 · Low Latency: Redirects must complete in <100 ms. Users expect instant navigation — any delay and they'll assume the link is broken.
NFR-3 · Scalability: Handle 100 million new URLs per day and billions of redirects. The system must scale horizontally.
NFR-4 · Durability: Once a URL mapping is stored, it must never be lost. This is effectively a permanent record.
NFR-5 · Uniqueness: Every generated short code must be unique — no two long URLs should ever map to the same short code (no collisions).
Out of Scope
- User authentication/authorization (assume API key–based access)
- UI/frontend implementation
- Payment/billing for premium plans
Back-of-Envelope Estimation
Before designing anything, we need to understand the scale. Let's work through the numbers carefully.
Traffic Estimates
| Metric | Calculation | Result |
|---|---|---|
| New URLs / day | (given) | 100 M |
| Write QPS | 100M / 86,400 sec | ~1,160 writes/sec |
| Read:Write ratio | Assume 10:1 (read-heavy) | 10:1 |
| Redirects / day | 100M × 10 | 1 Billion |
| Read QPS | 1B / 86,400 sec | ~11,600 reads/sec |
At peak (assuming 3× average): ~3,500 writes/sec and ~35,000 reads/sec. This is well within the range of a horizontally-scaled web service.
Storage Estimates
| Metric | Calculation | Result |
|---|---|---|
| Average URL record size | shortCode (7B) + longUrl (avg 200B) + metadata (~293B) | ~500 bytes |
| Storage / day | 100M × 500B | ~50 GB / day |
| Storage / year | 50 GB × 365 | ~18 TB / year |
| Storage / 5 years | 18 TB × 5 | ~90 TB total |
Cache Estimates
Following the 80/20 rule (Pareto principle): 20% of URLs generate 80% of traffic. We should cache the hottest 20% of URLs.
| Metric | Calculation | Result |
|---|---|---|
| Requests served from cache / day | 1B redirects × 80% (hot URLs receive most traffic) | ~800 M requests |
| Unique URLs to cache (daily) | 100M × 20% | 20 M URLs |
| Cache memory | 20M × 500B | ~10 GB |
10 GB fits easily in a single Redis instance (typical production Redis has 64–256 GB). We can use a small Redis cluster for redundancy.
Short Code Length
We need enough unique codes for 5 years of URLs:
Total URLs in 5 years = 100M × 365 × 5 = 182.5 billion
Base62 characters: a-z (26) + A-Z (26) + 0-9 (10) = 62
6 characters: 62⁶ ≈ 56.8 billion — tight, might run out
7 characters: 62⁷ ≈ 3.5 trillion — ~19× headroom ✓
→ Use 7 characters in Base62 encoding
API Design
We expose a RESTful API. All endpoints require an API key for rate limiting and abuse prevention.
1. Create Short URL
POST /api/v1/shorten
Headers:
Authorization: Bearer <api_key>
Content-Type: application/json
Request Body:
{
"longUrl": "https://example.com/very/long/path?q=system+design",
"customAlias": "my-link", // optional
"expireAt": "2027-01-01T00:00Z" // optional, ISO 8601
}
Response: 201 Created
{
"shortUrl": "https://short.ly/aB3x7Kp",
"shortCode": "aB3x7Kp",
"longUrl": "https://example.com/very/long/path?q=system+design",
"createdAt": "2026-04-15T10:30:00Z",
"expireAt": "2027-01-01T00:00:00Z"
}
Rate Limit Headers:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 97
X-RateLimit-Reset: 1713178800
2. Redirect (the core operation)
GET /{shortCode}
Example: GET /aB3x7Kp
Response: 302 Found (or 301 Moved Permanently — see the 301 vs 302 discussion below)
Location: https://example.com/very/long/path?q=system+design
Error: 404 Not Found (if shortCode doesn't exist or is expired)
3. Get Analytics
GET /api/v1/stats/{shortCode}
Headers:
Authorization: Bearer <api_key>
Response: 200 OK
{
"shortCode": "aB3x7Kp",
"longUrl": "https://example.com/very/long/path?q=system+design",
"createdAt": "2026-04-15T10:30:00Z",
"totalClicks": 15482,
"uniqueClicks": 9273,
"clicksByCountry": { "US": 8201, "IN": 3104, "UK": 1422, ... },
"clicksByDevice": { "mobile": 9812, "desktop": 5100, "tablet": 570 },
"clicksByDay": [
{ "date": "2026-04-15", "clicks": 3201 },
{ "date": "2026-04-16", "clicks": 2847 },
...
]
}
4. Delete Short URL
DELETE /api/v1/urls/{shortCode}
Headers:
Authorization: Bearer <api_key>
Response: 204 No Content
Database Design
Core Tables
┌─────────────────────────────────────────┐
│ urls (primary table) │
├─────────────────────────────────────────┤
│ id BIGINT PK, auto-incr │
│ short_code VARCHAR(10) UNIQUE INDEX │
│ long_url VARCHAR(2048) NOT NULL │
│ user_id BIGINT FK → users.id │
│ created_at DATETIME NOT NULL │
│ expire_at DATETIME NULLABLE │
│ is_active BOOLEAN DEFAULT true │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ users │
├─────────────────────────────────────────┤
│ id BIGINT PK, auto-incr │
│ email VARCHAR(255) UNIQUE │
│ api_key VARCHAR(64) UNIQUE INDEX │
│ tier ENUM('free','pro','ent') │
│ created_at DATETIME NOT NULL │
└─────────────────────────────────────────┘
┌─────────────────────────────────────────┐
│ click_events (analytics) │
├─────────────────────────────────────────┤
│ id BIGINT PK, auto-incr │
│ short_code VARCHAR(10) INDEX │
│ timestamp DATETIME NOT NULL │
│ ip_address VARCHAR(45) │
│ user_agent VARCHAR(512) │
│ referrer VARCHAR(2048) │
│ country VARCHAR(2) │
│ device_type VARCHAR(20) │
└─────────────────────────────────────────┘
SQL vs NoSQL Decision
The core URL lookup is a simple key-value operation: given a short code, return the long URL. This makes NoSQL databases an excellent fit:
| Factor | SQL (PostgreSQL, MySQL) | NoSQL (DynamoDB, Cassandra) |
|---|---|---|
| Access pattern | Works, but B-tree overhead | Optimized for key-value lookup |
| Scale | Vertical + sharding (complex) | Horizontal scaling built-in |
| Consistency | Strong ACID guarantees | Eventual (tunable) |
| Schema flexibility | Rigid schema | Schema-less, easy evolution |
| Analytics queries | Rich SQL joins and aggregations | Limited (needs separate analytics DB) |
Verdict: keep the core shortCode → longUrl mapping in DynamoDB or Cassandra, and ship click events to a separate columnar store (ClickHouse / BigQuery) for analytics. The architecture below does exactly this.
High-Level Architecture
Here is the top-level architecture for our URL shortener:
┌───────────────┐
│ CDN / Edge │
│ (CloudFront) │
└───────┬───────┘
│
┌───────▼───────┐
┌───────────────│ API Gateway │───────────────┐
│ │ (Rate Limiting)│ │
│ └───────┬───────┘ │
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ App Server │ │ App Server │ │ App Server │
│ (Node) │ │ (Node) │ │ (Node) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
       └───────────┬───────────┴───────────┬───────────┘
│ │
┌───────▼───────┐ ┌──────▼──────┐
│ Redis Cache │ │ Message │
│ (Cluster) │ │ Queue │
└───────┬───────┘ │ (Kafka) │
│ └──────┬──────┘
┌───────▼───────┐ │
│ Database │ ┌──────▼──────┐
│ (DynamoDB / │ │ Analytics │
│ Cassandra) │ │ Service │
└───────────────┘ └──────┬──────┘
│
┌──────▼──────┐
│ ClickHouse / │
│ BigQuery │
└─────────────┘
Component responsibilities:
- CDN / Edge: Cache popular redirects at edge locations for sub-10ms response. Also serves static assets (docs, landing page).
- API Gateway: Rate limiting, authentication (API key validation), request routing, SSL termination.
- Application Servers: Stateless — handle URL shortening, redirect logic, and analytics event production. Horizontally scalable.
- Redis Cache: Stores hot URL mappings (shortCode → longUrl) for fast redirects. LRU eviction.
- Database: Persistent storage for all URL mappings. Cassandra or DynamoDB for horizontal scaling.
- Message Queue (Kafka): Decouples the redirect path from analytics. Click events are published to Kafka asynchronously so that redirect latency isn't affected.
- Analytics Service: Consumes click events from Kafka, enriches them (geo-IP lookup, device parsing), and writes to the analytics data warehouse.
Short Code Generation
This is the most critical design decision in a URL shortener. We need to generate a unique, short, URL-safe string for every new URL. There are three main approaches:
Approach 1: Base62 Encoding of Auto-Increment ID
Use a globally unique auto-incrementing counter. Convert the numeric ID to a Base62 string (characters: a-z A-Z 0-9).
ID = 123456789
Alphabet: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789
Base62 encoding:
123456789 ÷ 62 = 1991238 remainder 33 → 'H'
  1991238 ÷ 62 =   32116 remainder 46 → 'U'
    32116 ÷ 62 =     518 remainder  0 → 'a'
      518 ÷ 62 =       8 remainder 22 → 'w'
        8 ÷ 62 =       0 remainder  8 → 'i'
Result: "iwaUH" (read remainders bottom-up)
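A minimal sketch of this encoding in Python (the alphabet ordering follows the example above; any fixed ordering works as long as it never changes):

ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

def base62_encode(num: int) -> str:
    """Convert a non-negative integer ID to its Base62 string."""
    if num == 0:
        return ALPHABET[0]
    chars = []
    while num > 0:
        num, rem = divmod(num, 62)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))  # remainders come out least-significant first

print(base62_encode(123456789))  # -> "iwaUH", matching the worked example above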
Pros: Guaranteed unique (IDs never repeat), no collision checking needed, deterministic. Cons: Sequential codes are predictable (users can enumerate), requires a centralized ID generator (single point of failure), codes grow in length as IDs increase.
Approach 2: Hash + Truncate
Apply a cryptographic hash (MD5 or SHA-256) to the long URL and take the first 7 characters of the Base62-encoded hash.
longUrl = "https://example.com/long/path" hash = MD5(longUrl) = "e4d909c290d0fb1ca068ffaddf22cbd0" shortCode = base62(hash[0:43bits]) = "aB3x7Kp" Collision? → Append user ID or timestamp, re-hash Still collision? → Append counter, re-hash
Pros: No centralized counter, same URL always generates the same short code (idempotent — useful for deduplication). Cons: Collision risk (birthday problem: with 7-character codes the first collision is expected after only ~√(62⁷) ≈ 1.9 million URLs, so at 182B URLs collisions are constant), and collision resolution adds complexity and latency.
Approach 3: Pre-Generated Key Generation Service (KGS) ✓
This is the recommended approach for production systems at scale. A dedicated Key Generation Service pre-generates random unique 7-character keys and stores them in a database.
- KGS pre-generates millions of random 7-character Base62 keys and stores them in a keys table with two columns: key and used (boolean).
- When an application server needs keys, it requests a batch (e.g., 1,000 keys) from KGS. KGS marks those keys as used = true and hands them over.
- The app server stores the batch in memory and allocates keys from it for incoming URL shortening requests (see the sketch after this list).
- When the in-memory batch is exhausted, the app server requests a new batch from KGS.
- If an app server crashes, those unused in-memory keys are simply lost — an acceptable waste given the 3.5 trillion total possible keys.
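A minimal sketch of the app-server side of this protocol, assuming a hypothetical kgs_client exposing a fetch_batch(n) RPC:

import threading

class KeyAllocator:
    """Hands out pre-generated keys from an in-memory batch,
    refilling from KGS when the batch runs dry."""

    def __init__(self, kgs_client, batch_size: int = 1000):
        self.kgs = kgs_client          # hypothetical KGS client with fetch_batch(n)
        self.batch_size = batch_size
        self.keys: list[str] = []
        self.lock = threading.Lock()   # many request threads share one allocator

    def next_key(self) -> str:
        with self.lock:
            if not self.keys:
                # KGS marks these keys used=true before returning them,
                # so a crash here only wastes keys, never reuses them.
                self.keys = self.kgs.fetch_batch(self.batch_size)
            return self.keys.pop()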
Pros: Zero collisions (all keys are pre-generated and unique), no real-time hashing or collision checking, very fast (keys are pre-allocated in memory), horizontally scalable (each app server has its own batch). Cons: Slight key wastage on server crashes, requires maintaining the KGS service, initial key generation takes time.
Approach Comparison
| Criteria | Base62 + Auto-ID | Hash + Truncate | KGS (Recommended) |
|---|---|---|---|
| Collision risk | None ✓ | Medium ✗ | None ✓ |
| Predictability | Sequential ✗ | Random ✓ | Random ✓ |
| Single point of failure | ID generator ✗ | None ✓ | KGS (mitigated) ✓ |
| Latency | Low ✓ | Medium (hash + check) | Very low ✓ |
| Deduplication | No (same URL = different code) | Yes (same URL = same code) ✓ | No (same URL = different code) |
| Scalability | Limited by counter | Good | Excellent ✓ |
301 vs 302 Redirect
When a user accesses a short URL, the server responds with a redirect. The choice between HTTP 301 and 302 is more important than it appears:
| Aspect | 301 — Permanent Redirect | 302 — Temporary Redirect |
|---|---|---|
| Browser caching | Browser caches the redirect. Subsequent visits go directly to long URL. | Browser does NOT cache. Every visit hits your server. |
| Server load | Lower (browsers bypass server after first visit) | Higher (every click hits the server) |
| Analytics accuracy | Lower — cached redirects are invisible to the server | Higher — every click is logged ✓ |
| SEO | Passes link juice to the destination URL | Link juice stays with the short URL |
| URL updates | Dangerous — browsers won't check for updates | Safe — changes take effect immediately ✓ |
We choose 302: analytics (FR-4) and link management (updating destinations) both require that every click reach our servers, and the extra load is exactly what the cache layer is built to absorb.
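As a sketch, here is a redirect handler with the 302 choice baked in (Flask is an arbitrary choice for illustration; get_mapping, is_servable, and emit_click_event are hypothetical helpers sketched in later sections):

from flask import Flask, abort, redirect, request

app = Flask(__name__)

@app.route("/<short_code>")
def follow(short_code: str):
    record = get_mapping(short_code)  # hypothetical cache-aside lookup, sketched below
    if record is None or not is_servable(short_code, record):  # expiry check, sketched below
        abort(404)
    emit_click_event(short_code, request.remote_addr,  # async Kafka produce, sketched below
                     request.headers.get("User-Agent", ""),
                     request.headers.get("Referer", ""))
    # 302, not 301: the browser must return on every visit so each click is observable
    return redirect(record["longUrl"], code=302)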
Caching Strategy
Caching is essential for a read-heavy service (10:1 read-write ratio). Our goal: serve as many redirects from cache as possible to keep p99 latency under 10ms.
Cache-Aside Pattern (Lazy Loading)
- Read path: App server checks Redis first. If cache hit, return immediately. If cache miss, query the database, store the result in Redis with a TTL, then return (sketched after this list).
- Write path: On URL creation, write to DB first, then populate the cache proactively (write-through for newly created URLs since they are likely to be clicked soon).
- Delete/Update path: Invalidate the cache entry, then update the database. Next read will repopulate the cache from DB.
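A minimal sketch of the read path with redis-py (db_lookup is a hypothetical database query; the key format and TTL match the cache configuration below):

import json
import redis

r = redis.Redis(host="localhost", port=6379)  # assumption: connection details elided
CACHE_TTL = 24 * 3600                         # 24 hours, matching the config below

def get_mapping(short_code: str):
    """Cache-aside read path: Redis first, database only on a miss.
    Values follow the cache format {longUrl, expireAt, isActive}."""
    key = f"url:{short_code}"
    cached = r.get(key)
    if cached is not None:
        return json.loads(cached)                    # cache hit: single round trip
    record = db_lookup(short_code)                   # hypothetical DB query on miss
    if record is not None:
        r.setex(key, CACHE_TTL, json.dumps(record))  # repopulate with a TTL
    return record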
Eviction Policy: LRU
Least Recently Used (LRU) is the ideal eviction policy. URLs that haven't been accessed recently are evicted first, naturally keeping the hottest URLs in cache. Redis supports LRU natively via maxmemory-policy allkeys-lru.
Cache Configuration
Redis Configuration:
maxmemory: 10GB (stores ~20M URL records)
maxmemory-policy: allkeys-lru
TTL per entry: 24 hours (re-fetched from DB on next access)
Cache key format: url:{shortCode}
Cache value format: {longUrl, expireAt, isActive}
Cluster:
3 Redis primaries, each with a replica, for high availability
Consistent hashing across the primaries for even key distribution
Cache Warming
On a cold start (e.g., after Redis restart), the cache is empty and all requests hit the database. To mitigate this, we can implement cache warming: pre-load the top 1M most frequently accessed URLs from the analytics data into Redis during startup.
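A sketch of that warming step, assuming the top-N hot records have already been pulled from the analytics store:

import json
import redis

def warm_cache(r: redis.Redis, hot_records: dict, ttl: int = 24 * 3600) -> None:
    """Bulk-load hot shortCode -> record mappings into Redis after a cold start.
    hot_records would come from the analytics store's top-N query (an assumption)."""
    pipe = r.pipeline()
    pending = 0
    for code, record in hot_records.items():
        pipe.setex(f"url:{code}", ttl, json.dumps(record))
        pending += 1
        if pending >= 1000:  # flush in batches so each pipeline stays small
            pipe.execute()
            pipe = r.pipeline()
            pending = 0
    if pending:
        pipe.execute()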
Scaling
Read Path Scaling
The redirect path is read-heavy (~12,000 QPS average, ~35,000 QPS peak). We scale reads through multiple layers:
- CDN / Edge caching: Popular short URLs can be cached at CDN edge locations for sub-5ms response globally.
- Redis cache cluster: 3+ Redis nodes with consistent hashing. Handles ~100K+ reads/sec.
- Database read replicas: For cache misses, read replicas distribute the database load. Eventual consistency is acceptable for URL lookups (a few-hundred-millisecond replication lag is fine).
- Horizontal app server scaling: Stateless app servers behind a load balancer. Auto-scale based on CPU/request count.
Database Sharding
With 90 TB over 5 years, a single database instance won't suffice. We shard by shortCode hash:
shard_id = hash(shortCode) % num_shards

Example with 16 shards:
shortCode "aB3x7Kp" → hash → 0x3A7F → 0x3A7F % 16 = 15 → Shard 15
shortCode "Xk9mLnQ" → hash → 0x1C42 → 0x1C42 % 16 = 2  → Shard 2

Each shard holds ~5.6 TB (90 TB / 16 shards)
Each shard handles ~725 reads/sec (11,600 / 16) and ~73 writes/sec (1,160 / 16)
We use consistent hashing for shard assignment so that adding/removing shards only requires redistributing a fraction of the data (1/N) rather than reshuffling everything.
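A minimal consistent-hash ring in Python (the virtual-node count and shard names are illustrative):

import bisect
import hashlib

class ConsistentHashRing:
    """Maps short codes to shards so that adding a shard moves only ~1/N of keys."""

    def __init__(self, shards: list[str], vnodes: int = 100):
        self.ring: list[tuple[int, str]] = []
        for shard in shards:
            for i in range(vnodes):  # virtual nodes smooth out the distribution
                self.ring.append((self._hash(f"{shard}#{i}"), shard))
        self.ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

    def shard_for(self, short_code: str) -> str:
        point = self._hash(short_code)
        idx = bisect.bisect(self.ring, (point, "")) % len(self.ring)
        return self.ring[idx][1]  # first ring point clockwise of the key

ring = ConsistentHashRing([f"shard-{i}" for i in range(16)])
print(ring.shard_for("aB3x7Kp"))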
Rate Limiting
To prevent abuse (spamming millions of URLs, scraping short codes), we implement rate limiting at multiple levels:
- Per API key: 100 URL creations per minute for free tier, 10,000 for enterprise.
- Per IP address: 50 URL creations per minute for unauthenticated requests.
- Global: Circuit breaker if total QPS exceeds 5× average (DDoS protection).
Implementation: Token bucket or sliding window counter in Redis. Each API key has a counter with a TTL.
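A sketch of the counter-with-TTL variant (the fixed-window form of the counters described above; key naming is illustrative, and a token bucket adds smoother burst handling at the cost of a little more state):

import redis

r = redis.Redis()

def allow_request(api_key: str, limit: int = 100, window_sec: int = 60) -> bool:
    """Fixed-window counter: each API key gets `limit` requests per window.
    The first INCR in a window creates the key; EXPIRE starts the window clock."""
    key = f"ratelimit:{api_key}:{window_sec}"
    count = r.incr(key)
    if count == 1:
        r.expire(key, window_sec)
    return count <= limit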
Analytics Pipeline Scaling
Click events are the highest-volume data. We decouple analytics from the critical redirect path:
Redirect request
│
├──→ [1] Return 302 to user (fast path, <10ms)
│
└──→ [2] Produce click event to Kafka (async, fire-and-forget)
│
▼
Kafka Topic: "click-events"
(partitioned by shortCode for ordering)
│
▼
Spark Streaming / Flink Consumer
│
├──→ Real-time aggregation → Redis counters (total clicks)
└──→ Batch writes → ClickHouse (detailed analytics)
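A sketch of the producer side with kafka-python (broker address and topic wiring are assumptions; acks=0 makes the send fire-and-forget, as the fast path above requires):

import json
import time

from kafka import KafkaProducer  # kafka-python; broker address below is an assumption

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
    acks=0,  # fire-and-forget: never block the redirect path on the broker
)

def emit_click_event(short_code: str, ip: str, user_agent: str, referrer: str):
    """Publish the raw click; enrichment (geo-IP, UA parsing) happens downstream."""
    event = {"shortCode": short_code, "ts": time.time(), "ip": ip,
             "ua": user_agent, "referrer": referrer}
    # key=shortCode keeps each code's events in one partition, preserving order
    producer.send("click-events", key=short_code.encode(), value=event)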
Deep Dive Topics
Custom Aliases
Users may want vanity URLs like short.ly/my-brand. Handling custom aliases requires:
- Uniqueness check: Before accepting a custom alias, query the database to ensure it doesn't already exist. Use an atomic putIfAbsent operation (DynamoDB conditional write or Cassandra lightweight transaction) — see the sketch after this list.
- Validation: Custom aliases must match /^[a-zA-Z0-9\-_]{3,30}$/ — URL-safe, 3-30 characters, no special symbols.
- Reserved words: Block aliases like "api", "admin", "health", "stats", "login" that conflict with system routes.
- Premium feature: Custom aliases can be gated behind paid tiers to prevent squatting.
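A sketch of the atomic uniqueness check as a DynamoDB conditional write (table and attribute names are assumptions):

import boto3
from botocore.exceptions import ClientError

table = boto3.resource("dynamodb").Table("urls")  # table name is an assumption

def claim_alias(alias: str, long_url: str, user_id: str) -> bool:
    """Atomically claim a custom alias; fails if another write got there first."""
    try:
        table.put_item(
            Item={"short_code": alias, "long_url": long_url, "user_id": user_id},
            ConditionExpression="attribute_not_exists(short_code)",
        )
        return True
    except ClientError as e:
        if e.response["Error"]["Code"] == "ConditionalCheckFailedException":
            return False  # alias already taken
        raise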
URL Expiration
Expired URLs should return 404 instead of redirecting. Two strategies:
- Lazy expiration: Check expire_at during the redirect lookup. If expired, return 404 and delete asynchronously. Simple but leaves expired records in the database.
- Active cleanup: A background cron job runs every hour, querying for URLs where expire_at < NOW() and deleting them in batches. This reclaims storage and allows short codes to be recycled.
Best practice: use both. Lazy expiration ensures correctness on every request; active cleanup keeps the database tidy.
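The lazy half is just a guard during the redirect lookup. A sketch, with the record following the cache value format above and a hypothetical async-delete hook:

from datetime import datetime, timezone

def is_servable(short_code: str, record: dict) -> bool:
    """Lazy expiration: decide at read time whether a mapping may redirect.
    `record` follows the cache value format {longUrl, expireAt, isActive}."""
    if not record.get("isActive", True):
        return False
    expire_at = record.get("expireAt")  # ISO 8601 string or None
    if expire_at is not None:
        expires = datetime.fromisoformat(expire_at.replace("Z", "+00:00"))
        if expires < datetime.now(timezone.utc):
            schedule_async_delete(short_code)  # hypothetical cleanup enqueue
            return False
    return True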
Analytics Pipeline
The analytics pipeline processes billions of click events per day:
Click Event → Kafka → Stream Processing → Analytics Store

Enrichment steps:
1. Geo-IP lookup → country, city, region (MaxMind GeoIP)
2. User-Agent parse → browser, OS, device type (ua-parser)
3. Referrer classify → search, social, direct, email

Aggregation windows:
- Per-minute → real-time dashboard (Redis Sorted Sets)
- Per-hour → time-series DB (InfluxDB / TimescaleDB)
- Per-day → data warehouse (ClickHouse / BigQuery)
Abuse Prevention
URL shorteners are frequent targets for abuse:
- Spam/phishing: Check submitted URLs against Google Safe Browsing API and internal blocklists before shortening.
- Malware distribution: Scan destination URLs periodically. Flag and disable links that start serving malware after initial creation.
- Link enumeration: Random short codes (from KGS) prevent sequential guessing. Rate-limit redirect requests per IP.
- DDoS via redirect: Rate-limit at API Gateway. Use CDN-level DDoS protection (Cloudflare, AWS Shield).
- Click fraud: Deduplicate clicks by IP + user-agent + time window. Flag anomalous patterns (1000 clicks/sec from one IP).
Complete Architecture Diagram
The full system design brings together every component discussed above: CDN edge caching, API gateway rate limiting, stateless app servers, KGS, the Redis cache cluster, sharded key-value storage, and the Kafka analytics pipeline.
Summary of Key Design Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Short code generation | KGS | Zero collisions, fast allocation, scalable |
| Code length | 7 characters (Base62) | 3.5T possible codes, 19× headroom for 5 years |
| Redirect type | 302 (Temporary) | Enables analytics on every click |
| Database | DynamoDB / Cassandra | Key-value optimized, horizontal scaling |
| Cache | Redis (LRU, 10 GB) | Sub-5ms reads, 80/20 rule |
| Analytics | Kafka → Spark → ClickHouse | Async, doesn't affect redirect latency |
| Sharding | Consistent hashing on shortCode | Even distribution, minimal reshuffling |
Follow-Up Questions You Might Get
Q: What happens if two users shorten the same long URL?
With KGS, each submission gets a new short code. If deduplication is important, maintain a secondary index (longUrl → shortCode) and check it before generating a new key. Trade-off: extra lookup on every write.
Q: What if KGS goes down?
Each app server has a pre-fetched batch of keys in memory (e.g., 10K keys). Even if KGS is unavailable, servers can continue shortening URLs until their batch is exhausted. KGS itself should be replicated across availability zones.
Q: How do you prevent a user from creating billions of URLs?
Per-user rate limiting (token bucket in Redis, as in the Rate Limiting section). Free tier: 100 URL creations per minute; enterprise: 10,000 per minute. API key tied to billing tier.
Q: What about international / Unicode URLs?
Percent-encode the long URL per RFC 3986 before storing. The short code remains ASCII-only (Base62). Display the decoded URL in the analytics dashboard.
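A sketch of that storage-side normalization (note that non-ASCII hostnames are a separate concern, handled by IDNA/punycode rather than percent-encoding):

from urllib.parse import quote

long_url = "https://example.com/articles/système/设计?q=müller"
# Encode non-ASCII bytes but leave RFC 3986 reserved characters intact
stored = quote(long_url, safe=":/?#[]@!$&'()*+,;=%")
print(stored)  # https://example.com/articles/syst%C3%A8me/%E8%AE%BE%E8%AE%A1?q=m%C3%BCller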
Q: How do you monitor the system?
Metrics: redirect p50/p95/p99 latency, cache hit ratio, KGS key pool size, Kafka consumer lag, error rates. Alerts on: cache hit ratio < 70%, redirect p99 > 200ms, KGS pool < 1M keys remaining.