High Level Design Series · Real-World Designs · Part 43 of 70

Design: URL Shortener (TinyURL)

Problem Statement

A URL shortener is a service that takes a long, unwieldy URL and maps it to a short, easy-to-share alias. When a user visits the short URL, the service redirects them to the original long URL. You have probably used one already: bit.ly, tinyurl.com, t.co (Twitter/X), and goo.gl (now deprecated) are all URL shorteners.

For example:

Long URL:  https://www.example.com/articles/2026/04/a-very-long-title-about-system-design?ref=blog&utm_source=twitter
Short URL: https://short.ly/aB3x7Kp

Why do companies build and use URL shorteners?

Despite appearing simple on the surface, a URL shortener at scale is a rich system design problem. It touches key generation, database design, caching, load balancing, analytics pipelines, and abuse prevention — all topics that interviewers love to dig into.

Requirements

Functional Requirements

FR-1 · Shorten URL: Given a long URL, the service generates a unique short URL.
FR-2 · Redirect: When a user accesses the short URL, the service redirects them to the original long URL.
FR-3 · Custom Aliases (optional): Users can optionally provide a custom alias (e.g., short.ly/my-brand).
FR-4 · Analytics: Track click counts, geographic data, referrer, device type per short URL.
FR-5 · Expiration: URLs can have an optional expiry time, after which the short link becomes inactive.

Non-Functional Requirements

NFR-1 · High Availability: The redirect service must be available 99.99% of the time. A short link that doesn't resolve is worse than no link at all.
NFR-2 · Low Latency: Redirects must complete in <100 ms. Users expect instant navigation — any delay and they'll assume the link is broken.
NFR-3 · Scalability: Handle 100 million new URLs per day and roughly a billion redirects per day. The system must scale horizontally.
NFR-4 · Durability: Once a URL mapping is stored, it must never be lost. This is effectively a permanent record.
NFR-5 · Uniqueness: Every generated short code must be unique — no two long URLs should ever map to the same short code (no collisions).

Out of Scope

Back-of-Envelope Estimation

Before designing anything, we need to understand the scale. Let's work through the numbers carefully.

Traffic Estimates

| Metric | Calculation | Result |
|---|---|---|
| New URLs / day | (given) | 100 M |
| Write QPS | 100M / 86,400 sec | ~1,160 writes/sec |
| Read:Write ratio | Assume 10:1 (read-heavy) | 10:1 |
| Redirects / day | 100M × 10 | 1 billion |
| Read QPS | 1B / 86,400 sec | ~11,600 reads/sec |

At peak (assuming 3× average): ~3,500 writes/sec and ~35,000 reads/sec. This is well within the range of a horizontally-scaled web service.

Storage Estimates

| Metric | Calculation | Result |
|---|---|---|
| Average URL record size | shortCode (7B) + longUrl (avg 200B) + metadata (~293B) | ~500 bytes |
| Storage / day | 100M × 500B | ~50 GB / day |
| Storage / year | 50 GB × 365 | ~18 TB / year |
| Storage / 5 years | 18 TB × 5 | ~90 TB total |

Cache Estimates

Following the 80/20 rule (Pareto principle): 20% of URLs generate 80% of traffic. We should cache the hottest 20% of URLs.

| Metric | Calculation | Result |
|---|---|---|
| Requests / day to cache | 11,600 QPS × 86,400 sec × 20% | ~200 M requests |
| Unique URLs to cache (daily) | 100M × 20% | 20 M URLs |
| Cache memory | 20M × 500B | ~10 GB |

10 GB fits comfortably in the memory of a single Redis instance (production nodes are commonly provisioned with tens to hundreds of GB of RAM). We still run a small Redis cluster for redundancy.
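The estimates above can be reproduced with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative, and the inputs (10:1 read ratio, 500-byte records, 20% hot set) are the same assumptions used in the tables.

```python
SECONDS_PER_DAY = 86_400

# Inputs taken from the estimation tables (all of them are assumptions).
new_urls_per_day = 100_000_000
read_write_ratio = 10
record_bytes = 500
hot_fraction = 0.20

write_qps = new_urls_per_day / SECONDS_PER_DAY
read_qps = write_qps * read_write_ratio
storage_per_day_gb = new_urls_per_day * record_bytes / 1e9
storage_5y_tb = storage_per_day_gb * 365 * 5 / 1e3
cache_gb = new_urls_per_day * hot_fraction * record_bytes / 1e9

print(f"write QPS       ~{write_qps:,.0f}")
print(f"read QPS        ~{read_qps:,.0f}")
print(f"storage / day   ~{storage_per_day_gb:.0f} GB")
print(f"storage / 5 yrs ~{storage_5y_tb:.0f} TB")
print(f"cache memory    ~{cache_gb:.0f} GB")
```

Changing a single input (for example a 100:1 read ratio) shows quickly which downstream numbers are sensitive to it.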

Short Code Length

We need enough unique codes for 5 years of URLs:

Total URLs in 5 years = 100M × 365 × 5 = 182.5 billion

Base62 characters: a-z (26) + A-Z (26) + 0-9 (10) = 62

6 characters: 62⁶ =    56.8 billion  (too few: less than 182.5 billion ✗)
7 characters: 62⁷ = 3,521.6 billion  (19× headroom ✓)

→ Use 7 characters in Base62 encoding
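The same sizing can be checked programmatically. The helper below is a hypothetical function written for this post; it simply finds the smallest length whose code space covers the expected number of URLs.

```python
def min_code_length(total_urls: int, alphabet_size: int = 62) -> int:
    """Smallest code length whose code space holds total_urls distinct codes."""
    length = 1
    while alphabet_size ** length < total_urls:
        length += 1
    return length


total_urls = 100_000_000 * 365 * 5           # 182.5 billion over 5 years
length = min_code_length(total_urls)
headroom = 62 ** length / total_urls         # how many times over we are covered

print(length, f"{headroom:.1f}x headroom")
```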

API Design

We expose a RESTful API. All /api/v1 endpoints require an API key for rate limiting and abuse prevention; the public redirect endpoint (GET /{shortCode}) does not.

1. Create Short URL

POST /api/v1/shorten
Headers:
  Authorization: Bearer <api_key>
  Content-Type: application/json

Request Body:
{
  "longUrl":     "https://example.com/very/long/path?q=system+design",
  "customAlias": "my-link",          // optional
  "expireAt":    "2027-01-01T00:00Z" // optional, ISO 8601
}

Response: 201 Created
{
  "shortUrl":   "https://short.ly/aB3x7Kp",
  "shortCode":  "aB3x7Kp",
  "longUrl":    "https://example.com/very/long/path?q=system+design",
  "createdAt":  "2026-04-15T10:30:00Z",
  "expireAt":   "2027-01-01T00:00:00Z"
}

Rate Limit Headers:
  X-RateLimit-Limit: 100
  X-RateLimit-Remaining: 97
  X-RateLimit-Reset: 1713178800

2. Redirect (the core operation)

GET /{shortCode}
Example: GET /aB3x7Kp

Response: 302 Found  (or 301 Moved Permanently for immutable links)
  Location: https://example.com/very/long/path?q=system+design

Error: 404 Not Found (if shortCode doesn't exist or is expired)

3. Get Analytics

GET /api/v1/stats/{shortCode}
Headers:
  Authorization: Bearer <api_key>

Response: 200 OK
{
  "shortCode":  "aB3x7Kp",
  "longUrl":    "https://example.com/very/long/path?q=system+design",
  "createdAt":  "2026-04-15T10:30:00Z",
  "totalClicks": 15482,
  "uniqueClicks": 9273,
  "clicksByCountry": { "US": 8201, "IN": 3104, "GB": 1422, ... },
  "clicksByDevice":  { "mobile": 9812, "desktop": 5100, "tablet": 570 },
  "clicksByDay": [
    { "date": "2026-04-15", "clicks": 3201 },
    { "date": "2026-04-16", "clicks": 2847 },
    ...
  ]
}

4. Delete Short URL

DELETE /api/v1/urls/{shortCode}
Headers:
  Authorization: Bearer <api_key>

Response: 204 No Content

Database Design

Core Tables

┌─────────────────────────────────────────┐
│             urls (primary table)         │
├─────────────────────────────────────────┤
│ id          BIGINT       PK, auto-incr  │
│ short_code  VARCHAR(30)  UNIQUE INDEX    │
│ long_url    VARCHAR(2048) NOT NULL       │
│ user_id     BIGINT       FK → users.id   │
│ created_at  DATETIME     NOT NULL        │
│ expire_at   DATETIME     NULLABLE        │
│ is_active   BOOLEAN      DEFAULT true    │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│             users                        │
├─────────────────────────────────────────┤
│ id          BIGINT       PK, auto-incr  │
│ email       VARCHAR(255) UNIQUE          │
│ api_key     VARCHAR(64)  UNIQUE INDEX    │
│ tier        ENUM('free','pro','ent')     │
│ created_at  DATETIME     NOT NULL        │
└─────────────────────────────────────────┘

┌─────────────────────────────────────────┐
│          click_events (analytics)       │
├─────────────────────────────────────────┤
│ id          BIGINT       PK, auto-incr  │
│ short_code  VARCHAR(30)  INDEX           │
│ timestamp   DATETIME     NOT NULL        │
│ ip_address  VARCHAR(45)                  │
│ user_agent  VARCHAR(512)                 │
│ referrer    VARCHAR(2048)                │
│ country     VARCHAR(2)                   │
│ device_type VARCHAR(20)                  │
└─────────────────────────────────────────┘

SQL vs NoSQL Decision

The core URL lookup is a simple key-value operation: given a short code, return the long URL. This makes NoSQL databases an excellent fit:

| Factor | SQL (PostgreSQL, MySQL) | NoSQL (DynamoDB, Cassandra) |
|---|---|---|
| Access pattern | Works, but B-tree overhead | Optimized for key-value lookup |
| Scale | Vertical + sharding (complex) | Horizontal scaling built-in |
| Consistency | Strong ACID guarantees | Eventual (tunable) |
| Schema flexibility | Rigid schema | Schema-less, easy evolution |
| Analytics queries | Rich SQL joins and aggregations | Limited (needs separate analytics DB) |

Recommended approach: Use DynamoDB or Cassandra for the URL table (optimized for the high-throughput key-value lookup pattern) and a separate analytical data store (ClickHouse, BigQuery, or Redshift) for click analytics. The user table can stay in a traditional SQL database since it has lower traffic and benefits from relational queries.

High-Level Architecture

Here is the top-level architecture for our URL shortener:

                              ┌───────────────┐
                              │   CDN / Edge   │
                              │  (CloudFront)  │
                              └───────┬───────┘
                                      │
                              ┌───────▼───────┐
              ┌───────────────│  API Gateway   │───────────────┐
              │               │ (Rate Limiting)│               │
              │               └───────┬───────┘               │
              │                       │                       │
       ┌──────▼──────┐        ┌──────▼──────┐        ┌──────▼──────┐
       │  App Server  │        │  App Server  │        │  App Server  │
       │    (Node)    │        │    (Node)    │        │    (Node)    │
       └──────┬──────┘        └──────┬──────┘        └──────┬──────┘
              │                       │                       │
              └───────────┬───────────┘───────────┬───────────┘
                          │                       │
                  ┌───────▼───────┐       ┌──────▼──────┐
                  │  Redis Cache   │       │   Message    │
                  │   (Cluster)    │       │   Queue      │
                  └───────┬───────┘       │  (Kafka)     │
                          │               └──────┬──────┘
                  ┌───────▼───────┐              │
                  │   Database     │       ┌──────▼──────┐
                  │ (DynamoDB /    │       │  Analytics   │
                  │  Cassandra)    │       │  Service     │
                  └───────────────┘       └──────┬──────┘
                                                 │
                                          ┌──────▼──────┐
                                          │ ClickHouse / │
                                          │  BigQuery    │
                                          └─────────────┘

Component responsibilities:

Short Code Generation

This is the most critical design decision in a URL shortener. We need to generate a unique, short, URL-safe string for every new URL. There are three main approaches:

Approach 1: Base62 Encoding of Auto-Increment ID

Use a globally unique auto-incrementing counter. Convert the numeric ID to a Base62 string (characters: a-z A-Z 0-9).

ID = 123456789

Base62 encoding:
  123456789 ÷ 62 = 1991238 remainder 33 → 'H'
  1991238   ÷ 62 = 32116   remainder 46 → 'U'
  32116     ÷ 62 = 518     remainder  0 → 'a'
  518       ÷ 62 = 8       remainder 22 → 'w'
  8         ÷ 62 = 0       remainder  8 → 'i'

Result: "iwaUH" (read remainders bottom-up)

Alphabet: abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789

Pros: Guaranteed unique (IDs never repeat), no collision checking needed, deterministic.
Cons: Sequential codes are predictable (users can enumerate them), requires a centralized ID generator (single point of failure), codes grow in length as IDs increase.
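A minimal sketch of the Base62 conversion described in this approach, using the alphabet order given above (a-z, then A-Z, then 0-9). The function names are illustrative, and the decoder is included only to show that the mapping is reversible.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
BASE = len(ALPHABET)  # 62
INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def base62_encode(n: int) -> str:
    """Convert a non-negative integer ID to its Base62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, remainder = divmod(n, BASE)
        digits.append(ALPHABET[remainder])
    # Remainders come out least-significant first, so reverse them.
    return "".join(reversed(digits))


def base62_decode(code: str) -> int:
    """Convert a Base62 string back to the integer ID."""
    n = 0
    for ch in code:
        n = n * BASE + INDEX[ch]
    return n
```

Because the encoding is a plain change of base, consecutive IDs produce consecutive codes, which is exactly the predictability listed under the cons.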

Approach 2: Hash + Truncate

Apply a cryptographic hash (MD5 or SHA-256) to the long URL and take the first 7 characters of the Base62-encoded hash.

longUrl = "https://example.com/long/path"
hash    = MD5(longUrl) = "e4d909c290d0fb1ca068ffaddf22cbd0"   (illustrative digest)
shortCode = base62(first 41 bits of hash) = "aB3x7Kp"          (41 bits, since 2⁴¹ < 62⁷ < 2⁴²)

Collision? → Append user ID or timestamp, re-hash
Still collision? → Append counter, re-hash

Pros: No centralized counter, same URL always generates the same short code (idempotent, useful for deduplication).
Cons: Collision risk. By the birthday bound, a space of ~3.5 trillion codes reaches a 50% chance of at least one collision after only about two million URLs, so at 182B URLs collisions are routine. Collision resolution adds complexity and latency.
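The hash-and-retry loop can be sketched as follows. Several details are assumptions of this sketch rather than part of the design above: the `store` dict stands in for the URL table, the digest is reduced modulo 62⁷ so it always fits in 7 characters, and a collision is resolved by appending a counter to the input and re-hashing.

```python
import hashlib

ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
CODE_LENGTH = 7
CODE_SPACE = 62 ** CODE_LENGTH


def _to_base62(n: int) -> str:
    """Encode n (0 <= n < 62**7) as exactly 7 Base62 characters."""
    chars = []
    for _ in range(CODE_LENGTH):
        n, remainder = divmod(n, 62)
        chars.append(ALPHABET[remainder])
    return "".join(reversed(chars))


def short_code_for(long_url: str, store: dict[str, str], max_attempts: int = 5) -> str:
    """Return a short code for long_url, re-hashing with a counter on collision.

    `store` maps shortCode -> longUrl and stands in for the database.
    """
    candidate = long_url
    for attempt in range(max_attempts):
        digest = hashlib.md5(candidate.encode("utf-8")).digest()
        code = _to_base62(int.from_bytes(digest, "big") % CODE_SPACE)
        existing = store.get(code)
        if existing is None:
            store[code] = long_url
            return code
        if existing == long_url:
            return code  # same URL seen before: idempotent result
        candidate = f"{long_url}#{attempt + 1}"  # collision: salt and re-hash
    raise RuntimeError("could not find a free short code")
```

Every call costs at least one lookup in the store, which is the latency overhead the cons refer to.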

Approach 3: Pre-Generated Key Generation Service (KGS) ✓

This is the recommended approach for production systems at scale. A dedicated Key Generation Service pre-generates random unique 7-character keys and stores them in a database.

How KGS works:
  1. KGS pre-generates millions of random 7-character Base62 keys and stores them in a keys table with two columns: key and used (boolean).
  2. When an application server needs keys, it requests a batch (e.g., 1,000 keys) from KGS. KGS marks those keys as used = true and hands them over.
  3. The app server stores the batch in memory and allocates keys from it for incoming URL shortening requests.
  4. When the in-memory batch is exhausted, the app server requests a new batch from KGS.
  5. If an app server crashes, those unused in-memory keys are simply lost — an acceptable waste given the 3.5 trillion total possible keys.

Pros: Zero collisions (all keys are pre-generated and unique), no real-time hashing or collision checking, very fast (keys are pre-allocated in memory), horizontally scalable (each app server has its own batch).
Cons: Slight key wastage on server crashes, requires maintaining the KGS service, initial key generation takes time.
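The batch hand-off in steps 1 to 5 can be sketched in memory. Both classes are hypothetical stand-ins: a real KGS would persist its pool in the keys table and mark keys as used in a transaction, whereas here a set guarded by a lock plays that role.

```python
import secrets
import threading

ALPHABET = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"


class KeyGenerationService:
    """In-memory stand-in for the KGS and its keys table."""

    def __init__(self, pool_size: int, key_length: int = 7):
        self._lock = threading.Lock()
        self._unused = set()
        # A set discards duplicates, so every pre-generated key is unique.
        while len(self._unused) < pool_size:
            key = "".join(secrets.choice(ALPHABET) for _ in range(key_length))
            self._unused.add(key)

    def allocate_batch(self, size: int) -> list:
        """Hand out up to `size` keys; removing them marks them as used."""
        with self._lock:
            count = min(size, len(self._unused))
            return [self._unused.pop() for _ in range(count)]

    def remaining(self) -> int:
        with self._lock:
            return len(self._unused)


class AppServerKeyCache:
    """Per-server batch of keys; keys held here are lost if the server crashes."""

    def __init__(self, kgs: KeyGenerationService, batch_size: int = 1000):
        self._kgs = kgs
        self._batch_size = batch_size
        self._keys = []

    def next_key(self) -> str:
        if not self._keys:
            self._keys = self._kgs.allocate_batch(self._batch_size)
            if not self._keys:
                raise RuntimeError("KGS key pool exhausted")
        return self._keys.pop()
```

Only `allocate_batch` takes the lock, so contention is paid once per batch rather than once per shortened URL.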

Approach Comparison

| Criteria | Base62 + Auto-ID | Hash + Truncate | KGS (Recommended) |
|---|---|---|---|
| Collision risk | None ✓ | Medium ✗ | None ✓ |
| Predictability | Sequential ✗ | Random ✓ | Random ✓ |
| Single point of failure | ID generator ✗ | None ✓ | KGS (mitigated) ✓ |
| Latency | Low ✓ | Medium (hash + check) | Very low ✓ |
| Deduplication | No (same URL = different code) | Yes (same URL = same code) ✓ | No (same URL = different code) |
| Scalability | Limited by counter | Good | Excellent ✓ |

[Interactive animation: URL Shortening Flow. Steps through the full request flow when a new URL is shortened, showing data moving between components.]

301 vs 302 Redirect

When a user accesses a short URL, the server responds with a redirect. The choice between HTTP 301 and 302 is more important than it appears:

| Aspect | 301 (Permanent Redirect) | 302 (Temporary Redirect) |
|---|---|---|
| Browser caching | Browser caches the redirect. Subsequent visits go directly to the long URL. | Not cached by default. Every visit hits your server. |
| Server load | Lower (browsers bypass the server after the first visit) | Higher (every click hits the server) |
| Analytics accuracy | Lower: cached redirects are invisible to the server | Higher: every click is logged ✓ |
| SEO | Passes link juice to the destination URL | Link juice stays with the short URL |
| URL updates | Dangerous: browsers won't check for updates | Safe: changes take effect immediately ✓ |

Recommendation: Use 302 (Temporary Redirect) as the default, because analytics is a core feature and every click needs to reach our servers. Use 301 only for permanent, immutable links where reducing server load is the top priority (e.g., internal infrastructure links with no analytics requirement).
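As a small illustration of that choice, the helper below builds the status and headers for either redirect type. The function name and the specific Cache-Control values are assumptions of this sketch; the point is only that caching behaviour can be stated explicitly rather than left to browser defaults.

```python
def build_redirect(long_url: str, permanent: bool = False) -> tuple[int, dict[str, str]]:
    """Return (status, headers) for a redirect to long_url."""
    if permanent:
        # 301: allow browsers and proxies to cache the redirect for a year.
        return 301, {"Location": long_url, "Cache-Control": "public, max-age=31536000"}
    # 302: ask caches not to store the response so every click reaches the server.
    return 302, {"Location": long_url, "Cache-Control": "no-store"}


status, headers = build_redirect("https://example.com/very/long/path?q=system+design")
```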

[Interactive animation: Redirect Flow, Cache Hit vs Cache Miss. Compares the fast path (cache hit, ~5ms) with the slow path (cache miss, ~50ms) when a user clicks a short URL.]

Caching Strategy

Caching is essential for a read-heavy service (10:1 read-write ratio). Our goal: serve as many redirects from cache as possible so that the common-case (cache-hit) latency stays under 10ms, well inside the 100 ms budget from NFR-2.

Cache-Aside Pattern (Lazy Loading)

  1. Read path: App server checks Redis first. If cache hit, return immediately. If cache miss, query the database, store the result in Redis with a TTL, then return.
  2. Write path: On URL creation, write to DB first, then populate the cache proactively (write-through for newly created URLs since they are likely to be clicked soon).
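  3. Delete/Update path: Update the database first, then invalidate the cache entry. (Invalidating first leaves a window in which a concurrent read can re-cache the old value.) The next read will repopulate the cache from DB.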
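The three paths can be sketched with an in-memory cache and a dict as the database. `TTLCache` is a hypothetical stand-in for Redis, and the delete path here writes to the database before invalidating the cache, one common ordering that avoids a concurrent read re-caching the old value.

```python
import time
from typing import Optional

CACHE_TTL_SECONDS = 24 * 60 * 60


class TTLCache:
    """Tiny in-memory stand-in for Redis GET / SET with expiry / DEL."""

    def __init__(self):
        self._data = {}

    def get(self, key: str) -> Optional[str]:
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if time.monotonic() >= expires_at:
            del self._data[key]
            return None
        return value

    def set(self, key: str, value: str, ttl_seconds: float) -> None:
        self._data[key] = (value, time.monotonic() + ttl_seconds)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)


def resolve(short_code: str, cache: TTLCache, db: dict) -> Optional[str]:
    """Read path: cache first, then database, then populate the cache."""
    key = f"url:{short_code}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    long_url = db.get(short_code)
    if long_url is not None:
        cache.set(key, long_url, CACHE_TTL_SECONDS)
    return long_url


def create(short_code: str, long_url: str, cache: TTLCache, db: dict) -> None:
    """Write path: database first, then warm the cache for the new URL."""
    db[short_code] = long_url
    cache.set(f"url:{short_code}", long_url, CACHE_TTL_SECONDS)


def delete(short_code: str, cache: TTLCache, db: dict) -> None:
    """Delete path: database first, then invalidate the cache entry."""
    db.pop(short_code, None)
    cache.delete(f"url:{short_code}")
```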

Eviction Policy: LRU

Least Recently Used (LRU) is a good fit as the eviction policy. URLs that haven't been accessed recently are evicted first, naturally keeping the hottest URLs in cache. Redis supports an approximated (sampling-based) LRU natively via maxmemory-policy allkeys-lru.
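For intuition, an exact LRU cache fits in a few lines on top of an ordered dictionary. This is an illustration of the policy only, not of how Redis implements it; the class name is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Exact LRU: the least recently accessed entry is evicted first."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
```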

Cache Configuration

Redis Configuration:
  maxmemory:        10GB (stores ~20M URL records)
  maxmemory-policy: allkeys-lru
  TTL per entry:    24 hours (re-fetched from DB on next access)

Cache key format:   url:{shortCode}
Cache value format:  {longUrl, expireAt, isActive}

Cluster:
  3 Redis nodes (1 primary + 2 replicas) for high availability
  10 GB fits on one primary; if the working set outgrows a node, shard keys across several primaries (Redis Cluster does this with hash slots)

Cache Warming

On a cold start (e.g., after Redis restart), the cache is empty and all requests hit the database. To mitigate this, we can implement cache warming: pre-load the top 1M most frequently accessed URLs from the analytics data into Redis during startup.

Scaling

Read Path Scaling

The redirect path is read-heavy (~12,000 QPS average, ~35,000 QPS peak). We scale reads through multiple layers:

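  1. CDN / Edge caching: Popular short URLs can be cached at CDN edge locations for sub-5ms response globally. Edge-served redirects never reach the app servers, so clicks must be counted from CDN logs or this layer must be limited to links that do not need per-click analytics.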
  2. Redis cache cluster: 3+ Redis nodes with consistent hashing. Handles ~100K+ reads/sec.
  3. Database read replicas: For cache misses, read replicas distribute the database load. Eventual consistency is acceptable for URL lookups (a few-hundred-millisecond replication lag is fine).
  4. Horizontal app server scaling: Stateless app servers behind a load balancer. Auto-scale based on CPU/request count.

Database Sharding

With 90 TB over 5 years, a single database instance won't suffice. We shard by shortCode hash:

shard_id = hash(shortCode) % num_shards

Example with 16 shards:
  shortCode "aB3x7Kp" → hash → 0x3A7F → 0x3A7F % 16 = 15 → Shard 15
  shortCode "Xk9mLnQ" → hash → 0x1C42 → 0x1C42 % 16 = 2  → Shard 2

Each shard holds ~5.6 TB (90 TB / 16 shards)
Each shard handles ~73 writes/sec (1,160 / 16) and up to ~725 reads/sec (11,600 / 16, before caching)

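The modulo formula above is the simplest illustration, but changing num_shards remaps almost every key. In practice we use consistent hashing for shard assignment so that adding or removing a shard redistributes only a fraction of the data (about 1/N) rather than reshuffling everything.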
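A minimal consistent-hash ring shows why only a fraction of keys move when a shard is added. The class name, the use of SHA-256, and the choice of 100 virtual nodes per shard are all assumptions of this sketch.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent-hash ring with virtual nodes for smoother distribution."""

    def __init__(self, shards, vnodes: int = 100):
        self._ring = sorted(
            (_hash(f"{shard}#{i}"), shard) for shard in shards for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, short_code: str) -> str:
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._points, _hash(short_code)) % len(self._ring)
        return self._ring[idx][1]


ring16 = HashRing([f"shard-{i}" for i in range(16)])
ring17 = HashRing([f"shard-{i}" for i in range(17)])
codes = [f"code{i}" for i in range(2000)]
moved = [c for c in codes if ring16.shard_for(c) != ring17.shard_for(c)]
print(f"{len(moved)} of {len(codes)} keys moved after adding one shard")
```

With `hash % num_shards`, going from 16 to 17 shards would instead remap the large majority of keys.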

Rate Limiting

To prevent abuse (spamming millions of URLs, scraping short codes), we implement rate limiting at multiple levels:

Implementation: Token bucket or sliding window counter in Redis. Each API key has a counter with a TTL.
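A token bucket can be sketched in a few lines. This version keeps its state in process memory purely for illustration; in the design above the same two values (token count and last refill time) would live in Redis under the API key. The class and parameter names are hypothetical.

```python
import time
from typing import Optional


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `refill_per_second`."""

    def __init__(self, capacity: int, refill_per_second: float, now: Optional[float] = None):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._tokens = float(capacity)
        self._last = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Consume one token if available; `now` can be injected for testing."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self._last)
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False


# Example: a limit of 100 requests per hour.
bucket = TokenBucket(capacity=100, refill_per_second=100 / 3600)
```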

Analytics Pipeline Scaling

Click events are the highest-volume data. We decouple analytics from the critical redirect path:

Redirect request
       │
       ├──→ [1] Return 302 to user (fast path, <10ms)
       │
       └──→ [2] Produce click event to Kafka (async, fire-and-forget)
                   │
                   ▼
             Kafka Topic: "click-events"
             (partitioned by shortCode for ordering)
                   │
                   ▼
             Spark Streaming / Flink Consumer
                   │
                   ├──→ Real-time aggregation → Redis counters (total clicks)
                   └──→ Batch writes → ClickHouse (detailed analytics)
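The decoupling can be sketched with a bounded in-process queue standing in for the Kafka producer buffer. The function names and the event fields are assumptions; the important property is that the redirect handler never blocks on analytics and drops the event if the buffer is full.

```python
import queue
import time

# Stand-in for the Kafka producer's send buffer.
click_queue = queue.Queue(maxsize=10_000)


def handle_redirect(short_code: str, long_url: str, request_meta: dict):
    """Return the 302 immediately; enqueue the click event without blocking."""
    event = {"shortCode": short_code, "timestamp": time.time(), **request_meta}
    try:
        click_queue.put_nowait(event)
    except queue.Full:
        pass  # fire-and-forget: losing one event is better than a slow redirect
    return 302, {"Location": long_url}


def drain(batch_size: int = 500) -> list:
    """Consumer side: pull up to batch_size events for batch processing."""
    batch = []
    while len(batch) < batch_size:
        try:
            batch.append(click_queue.get_nowait())
        except queue.Empty:
            break
    return batch
```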

Deep Dive Topics

Custom Aliases

Users may want vanity URLs like short.ly/my-brand. Handling custom aliases requires:

  1. Uniqueness check: A custom alias must not already exist. A separate read followed by a write is racy, so use an atomic putIfAbsent operation (DynamoDB conditional write or Cassandra lightweight transaction).
  2. Validation: Custom aliases must match /^[a-zA-Z0-9\-_]{3,30}$/ — URL-safe, 3-30 characters, no special symbols.
  3. Reserved words: Block aliases like "api", "admin", "health", "stats", "login" that conflict with system routes.
  4. Premium feature: Custom aliases can be gated behind paid tiers to prevent squatting.
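The validation, reserved-word, and uniqueness rules above can be combined into one function. `claim_alias` is a hypothetical name, and the dict stands in for the URL table; in a real store the existence check and the write must be a single atomic conditional write.

```python
import re

ALIAS_PATTERN = re.compile(r"^[a-zA-Z0-9\-_]{3,30}$")
RESERVED = frozenset({"api", "admin", "health", "stats", "login"})


def claim_alias(alias: str, long_url: str, store: dict) -> bool:
    """Validate the alias and claim it; return False if it is already taken."""
    if not ALIAS_PATTERN.fullmatch(alias):
        raise ValueError("alias must be 3-30 characters: letters, digits, '-' or '_'")
    if alias.lower() in RESERVED:
        raise ValueError("alias conflicts with a system route")
    # Stand-in for an atomic put-if-absent (conditional write) in the database.
    if alias in store:
        return False
    store[alias] = long_url
    return True
```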

URL Expiration

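Expired URLs should return 404 instead of redirecting. Two strategies: lazy expiration (check the expiry time on every read and treat an expired link as not found) and active cleanup (a background job that periodically removes or deactivates expired records).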

Best practice: use both. Lazy expiration ensures correctness on every request; active cleanup keeps the database tidy.
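Both halves, lazy expiration and active cleanup, can be sketched against an in-memory table. The record layout mirrors the urls table (long_url, expire_at, is_active); the function names are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def resolve_active(record: Optional[dict], now: Optional[datetime] = None) -> Optional[str]:
    """Lazy expiration: checked on every read; None means respond with 404."""
    now = now or datetime.now(timezone.utc)
    if record is None or not record["is_active"]:
        return None
    expire_at = record["expire_at"]
    if expire_at is not None and expire_at <= now:
        return None
    return record["long_url"]


def purge_expired(table: dict, now: Optional[datetime] = None) -> int:
    """Active cleanup: a periodic job that deletes expired rows; returns the count."""
    now = now or datetime.now(timezone.utc)
    expired = [
        code for code, rec in table.items()
        if rec["expire_at"] is not None and rec["expire_at"] <= now
    ]
    for code in expired:
        del table[code]
    return len(expired)
```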

Analytics Pipeline

The analytics pipeline processes roughly a billion click events per day (one per redirect, per the traffic estimates):

Click Event → Kafka → Stream Processing → Analytics Store

Enrichment steps:
  1. Geo-IP lookup     → country, city, region (MaxMind GeoIP)
  2. User-Agent parse  → browser, OS, device type (ua-parser)
  3. Referrer classify → search, social, direct, email

Aggregation windows:
  - Per-minute  → real-time dashboard (Redis Sorted Sets)
  - Per-hour    → time-series DB (InfluxDB / TimescaleDB)
  - Per-day     → data warehouse (ClickHouse / BigQuery)
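The aggregation windows amount to counting the same events under keys of different time granularity. The sketch below does this in memory with counters; the event fields (shortCode, timestamp in epoch seconds) are assumptions, and a stream processor would do the same grouping incrementally.

```python
from collections import Counter
from datetime import datetime, timezone

# strftime patterns that truncate a timestamp to each window size.
WINDOW_FORMATS = {
    "minute": "%Y-%m-%dT%H:%M",
    "hour": "%Y-%m-%dT%H",
    "day": "%Y-%m-%d",
}


def bucket_clicks(events: list) -> dict:
    """Count clicks per (shortCode, window start) for each window size."""
    buckets = {name: Counter() for name in WINDOW_FORMATS}
    for event in events:
        ts = datetime.fromtimestamp(event["timestamp"], tz=timezone.utc)
        for name, fmt in WINDOW_FORMATS.items():
            buckets[name][(event["shortCode"], ts.strftime(fmt))] += 1
    return buckets
```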

Abuse Prevention

URL shorteners are frequent targets for abuse:

[Interactive animation: Key Generation Service (KGS). Shows app servers requesting batches of pre-generated keys from KGS, ensuring zero collisions even under concurrent load.]

Complete Architecture Diagram

The full system design with all components, data flows, and scaling strategies:

[Diagram: URL Shortener, Complete Architecture. Client → CDN / Edge (CloudFront) → API Gateway (Rate Limit · Auth · SSL) → App Servers 1-3. The app servers receive key batches from the Key Generation Service (KGS) and read through the Redis Cache (LRU · 10 GB · 20M URLs); on a miss they fall back to DynamoDB / Cassandra (sharded by shortCode hash, with read replicas). Click events flow asynchronously through Apache Kafka (click-events topic) to the Analytics Service (Spark / Flink), then to ClickHouse / BigQuery and the Analytical Dashboard with real-time counters. Legend: read path, analytics (async), KGS.]

Summary of Key Design Decisions

| Decision | Choice | Rationale |
|---|---|---|
| Short code generation | KGS | Zero collisions, fast allocation, scalable |
| Code length | 7 characters (Base62) | 3.5T possible codes, 19× headroom for 5 years |
| Redirect type | 302 (Temporary) | Enables analytics on every click |
| Database | DynamoDB / Cassandra | Key-value optimized, horizontal scaling |
| Cache | Redis (LRU, 10 GB) | Sub-5ms reads, 80/20 rule |
| Analytics | Kafka → Spark → ClickHouse | Async, doesn't affect redirect latency |
| Sharding | Consistent hashing on shortCode | Even distribution, minimal reshuffling |

Follow-Up Questions You Might Get

Q: How do you handle the same long URL submitted twice?
With KGS, each submission gets a new short code. If deduplication is important, maintain a secondary index (longUrl → shortCode) and check it before generating a new key. Trade-off: extra lookup on every write.

Q: What if KGS goes down?
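Each app server holds a pre-fetched batch of keys in memory (e.g., 10K keys), so servers can keep shortening URLs while KGS is unavailable, until their batch is exhausted. At ~1,160 writes/sec system-wide that is a short grace period, so KGS itself should be replicated across availability zones and batch sizes chosen to cover the expected failover time.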

Q: How do you prevent a user from creating billions of URLs?
Per-user rate limiting (token bucket in Redis). Free tier: 100 URLs/hour. Enterprise: 100K URLs/hour. API key tied to billing tier.

Q: What about international / Unicode URLs?
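Before storing, percent-encode non-ASCII characters in the path and query as UTF-8 per RFC 3986, and convert internationalized domain names to their ASCII (Punycode) form. The short code remains ASCII-only (Base62). Display the decoded URL in the analytics dashboard.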
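A minimal normalization sketch using only the Python standard library. It is an illustration under simplifying assumptions: the function name is hypothetical, userinfo in the URL is dropped, the built-in idna codec is used for the host name, and existing percent-escapes are left untouched.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters that are legal in a path or query and should stay unescaped.
_SAFE_PATH = "/%:@!$&'()*+,;="
_SAFE_QUERY = _SAFE_PATH + "?"


def normalize_url(url: str) -> str:
    """Return an ASCII-only form of url suitable for storage."""
    parts = urlsplit(url)
    # Internationalized host names become Punycode ("xn--..." labels).
    host = parts.hostname.encode("idna").decode("ascii") if parts.hostname else ""
    netloc = host if parts.port is None else f"{host}:{parts.port}"
    # Non-ASCII characters are percent-encoded as UTF-8 bytes.
    path = quote(parts.path, safe=_SAFE_PATH)
    query = quote(parts.query, safe=_SAFE_QUERY)
    fragment = quote(parts.fragment, safe=_SAFE_QUERY)
    return urlunsplit((parts.scheme, netloc, path, query, fragment))


print(normalize_url("https://bücher.example/straße?q=café"))
```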

Q: How do you monitor the system?
Metrics: redirect p50/p95/p99 latency, cache hit ratio, KGS key pool size, Kafka consumer lag, error rates. Alerts on: cache hit ratio < 70%, redirect p99 > 100ms (the NFR-2 budget), KGS pool < 1M keys remaining.