High Level Design Series · Real-World Designs · Part 44 of 70

Design: Pastebin

Problem Statement

Pastebin is a web service that allows users to store and share plain-text or code snippets via a unique URL. Users paste content into a text area, receive a short link, and anyone with that link can read the content. Pastes can optionally expire, be password-protected, or be made private to the creator.

The core challenge is deceptively simple — accept text, store it, and return a link. But when you're handling 10 million pastes per day with a read-heavy workload, the system must manage massive storage volumes, serve reads with sub-100ms latency, and handle abuse at scale. The design decisions around where to store content (database vs. object storage), how to generate unique keys, how to expire old pastes efficiently, and how to cache hot content are what make this a rich system design problem.

Why Pastebin? This problem sits at the intersection of URL shortening, object storage, and content delivery. It tests your understanding of blob storage, caching layers, key generation, and TTL management — all fundamental building blocks in distributed systems.

Popular services in this space include Pastebin.com, GitHub Gist, Hastebin, and Ghostbin. They share a common architecture but differ in features like collaboration, syntax highlighting, revision history, and monetization models.

Requirements

Functional Requirements

  1. Create paste: Users can upload text/code and receive a unique short URL (e.g., pastebin.com/aB3kX9)
  2. Read paste: Anyone with the URL can view the content. The system renders the text with optional syntax highlighting
  3. Delete paste: Creators can delete their own pastes. The system also auto-deletes expired pastes
  4. Custom alias: Users may optionally specify a custom URL slug (e.g., pastebin.com/my-config)
  5. Syntax highlighting: Support language-based syntax highlighting for at least 50 languages
  6. Expiration: Pastes can have a TTL — 10 minutes, 1 hour, 1 day, 1 week, 1 month, or never
  7. Visibility: Public pastes are listed and searchable; unlisted pastes are only accessible via direct URL; private pastes require authentication
  8. User accounts (optional): Registered users can manage their pastes — view history, delete, edit

Non-Functional Requirements

  1. High availability: The system must be available 99.9% of the time (≤ 8.76 hours downtime/year)
  2. Low read latency: P99 read latency under 100ms for cached content, under 300ms for cold reads
  3. Durability: Once a paste is created, it must not be lost until it expires or is deleted
  4. Scalability: Handle 10M new pastes/day with a 5:1 read-to-write ratio
  5. Consistency: Eventual consistency is acceptable — a paste may take a few seconds to propagate to all read replicas
  6. Content size limit: Max 10 MB per paste (average ~10 KB)

Out of Scope

Capacity Estimation

Back-of-the-envelope calculations set the guardrails for our architecture. Let's work through the numbers methodically.

Traffic

Metric              Calculation     Value
──────────────────────────────────────────────────
New pastes / day    Given           10 M
Writes / second     10M / 86,400    ~116 writes/s
Read:Write ratio    Given           5:1
Reads / day         10M × 5         50 M
Reads / second      50M / 86,400    ~580 reads/s

At peak (assume 3× average), we need to handle ~350 writes/s and ~1,740 reads/s. These are comfortable numbers for a horizontally-scaled system with caching.
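
These figures are easy to sanity-check in a few lines. A quick back-of-the-envelope script — the inputs and the 3× peak multiplier are simply the assumptions stated above:

# Sanity-check the traffic estimates above (inputs are the stated assumptions).
WRITES_PER_DAY = 10_000_000
READ_WRITE_RATIO = 5          # 5 reads for every write
PEAK_MULTIPLIER = 3           # assumed peak-to-average ratio
SECONDS_PER_DAY = 86_400

writes_per_sec = WRITES_PER_DAY / SECONDS_PER_DAY
reads_per_sec = writes_per_sec * READ_WRITE_RATIO
print(f"writes/s: avg ~{writes_per_sec:.0f}, peak ~{writes_per_sec * PEAK_MULTIPLIER:.0f}")
print(f"reads/s:  avg ~{reads_per_sec:.0f}, peak ~{reads_per_sec * PEAK_MULTIPLIER:.0f}")
# writes/s: avg ~116, peak ~347    reads/s: avg ~579, peak ~1736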

Storage

Metric                      Calculation               Value
────────────────────────────────────────────────────────────────────
Average paste size          Given                     10 KB
Content storage / day       10M × 10 KB               ~100 GB/day
Content storage / year      100 GB/day × 365          ~36.5 TB/year
Content storage / 5 years   36.5 TB × 5               ~182 TB
Total pastes in 5 years     10M × 365 × 5             ~18.25 B
Metadata in 5 years         18.25B × ~500 bytes       ~9 TB
                            (ID, user, lang, expiry,
                             timestamps, visibility)

Key insight: 182 TB of content over 5 years is far too much for a relational database, but trivial for object storage like Amazon S3 (which can hold exabytes). This is the fundamental reason we separate metadata from content: metadata goes into a database, content goes into S3/blob storage.

Bandwidth

Direction          Calculation             Bandwidth
──────────────────────────────────────────────────────
Ingress (writes)   116 writes/s × 10 KB    ~1.16 MB/s
Egress (reads)     580 reads/s × 10 KB     ~5.8 MB/s

These bandwidth numbers are modest. Even with 3× peak multiplier, we're under 20 MB/s egress — well within the capacity of a single CDN edge node. A CDN will absorb the vast majority of read traffic for popular pastes.

Key Length

We need unique keys for 18.25 billion pastes over 5 years. Using Base62 (a–z, A–Z, 0–9):

A 6-character Base62 key gives us 56.8B combinations — more than 3× the 18.25B pastes we expect. Adding 2 characters (8 total) provides a 12,000× safety margin. We'll use 8 characters for comfortable headroom and negligible collision probability.
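
The key-space math is worth verifying, since the whole scheme rests on it. A small check — the 18.25 B figure comes from the storage estimate above:

ALPHABET_SIZE = 62                    # a-z, A-Z, 0-9
EXPECTED_PASTES = 18_250_000_000      # 10M/day × 365 × 5 years

for key_len in (6, 7, 8):
    keyspace = ALPHABET_SIZE ** key_len
    print(f"{key_len} chars: {keyspace:.2e} keys, "
          f"{keyspace / EXPECTED_PASTES:,.0f}× headroom")
# 6 chars → ~3× headroom, 8 chars → ~12,000× headroom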

API Design

We'll expose a RESTful API. Authentication is via API keys for programmatic access or session tokens for web users.

Create Paste

POST /api/v1/pastes
Content-Type: application/json
Authorization: Bearer <api_key>  (optional for anonymous pastes)

{
  "content":     "def hello():\n    print('Hello, world!')",
  "title":       "My Python Snippet",       // optional
  "language":    "python",                   // optional, for syntax highlighting
  "expiration":  "1d",                       // 10m, 1h, 1d, 1w, 1m, never
  "visibility":  "unlisted",                 // public, unlisted, private
  "custom_alias": "my-snippet",              // optional custom URL slug
  "password":    "s3cret"                    // optional password protection
}

Response (201 Created):
{
  "id":          "aB3kX9Qm",
  "url":         "https://pastebin.com/aB3kX9Qm",
  "title":       "My Python Snippet",
  "language":    "python",
  "visibility":  "unlisted",
  "expires_at":  "2026-04-16T12:00:00Z",
  "created_at":  "2026-04-15T12:00:00Z",
  "size_bytes":  42
}

Read Paste

GET /api/v1/pastes/{paste_id}
Authorization: Bearer <api_key>  (required for private pastes)

Response (200 OK):
{
  "id":          "aB3kX9Qm",
  "title":       "My Python Snippet",
  "content":     "def hello():\n    print('Hello, world!')",
  "language":    "python",
  "visibility":  "unlisted",
  "view_count":  42,
  "created_at":  "2026-04-15T12:00:00Z",
  "expires_at":  "2026-04-16T12:00:00Z"
}

// For password-protected pastes, supply the password with the request:
GET /api/v1/pastes/{paste_id}?password=s3cret
// (Query strings tend to end up in access logs; a request header is the safer channel.)

Delete Paste

DELETE /api/v1/pastes/{paste_id}
Authorization: Bearer <api_key>

Response (204 No Content)

// Only the creator can delete. Attempting to delete
// someone else's paste returns 403 Forbidden.

List User Pastes

GET /api/v1/users/me/pastes?page=1&limit=20
Authorization: Bearer <api_key>

Response (200 OK):
{
  "pastes": [
    { "id": "aB3kX9Qm", "title": "My Python Snippet", ... },
    { "id": "xY7pR2Ln", "title": "Nginx Config", ... }
  ],
  "total": 147,
  "page": 1,
  "limit": 20
}

Rate Limiting Headers

Every response includes rate limiting headers:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1713184800

Database Design

We separate concerns: metadata in a database, content in object storage. This is the most critical design decision for Pastebin.

Why Not Store Content in the Database?

Paste bodies are large, opaque blobs. Keeping them in the database bloats tables and indexes, slows backups and replication, and costs several times more per GB than object storage (see the detailed comparison in the Content Storage section). The metadata row stays small and queryable; the content lives in S3, addressed by content_key.

Paste Metadata Schema (SQL)

CREATE TABLE pastes (
    id              VARCHAR(8) PRIMARY KEY,    -- Base62 unique key
    title           VARCHAR(255),
    user_id         BIGINT,                    -- NULL for anonymous pastes
    language        VARCHAR(50) DEFAULT 'text',
    visibility      ENUM('public','unlisted','private') DEFAULT 'unlisted',
    password_hash   VARCHAR(255),              -- bcrypt hash if password-protected
    content_key     VARCHAR(255) NOT NULL,     -- S3 object key
    size_bytes      INT NOT NULL,
    view_count      BIGINT DEFAULT 0,
    created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    expires_at      DATETIME,                  -- NULL = never expires
    deleted_at      DATETIME,                  -- soft delete
    INDEX idx_user_id (user_id),
    INDEX idx_expires (expires_at),
    INDEX idx_visibility_created (visibility, created_at DESC)
);

User Schema

CREATE TABLE users (
    id              BIGINT AUTO_INCREMENT PRIMARY KEY,
    username        VARCHAR(50) UNIQUE NOT NULL,
    email           VARCHAR(255) UNIQUE NOT NULL,
    password_hash   VARCHAR(255) NOT NULL,
    api_key         VARCHAR(64) UNIQUE NOT NULL,
    tier            ENUM('free','premium') DEFAULT 'free',
    created_at      DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
    INDEX idx_api_key (api_key)
);

Storage Layout

Each paste's content is stored in object storage (S3) with the key derived from the paste ID:

s3://pastebin-content/{shard}/{paste_id}

Examples:
  s3://pastebin-content/aB/aB3kX9Qm      → shard by first 2 chars
  s3://pastebin-content/xY/xY7pR2Ln
  s3://pastebin-content/Z9/Z9mNpQ4w

Sharding the S3 prefix by the first 2 characters prevents hot-prefix issues and distributes requests across partitions. With Base62, this gives us 62² = 3,844 top-level prefixes — excellent distribution.
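
A minimal sketch of the key derivation, assuming the paste ID itself supplies the shard prefix (the bucket name is the one used in the examples above):

def s3_content_key(paste_id: str, bucket: str = "pastebin-content") -> str:
    """Derive the object location for a paste, sharded by its first two characters."""
    shard = paste_id[:2]
    return f"s3://{bucket}/{shard}/{paste_id}"

assert s3_content_key("aB3kX9Qm") == "s3://pastebin-content/aB/aB3kX9Qm"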

Database Choice: SQL vs NoSQL

Factor          SQL (PostgreSQL / MySQL)             NoSQL (DynamoDB / Cassandra)
────────────────────────────────────────────────────────────────────────────────
Schema          Fixed schema, migrations required    Flexible schema, easy to evolve
Queries         Rich queries, JOINs, aggregations    Key-value lookups, limited queries
Write scaling   Vertical, then sharding              Horizontal from day one
Consistency     Strong (ACID)                        Tunable (eventual by default)

Verdict: Either works. SQL suits admin queries and analytics; NoSQL suits pure key-value at extreme scale. We'll use PostgreSQL initially — the metadata is small (~9 TB in 5 years), relational queries are useful, and we can shard later.

High-Level Architecture

The system has five major layers:

  1. CDN layer: CloudFront / Cloudflare caches popular pastes at the edge
  2. API server layer: Stateless application servers behind a load balancer
  3. Cache layer: Redis cluster caches hot paste metadata and content
  4. Metadata store: PostgreSQL with read replicas
  5. Content store: Amazon S3 (or equivalent object storage) for the actual paste text

                        ┌─────────────┐
                        │   Clients   │
                        │ (Web / API) │
                        └──────┬──────┘
                               │
                        ┌──────▼──────┐
                        │     CDN     │ ◄── Caches GET /paste/{id} responses
                        │ CloudFront  │
                        └──────┬──────┘
                               │ cache miss
                        ┌──────▼──────┐
                        │    Load     │
                        │  Balancer   │
                        └──────┬──────┘
                               │
                 ┌─────────────┼─────────────┐
                 │             │             │
          ┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
          │  API Server │ │  API Server │ │  API Server │
          │    (N=3+)   │ │    (N=3+)   │ │    (N=3+)   │
          └──────┬──────┘ └──────┬──────┘ └──────┬──────┘
                 │               │               │
         ┌───────┼───────────────┼───────────────┼───────┐
         │       │               │               │       │
  ┌──────▼──────┐│        ┌──────▼──────┐        │┌──────▼──────┐
  │    Redis    ││        │ PostgreSQL  │        ││   Amazon    │
  │   Cluster   ││        │  (Primary)  │        ││     S3      │
  │  (Cache)    ││        │             │        ││  (Content)  │
  └─────────────┘│        └──────┬──────┘        │└─────────────┘
                 │        ┌──────▼──────┐        │
                 │        │  Read       │        │
                 │        │  Replicas   │        │
                 │        └─────────────┘        │
                 │                               │
          ┌──────▼──────┐                 ┌──────▼──────┐
          │ Key Gen Svc │                 │  Cleanup    │
          │ (pre-gen    │                 │  Worker     │
          │  unique IDs)│                 │ (expiration)│
          └─────────────┘                 └─────────────┘

Create Paste Flow

When a user creates a paste, the request flows through multiple components. Let's trace the complete write path:

  1. Client sends a POST request with paste content, language, expiration, and visibility
  2. API Server validates the request — checks content size (≤ 10 MB), rate limits, and input sanitization
  3. Key Generation Service provides a unique 8-character Base62 key (pre-generated from a pool)
  4. S3 Upload: Content is stored in S3 at key s3://pastebin-content/{shard}/{paste_id}
  5. Metadata Insert: A row is inserted into PostgreSQL with the paste ID, user ID, language, expiry, S3 key, etc.
  6. Cache Warm: The paste metadata + content are written to Redis so the first read is fast
  7. Response: The API returns the paste URL to the client

Why S3 before DB? We write content to S3 first because S3 is more durable (11 nines). If the DB insert fails after S3 upload, we lose a metadata row (which can be retried or cleaned up). If the DB insert succeeds but S3 fails, the paste is broken — the user has a URL pointing to missing content. Always write the hard-to-recover data first.
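
A condensed sketch of that ordering, assuming illustrative client objects (a boto3-style s3, a cursor-like db, redis, and the key buffer from the Key Generation section) — a shape, not a drop-in handler:

import gzip, json, time

def create_paste(content: str, language: str, ttl_seconds: int | None,
                 s3, db, redis, key_buffer) -> str:
    paste_id = key_buffer.take()                  # pre-generated Base62 key from the KGS pool
    content_key = f"{paste_id[:2]}/{paste_id}"    # sharded S3 key
    expires_at = time.time() + ttl_seconds if ttl_seconds else None
    packed = gzip.compress(content.encode("utf-8"))

    # 1. Durable content first: if a later step fails we only leak an orphaned object,
    #    never a URL that points at missing content.
    s3.put_object(Bucket="pastebin-content", Key=content_key, Body=packed)

    # 2. Metadata row, pointing at the already-written object.
    db.execute(
        "INSERT INTO pastes (id, language, content_key, size_bytes, expires_at) "
        "VALUES (%s, %s, %s, %s, to_timestamp(%s))",
        (paste_id, language, content_key, len(content), expires_at))

    # 3. Warm the cache so the creator's first read never touches S3.
    redis.setex(f"paste:content:{paste_id}", 3600, packed)
    redis.setex(f"paste:meta:{paste_id}", 3600,
                json.dumps({"language": language, "content_key": content_key}))
    return paste_id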

Read Paste Flow

Reads are the dominant traffic pattern (5× writes). We optimize the read path with multiple cache layers:

  1. CDN check: If the paste is popular and public, CloudFront serves it from the edge — no request reaches our servers
  2. Redis check: On CDN miss, the API server checks Redis for cached content
  3. Database check: On cache miss, fetch metadata from PostgreSQL (read replica)
  4. S3 fetch: Use the content_key from metadata to fetch the actual content from S3
  5. Cache populate: Store the result in Redis (with TTL) for subsequent reads
  6. Return response: Send the paste content back to the client (CDN may cache it for future requests)

In practice, the CDN and Redis cache will absorb 80–90% of all reads. A Zipfian distribution means a small fraction of pastes receive the majority of views — and those are precisely the ones sitting in cache.

Key Generation

Generating unique, short, URL-safe keys is the same challenge as a URL shortener. Let's evaluate three approaches:

Approach 1: Hash + Truncate

Hash the content (e.g., MD5 or SHA-256), mixed with a timestamp and user ID, and take the first 8 characters of the Base62-encoded hash.

key = base62_encode(md5(content + timestamp + user_id))[:8]
// Example: "aB3kX9Qm"

Drawback: truncating to 8 characters makes collisions possible, so every insert must check for an existing key and retry on conflict — extra round trips on the write path.

Approach 2: Pre-Generated Key Pool (Recommended)

A dedicated Key Generation Service (KGS) pre-generates unique 8-character Base62 keys and stores them in a separate database. When an API server needs a key, it takes one from the pool.

// Key Generation Service
// Pre-generate and store in a two-table system:

CREATE TABLE unused_keys (
    key_value VARCHAR(8) PRIMARY KEY
);

CREATE TABLE used_keys (
    key_value VARCHAR(8) PRIMARY KEY
);

// Batch fetch: API server requests N keys at startup
SELECT key_value FROM unused_keys LIMIT 1000;
// Move to used_keys in a transaction
// API server keeps them in an in-memory buffer

How many keys to pre-generate? At 10M pastes/day, we use 10M keys/day. Pre-generating 100M keys (10 days' worth) takes about 3 minutes and occupies ~800 MB of storage. The KGS can replenish in the background.
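
A sketch of the per-server key buffer described above, assuming the two-table schema from the snippet; the db wrapper (transaction/fetch/executemany) is an illustrative stand-in, not a specific driver API:

from collections import deque

class KeyBuffer:
    """In-memory buffer of pre-generated keys, refilled from the KGS tables in batches."""

    def __init__(self, db, batch_size: int = 1000, low_water: int = 200):
        self.db, self.batch_size, self.low_water = db, batch_size, low_water
        self.keys: deque[str] = deque()

    def take(self) -> str:
        if len(self.keys) <= self.low_water:
            self._refill()
        return self.keys.popleft()

    def _refill(self) -> None:
        # Claim a batch atomically: delete from unused_keys and record in used_keys,
        # so no two API servers can hand out the same key.
        with self.db.transaction():
            rows = self.db.fetch(
                "DELETE FROM unused_keys WHERE key_value IN "
                "(SELECT key_value FROM unused_keys LIMIT %s) RETURNING key_value",
                (self.batch_size,))
            self.db.executemany(
                "INSERT INTO used_keys (key_value) VALUES (%s)",
                [(r[0],) for r in rows])
        self.keys.extend(r[0] for r in rows)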

Approach 3: Snowflake-style ID

Generate a 64-bit unique ID (timestamp + machine ID + sequence) and Base62-encode it. Produces 11-character strings — longer than ideal for a short URL but guaranteed unique without coordination.

Our Choice: Pre-Generated Key Pool

The KGS approach gives us the best trade-off: short keys (8 chars), zero collision probability, no per-request computation, and clean separation of concerns. Each API server fetches a batch of keys on startup and refills when its buffer runs low.

Handling Custom Aliases

When a user requests a custom alias like pastebin.com/my-config (see the validation sketch after this list):

  1. Check if the alias is already taken in the pastes table
  2. If available, use it as the paste ID (skip the KGS)
  3. If taken, return HTTP 409 Conflict
  4. Custom aliases must be 3–30 characters, alphanumeric + hyphens only
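
A minimal sketch of the validation and availability check, assuming the pastes table above; the helper names and db wrapper are illustrative. The existence check is advisory — the PRIMARY KEY on pastes.id is what actually prevents two concurrent requests from claiming the same alias:

import re

ALIAS_RE = re.compile(r"^[A-Za-z0-9-]{3,30}$")

def validate_alias(alias: str) -> None:
    if not ALIAS_RE.fullmatch(alias):
        raise ValueError("alias must be 3-30 characters, alphanumeric and hyphens only")

def alias_available(db, alias: str) -> bool:
    # A duplicate insert that slips past this check still fails on the primary key
    # and maps cleanly to 409 Conflict.
    return db.fetchone("SELECT 1 FROM pastes WHERE id = %s", (alias,)) is None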

Content Storage

The content store is the heart of Pastebin. Let's go deep on why object storage is the right choice and how to optimize it.

Why S3 / Blob Storage?

Property            S3                                     Database (BLOB column)
─────────────────────────────────────────────────────────────────────────────────
Durability          99.999999999% (11 nines)               Depends on backup strategy
Cost per GB/month   $0.023                                 $0.115+ (RDS gp3)
Max object size     5 TB                                   4 GB (MySQL LONGBLOB; ~1 GB effective via max_allowed_packet)
Throughput          5,500 GET/s, 3,500 PUT/s per prefix    Limited by connections
Backups             Cross-region replication built-in      Manual snapshots
CDN integration     Native CloudFront origin               Requires app-layer proxy

Storage Tiers for Cost Optimization

Not all pastes are accessed equally. We can use S3 lifecycle policies to move cold pastes to cheaper tiers:

// S3 Lifecycle Policy
{
  "Rules": [
    {
      "ID": "Transition-to-IA",
      "Status": "Enabled",
      "Transitions": [
        { "Days": 30,  "StorageClass": "STANDARD_IA" },  // $0.0125/GB
        { "Days": 180, "StorageClass": "GLACIER_IR" }     // $0.004/GB
      ]
    }
  ]
}

This reduces our 5-year storage cost from ~$50K/year (all Standard) to ~$12K/year (tiered). A 75% cost reduction with no impact on active pastes.

Compression

Text compresses extremely well. Gzipping (or zstd-compressing) pastes before writing to S3 shrinks the average ~10 KB paste to roughly ~2.5 KB — the figure used in the cache sizing below — cutting storage, egress, and cache memory proportionally. Content is stored compressed and decompressed (or served with Content-Encoding: gzip) on read.
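
A minimal sketch of the round trip with Python's standard gzip module (the helper names are illustrative):

import gzip

def compress_for_storage(content: str) -> bytes:
    # Stored as gzipped bytes in S3; tag the object with Content-Encoding: gzip.
    return gzip.compress(content.encode("utf-8"), compresslevel=6)

def decompress_from_storage(blob: bytes) -> str:
    return gzip.decompress(blob).decode("utf-8")

sample = "def hello():\n    print('Hello, world!')\n" * 200
packed = compress_for_storage(sample)
print(f"{len(sample)} B -> {len(packed)} B")   # highly repetitive text compresses dramatically
assert decompress_from_storage(packed) == sample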

Caching Strategy

With a 5:1 read:write ratio and Zipfian access patterns (a few pastes get most views), caching is critical.

Cache Layer 1: Redis

A Redis cluster sits between the API servers and the database/S3. We cache two things:

  1. Paste metadata: key → serialized metadata (small, ~500 bytes)
  2. Paste content: key → compressed content (avg ~2.5 KB after gzip)

// Redis key scheme
paste:meta:{paste_id}    → JSON metadata     TTL: 1 hour
paste:content:{paste_id} → gzipped content   TTL: 1 hour

// Cache-aside pattern (read path):
content = redis.get("paste:content:" + pasteId)
if content == null:
    metadata = db.query("SELECT * FROM pastes WHERE id = ?", pasteId)
    content = s3.getObject(metadata.content_key)
    redis.setex("paste:content:" + pasteId, 3600, content)
    redis.setex("paste:meta:" + pasteId, 3600, metadata)
return content

Cache Sizing

How much Redis memory do we need? Following the 80/20 rule: 20% of pastes account for 80% of reads.
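
A back-of-the-envelope sizing under an assumed hot window — treat the 7-day window and the 20% hot fraction as assumptions, not measurements:

# Rough Redis sizing: assume the hot set is ~20% of the pastes created in the last 7 days.
PASTES_PER_DAY = 10_000_000
HOT_FRACTION = 0.20
HOT_WINDOW_DAYS = 7
COMPRESSED_CONTENT_BYTES = 2_500      # ~2.5 KB gzipped content
METADATA_BYTES = 500

hot_pastes = PASTES_PER_DAY * HOT_WINDOW_DAYS * HOT_FRACTION
cache_bytes = hot_pastes * (COMPRESSED_CONTENT_BYTES + METADATA_BYTES)
print(f"~{hot_pastes/1e6:.0f}M hot pastes -> ~{cache_bytes/1e9:.0f} GB of cache")
# ~14M hot pastes -> ~42 GB: a small Redis Cluster (a few 16-32 GB nodes) covers it.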

Cache Layer 2: CDN

For public, non-expiring pastes, the CDN (CloudFront/Cloudflare) caches the rendered HTML response at edge nodes:

// CDN cache headers for public pastes
Cache-Control: public, max-age=300, s-maxage=3600
Vary: Accept-Encoding

// CDN cache headers for private/unlisted pastes
Cache-Control: private, no-store
// (CDN will not cache these)

Cache Invalidation

When a paste is deleted or updated:

  1. Delete from Redis: redis.del("paste:meta:" + id, "paste:content:" + id)
  2. Purge from CDN: cloudfront.createInvalidation("/" + pasteId)
  3. For expired pastes, the cache TTL naturally evicts them — no active invalidation needed

Paste Expiration

Expiration is essential to prevent unbounded storage growth. Our 182 TB estimate assumes no expiration — with it, actual storage will be significantly less.

Lazy Expiration (Read-Time Check)

On every read, check if the paste has expired:

metadata = fetchPasteMetadata(pasteId)
if metadata.expires_at != null && metadata.expires_at < now():
    return 404 "Paste expired"
// Serve the paste normally

This is simple and catches all expired pastes on access. But it doesn't reclaim storage — expired pastes still sit in S3 and the database until cleaned up.

Active Expiration (Background Cleanup)

A background Cleanup Worker runs periodically (every 5 minutes) to find and delete expired pastes:

-- Find expired pastes in batches
SELECT id, content_key FROM pastes
WHERE expires_at IS NOT NULL
  AND expires_at < NOW()
  AND deleted_at IS NULL
ORDER BY expires_at ASC
LIMIT 1000;

-- For each expired paste:
-- 1. Delete from S3:     s3.deleteObject(content_key)
-- 2. Delete from Redis:  redis.del("paste:meta:" + id, "paste:content:" + id)
-- 3. Soft-delete in DB:  UPDATE pastes SET deleted_at = NOW() WHERE id = ?

Why Soft Delete?

We use soft deletes (a deleted_at timestamp) rather than hard deletes: they preserve an audit trail for abuse and legal investigations, make accidental deletions recoverable, and let the S3 object and cache entries be cleaned up asynchronously without risking metadata that points at missing content. A periodic job can hard-purge long-deleted rows.

TTL in Redis

When caching a paste with an expiration, set the Redis TTL to min(1 hour, time_until_expiry). This ensures the cache never serves stale content past the paste's expiration.
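
That rule is a one-liner; a sketch assuming a timezone-aware datetime for expires_at:

from datetime import datetime, timezone

def cache_ttl_seconds(expires_at: datetime | None, default_ttl: int = 3600) -> int:
    """Never let a cache entry outlive the paste it holds."""
    if expires_at is None:
        return default_ttl
    remaining = (expires_at - datetime.now(timezone.utc)).total_seconds()
    return max(0, min(default_ttl, int(remaining)))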

Scaling

Let's scale each component for our target: 10M writes/day, 50M reads/day, with 3× peak bursts.

API Servers

API servers are stateless, so they scale horizontally behind the load balancer. At ~350 peak writes/s and ~1,740 peak reads/s — most of which never reach the origin thanks to the CDN and Redis — a handful of modest instances in an auto-scaling group leaves ample headroom.

Database Scaling

PostgreSQL with read replicas handles our workload comfortably: the primary absorbs ~116 writes/s on average (~350 at peak), well within a single instance's capability, while 3–5 asynchronous read replicas serve the metadata lookups that miss the cache.

Database Sharding (When Needed)

When the metadata table grows beyond what a single primary handles comfortably — at ~1.8 TB of metadata per year it passes the 1 TB mark within the first year — shard by paste ID:

// Shard assignment: hash the paste ID
shard_number = hash(paste_id) % num_shards

// With 16 shards:
// Each shard holds ~1.14B pastes (after 5 years)
// Each shard is ~562 GB — comfortable for a single PostgreSQL instance

// Alternatively, the first 2 characters of the paste ID give a natural shard key:
// shard = base62_value(first_2_chars(paste_id)) % num_shards

S3 Scaling

S3 scales automatically — it's one of the most scalable services in AWS. With our prefix-based sharding (first 2 characters), we distribute load across thousands of partitions. No action needed.

Redis Scaling

Run Redis in cluster mode with the key space hashed across a few shards, each with a replica for failover. The hot working set (tens of GB, per the cache sizing above) fits comfortably; rate-limit and view counters share the same cluster.

CDN Scaling

The CDN absorbs the bulk of read traffic. For popular pastes (e.g., shared on social media), the CDN can serve millions of requests/second from edge nodes without any load on our origin servers.

Geographic Distribution

For global users, deploy in multiple regions:

Region          Components          Purpose
────────────────────────────────────────────────────────
us-east-1       Full stack           Primary region
eu-west-1       API + Redis + DB     European users
ap-southeast-1  API + Redis + DB     Asian users
────────────────────────────────────────────────────────
S3: Cross-region replication to all regions
CDN: Global edge network (200+ PoPs)

Abuse Prevention

Pastebin services are notoriously abused for malware distribution, credential dumps, phishing pages, and spam. A robust abuse prevention system is critical.

Rate Limiting

Apply rate limits at multiple levels:

Level        Limit                  Implementation
──────────────────────────────────────────────────────────────
IP-based     10 creates/hour        Redis sliding window counter
User-based   100 creates/hour       Token bucket per API key
Global       50K creates/hour       Circuit breaker at load balancer
Read rate    300 reads/min per IP   CDN-level WAF rule
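
A minimal sketch of the IP-level sliding-window counter from the table, using redis-py sorted sets; the key name and window are illustrative:

import time, uuid

def allow_create(redis, ip: str, limit: int = 10, window_seconds: int = 3600) -> bool:
    """Sliding-window rate limit: one sorted-set member per request, scored by timestamp."""
    key = f"ratelimit:create:{ip}"
    now = time.time()
    pipe = redis.pipeline()
    pipe.zremrangebyscore(key, 0, now - window_seconds)   # drop events outside the window
    pipe.zadd(key, {uuid.uuid4().hex: now})               # record this request
    pipe.zcard(key)                                       # count events still in the window
    pipe.expire(key, window_seconds)                      # let idle keys evict themselves
    _, _, count, _ = pipe.execute()
    return count <= limit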

Spam Detection

New pastes pass a few cheap write-time heuristics — duplicate-content hashing, link density, and known-bad patterns — and anything suspicious is routed into the asynchronous moderation pipeline below.

Content Moderation

// Moderation pipeline
1. User creates paste
2. Paste is stored and available immediately (optimistic)
3. Async: Content is sent to moderation queue
4. Automated checks run:
   a. Regex patterns for credentials, SSNs, credit cards
   b. URL reputation check (Google Safe Browsing API)
   c. Spam classifier score
5. If flagged → paste is hidden, creator notified
6. If score is borderline → human review queue
7. If clean → no action needed

CAPTCHA

Anonymous paste creation requires a CAPTCHA (reCAPTCHA v3 or hCaptcha) to prevent automated spam bots. Authenticated users with good history bypass CAPTCHA.

Reporting

Every paste page includes a "Report Abuse" button. Reports go into a moderation queue with priority based on the number of unique reporters and the paste's view count.

Additional Considerations

Syntax Highlighting

Syntax highlighting is done client-side using a library like Prism.js or highlight.js. The server stores raw text — the browser renders the highlighted version. This keeps the server stateless and avoids storing rendered HTML.

// Client-side rendering
<pre><code class="language-python">
  {{ raw paste content }}
</code></pre>
<script src="prism.js"></script>

Analytics

Track per-paste view counts without hitting the database on every read: each view increments a paste:views:{id} counter in Redis, and the View Count Flusher worker periodically writes the accumulated deltas into the view_count column.
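
A sketch of both halves under that scheme, assuming redis-py and an illustrative db wrapper:

def record_view(redis, paste_id: str) -> None:
    # O(1) increment on the read path; no database write per view.
    redis.incr(f"paste:views:{paste_id}")

def flush_view_counts(redis, db) -> None:
    # Background worker: move accumulated deltas from Redis into PostgreSQL.
    for key in redis.scan_iter(match="paste:views:*", count=1000):
        paste_id = key.decode().rsplit(":", 1)[-1]
        delta = int(redis.getset(key, 0) or 0)     # read the counter and reset it atomically
        if delta:
            db.execute("UPDATE pastes SET view_count = view_count + %s WHERE id = %s",
                       (delta, paste_id))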

Search

For public paste search, use Elasticsearch: an asynchronous indexer consumes newly created public pastes and indexes their title, language, and content; search queries go straight to Elasticsearch and return paste IDs without touching PostgreSQL. Unlisted and private pastes are never indexed.

Monitoring & Alerting

Key metrics to monitor: CDN and Redis cache hit ratios, P99 read/write latency, 4xx/5xx error rates, remaining keys in the KGS pool, cleanup-worker backlog (expired but undeleted pastes), S3 storage growth, and moderation queue depth.

Complete Architecture

Putting it all together, here's the complete architecture with all components and their interactions:

                          ┌──────────────────────────────────────────────┐
                          │              CLIENTS                        │
                          │  Web Browser  │  CLI Tool  │  API Client    │
                          └──────────┬─────────────────┬───────────────┘
                                     │                 │
                          ┌──────────▼─────────────────▼───────────────┐
                          │           CDN (CloudFront)                  │
                          │  • Caches public paste responses            │
                          │  • WAF rules for rate limiting              │
                          │  • SSL termination                          │
                          │  • 200+ global edge locations               │
                          └──────────────────┬─────────────────────────┘
                                             │ cache miss
                          ┌──────────────────▼─────────────────────────┐
                          │        Load Balancer (ALB)                  │
                          │  • Health checks on API servers             │
                          │  • SSL offloading                           │
                          │  • Sticky sessions (optional)               │
                          └────┬──────────┬──────────┬─────────────────┘
                               │          │          │
                       ┌───────▼──┐ ┌─────▼────┐ ┌──▼───────┐
                       │ API Srv 1│ │ API Srv 2│ │ API Srv N│
                       │(stateless│ │(stateless│ │(stateless│
                       └───┬──┬───┘ └──┬──┬────┘ └──┬──┬────┘
                           │  │        │  │         │  │
             ┌─────────────┘  │        │  │         │  └────────────┐
             │                │        │  │         │               │
      ┌──────▼──────┐ ┌──────▼────────▼──▼─────────▼──────┐ ┌─────▼──────┐
      │ Key Gen Svc │ │         Redis Cluster              │ │  Amazon S3 │
      │             │ │  • paste:meta:{id}                 │ │            │
      │ Pre-gen pool│ │  • paste:content:{id}              │ │ /shard/id  │
      │ of 8-char   │ │  • paste:views:{id}                │ │ gzipped    │
      │ Base62 keys │ │  • ratelimit:{ip}                  │ │ content    │
      └─────────────┘ └──────────────┬─────────────────────┘ └────────────┘
                                     │ cache miss
                      ┌──────────────▼─────────────────────┐
                      │      PostgreSQL (Primary)           │
                      │  • pastes table                     │
                      │  • users table                      │
                      └──────────┬─────────────────────────┘
                      ┌──────────▼─────────────────────────┐
                      │     Read Replicas (3-5)             │
                      │  • Serve all read queries           │
                      │  • Async replication from primary   │
                      └────────────────────────────────────┘

    ┌──────────────────────────────────────────────────────────────────────┐
    │                    BACKGROUND WORKERS                                │
    │                                                                      │
    │  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐  ┌────────────┐ │
    │  │  Cleanup    │  │ View Count  │  │  Content    │  │ Analytics  │ │
    │  │  Worker     │  │  Flusher    │  │  Moderator  │  │ Aggregator │ │
    │  │ (expire     │  │ (Redis →    │  │ (spam/abuse │  │ (metrics   │ │
    │  │  pastes)    │  │  Postgres)  │  │  detection) │  │  pipeline) │ │
    │  └─────────────┘  └─────────────┘  └─────────────┘  └────────────┘ │
    └──────────────────────────────────────────────────────────────────────┘

Data Flow Summary

Operation           Components Involved                             Latency Target
────────────────────────────────────────────────────────────────────────────────────
Create paste        API → KGS → S3 → PostgreSQL → Redis             P99 < 200ms
Read (CDN hit)      CDN edge node only                              P99 < 20ms
Read (cache hit)    API → Redis                                     P99 < 50ms
Read (cache miss)   API → Redis (miss) → PostgreSQL → S3 → Redis    P99 < 300ms
Delete paste        API → PostgreSQL → Redis → CDN purge            P99 < 100ms

Trade-Offs & Alternatives

SQL vs DynamoDB for Metadata

We chose PostgreSQL for its rich query capabilities (admin dashboards, analytics, complex filters). DynamoDB would give us automatic sharding and single-digit-ms latency but sacrifices ad-hoc queries. At our scale (116 writes/s average), PostgreSQL is more than adequate.

S3 vs Cassandra for Content

Some designs store paste content in Cassandra (key → blob). This gives sub-10ms reads (faster than S3's 50–100ms) but at much higher operational cost. With Redis caching absorbing 85%+ of reads, the S3 latency is invisible to most users.
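
A quick expected-value check of that claim, with the hit rate and latencies treated as assumed figures:

# Expected read latency with an ~85% combined cache hit rate (assumed figures).
cache_hit_rate = 0.85
cache_latency_ms = 5        # Redis / CDN hit
miss_latency_ms = 80        # metadata lookup + S3 GET on a miss

expected = cache_hit_rate * cache_latency_ms + (1 - cache_hit_rate) * miss_latency_ms
print(f"expected read latency ≈ {expected:.1f} ms")   # ≈ 16 ms, dominated by cache hits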

Separate Content vs Inline

For very small pastes (< 1 KB), storing content directly in the database row (inline) would be faster and simpler. A hybrid approach — inline for < 1 KB, S3 for everything else — optimizes both cases:

if paste.size < 1024:
    metadata.inline_content = paste.content   // store in DB
    metadata.content_key = null
else:
    s3.putObject(content_key, paste.content)  // store in S3
    metadata.inline_content = null
    metadata.content_key = content_key

Encryption at Rest

All S3 content should be encrypted (SSE-S3 or SSE-KMS). For password-protected pastes, consider additional application-layer encryption using the user's password as a key derivation input (AES-256-GCM).
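
One way to realize the application-layer piece, sketched with the cryptography package; the scrypt parameters and the salt+nonce+ciphertext layout are assumptions, not a vetted scheme:

import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
from cryptography.hazmat.primitives.kdf.scrypt import Scrypt

def _derive_key(password: str, salt: bytes) -> bytes:
    return Scrypt(salt=salt, length=32, n=2**14, r=8, p=1).derive(password.encode())

def encrypt_paste(content: bytes, password: str) -> bytes:
    salt, nonce = os.urandom(16), os.urandom(12)
    ciphertext = AESGCM(_derive_key(password, salt)).encrypt(nonce, content, None)
    return salt + nonce + ciphertext            # stored as a single opaque S3 object

def decrypt_paste(blob: bytes, password: str) -> bytes:
    salt, nonce, ciphertext = blob[:16], blob[16:28], blob[28:]
    return AESGCM(_derive_key(password, salt)).decrypt(nonce, ciphertext, None)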

Eventual Consistency Implications

With read replicas and cache layers, a newly created paste might not be visible on a replica for a few seconds. Two things keep this invisible to users: the create path warms Redis before returning (step 6 of the write flow), so the creator's first read is a cache hit, and a read that misses both the cache and the replica can fall back to the primary before returning 404.

Summary

Component            Technology                 Purpose
──────────────────────────────────────────────────────────────────────────────
CDN                  CloudFront / Cloudflare    Cache public pastes at the edge
Load Balancer        AWS ALB                    Distribute traffic to API servers
API Servers          Go / Java / Node.js        Stateless request handling
Key Generation       Custom service + DB        Pre-generated unique 8-char keys
Cache                Redis Cluster              Hot paste metadata + content
Metadata DB          PostgreSQL + replicas      Paste metadata, user accounts
Content Store        Amazon S3                  Paste content (gzipped)
Background Workers   Custom services            Expiration cleanup, moderation, analytics

Key takeaways: (1) Separate metadata from content — databases for small structured data, object storage for large blobs. (2) Pre-generate keys to avoid runtime collision handling. (3) Cache aggressively — CDN for public pastes, Redis for hot content. (4) Use lifecycle policies and compression to manage storage costs. (5) Build abuse prevention from day one — Pastebin is a high-abuse-risk service.