Design: Pastebin
Problem Statement
Pastebin is a web service that allows users to store and share plain-text or code snippets via a unique URL. Users paste content into a text area, receive a short link, and anyone with that link can read the content. Pastes can optionally expire, be password-protected, or be made private to the creator.
The core challenge is deceptively simple — accept text, store it, and return a link. But when you're handling 10 million pastes per day with a read-heavy workload, the system must manage massive storage volumes, serve reads with sub-100ms latency, and handle abuse at scale. The design decisions around where to store content (database vs. object storage), how to generate unique keys, how to expire old pastes efficiently, and how to cache hot content are what make this a rich system design problem.
Popular services in this space include Pastebin.com, GitHub Gist, Hastebin, and Ghostbin. They share a common architecture but differ in features like collaboration, syntax highlighting, revision history, and monetization models.
Requirements
Functional Requirements
- Create paste: Users can upload text/code and receive a unique short URL (e.g., pastebin.com/aB3kX9)
- Read paste: Anyone with the URL can view the content. The system renders the text with optional syntax highlighting
- Delete paste: Creators can delete their own pastes. The system also auto-deletes expired pastes
- Custom alias: Users may optionally specify a custom URL slug (e.g., pastebin.com/my-config)
- Syntax highlighting: Support language-based syntax highlighting for at least 50 languages
- Expiration: Pastes can have a TTL — 10 minutes, 1 hour, 1 day, 1 week, 1 month, or never
- Visibility: Public pastes are listed and searchable; unlisted pastes are only accessible via direct URL; private pastes require authentication
- User accounts (optional): Registered users can manage their pastes — view history, delete, edit
Non-Functional Requirements
- High availability: The system must be available 99.9% of the time (≤ 8.76 hours downtime/year)
- Low read latency: P99 read latency under 100ms for cached content, under 300ms for cold reads
- Durability: Once a paste is created, it must not be lost until it expires or is deleted
- Scalability: Handle 10M new pastes/day with a 5:1 read-to-write ratio
- Consistency: Eventual consistency is acceptable — a paste may take a few seconds to propagate to all read replicas
- Content size limit: Max 10 MB per paste (average ~10 KB)
Out of Scope
- Real-time collaborative editing (Google Docs-style)
- Version control / diff between edits
- File uploads (images, binaries)
- Commenting or social features
Capacity Estimation
Back-of-the-envelope calculations set the guardrails for our architecture. Let's work through the numbers methodically.
Traffic
| Metric | Calculation | Value |
|---|---|---|
| New pastes / day | Given | 10 M |
| Writes / second | 10M / 86,400 | ~116 writes/s |
| Read:Write ratio | Given | 5:1 |
| Reads / day | 10M × 5 | 50 M |
| Reads / second | 50M / 86,400 | ~580 reads/s |
At peak (assume 3× average), we need to handle ~350 writes/s and ~1,740 reads/s. These are comfortable numbers for a horizontally-scaled system with caching.
Storage
| Metric | Calculation | Value |
|---|---|---|
| Average paste size | Given | 10 KB |
| Content storage / day | 10M × 10 KB | ~100 GB/day |
| Content storage / year | 100 GB × 365 | ~36.5 TB/year |
| Content storage / 5 years | 36.5 TB × 5 | ~182 TB |
| Total pastes in 5 years | 10M × 365 × 5 | ~18.25 B |
| Metadata per paste | ~500 bytes (ID, user, lang, expiry, timestamps, visibility) | ~9 TB metadata in 5 years |
Bandwidth
| Direction | Calculation | Bandwidth |
|---|---|---|
| Ingress (writes) | 116 writes/s × 10 KB | ~1.16 MB/s |
| Egress (reads) | 580 reads/s × 10 KB | ~5.8 MB/s |
These bandwidth numbers are modest. Even with 3× peak multiplier, we're under 20 MB/s egress — well within the capacity of a single CDN edge node. A CDN will absorb the vast majority of read traffic for popular pastes.
Key Length
We need unique keys for 18.25 billion pastes over 5 years. Using Base62 (a–z, A–Z, 0–9):
- 6 characters → 62⁶ = 56.8 billion combinations
- 7 characters → 62⁷ = 3.52 trillion combinations
- 8 characters → 62⁸ = 218 trillion combinations
A 6-character Base62 key gives us 56.8B combinations — more than 3× the 18.25B pastes we expect. Adding 2 characters (8 total) provides a 12,000× safety margin. We'll use 8 characters for comfortable headroom and negligible collision probability.
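The headroom figures are easy to sanity-check. The short Python sketch below is illustrative only, with the 5-year paste volume hard-coded from the estimate above; it prints the keyspace and safety margin for each candidate length.

```python
# Base62 keyspace vs. expected paste volume over 5 years
ALPHABET = 62
expected_pastes = 10_000_000 * 365 * 5          # ~18.25 billion

for length in (6, 7, 8):
    keyspace = ALPHABET ** length
    print(f"{length} chars: {keyspace:.2e} keys, "
          f"~{keyspace / expected_pastes:,.0f}x headroom")
# 6 chars: ~3x, 7 chars: ~190x, 8 chars: ~12,000x
```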
API Design
We'll expose a RESTful API. Authentication is via API keys for programmatic access or session tokens for web users.
Create Paste
POST /api/v1/pastes
Content-Type: application/json
Authorization: Bearer <api_key> (optional for anonymous pastes)
{
"content": "def hello():\n print('Hello, world!')",
"title": "My Python Snippet", // optional
"language": "python", // optional, for syntax highlighting
"expiration": "1d", // 10m, 1h, 1d, 1w, 1m, never
"visibility": "unlisted", // public, unlisted, private
"custom_alias": "my-snippet", // optional custom URL slug
"password": "s3cret" // optional password protection
}
Response (201 Created):
{
"id": "aB3kX9Qm",
"url": "https://pastebin.com/aB3kX9Qm",
"title": "My Python Snippet",
"language": "python",
"visibility": "unlisted",
"expires_at": "2026-04-16T12:00:00Z",
"created_at": "2026-04-15T12:00:00Z",
"size_bytes": 42
}
Read Paste
GET /api/v1/pastes/{paste_id}
Authorization: Bearer <api_key> (required for private pastes)
Response (200 OK):
{
"id": "aB3kX9Qm",
"title": "My Python Snippet",
"content": "def hello():\n print('Hello, world!')",
"language": "python",
"visibility": "unlisted",
"view_count": 42,
"created_at": "2026-04-15T12:00:00Z",
"expires_at": "2026-04-16T12:00:00Z"
}
// For password-protected pastes:
GET /api/v1/pastes/{paste_id}?password=s3cret
Delete Paste
DELETE /api/v1/pastes/{paste_id}
Authorization: Bearer <api_key>
Response (204 No Content)
// Only the creator can delete. Attempting to delete
// someone else's paste returns 403 Forbidden.
List User Pastes
GET /api/v1/users/me/pastes?page=1&limit=20
Authorization: Bearer <api_key>
Response (200 OK):
{
"pastes": [
{ "id": "aB3kX9Qm", "title": "My Python Snippet", ... },
{ "id": "xY7pR2Ln", "title": "Nginx Config", ... }
],
"total": 147,
"page": 1,
"limit": 20
}
Rate Limiting Headers
Every response includes rate limiting headers:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 57
X-RateLimit-Reset: 1713184800
- Anonymous users: 10 creates/hour, 60 reads/minute
- Registered users: 100 creates/hour, 300 reads/minute
- Premium users: 1000 creates/hour, 3000 reads/minute
Database Design
We separate concerns: metadata in a database, content in object storage. This is the most critical design decision for Pastebin.
Why Not Store Content in the Database?
- Size: 182 TB of content in 5 years — relational databases struggle with this volume, and storage costs are 5–10× higher than S3
- Performance: Large TEXT/BLOB columns degrade query performance and make backups painfully slow
- Scaling: Sharding a database with large blobs is far harder than sharding metadata-only tables
- Cost: S3 costs ~$0.023/GB/month. RDS storage costs ~$0.115/GB/month — a 5× difference
Paste Metadata Schema (SQL)
CREATE TABLE pastes (
id VARCHAR(8) PRIMARY KEY, -- Base62 unique key
title VARCHAR(255),
user_id BIGINT, -- NULL for anonymous pastes
language VARCHAR(50) DEFAULT 'text',
visibility ENUM('public','unlisted','private') DEFAULT 'unlisted',
password_hash VARCHAR(255), -- bcrypt hash if password-protected
content_key VARCHAR(255) NOT NULL, -- S3 object key
size_bytes INT NOT NULL,
view_count BIGINT DEFAULT 0,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
expires_at DATETIME, -- NULL = never expires
deleted_at DATETIME, -- soft delete
INDEX idx_user_id (user_id),
INDEX idx_expires (expires_at),
INDEX idx_visibility_created (visibility, created_at DESC)
);
User Schema
CREATE TABLE users (
id BIGINT AUTO_INCREMENT PRIMARY KEY,
username VARCHAR(50) UNIQUE NOT NULL,
email VARCHAR(255) UNIQUE NOT NULL,
password_hash VARCHAR(255) NOT NULL,
api_key VARCHAR(64) UNIQUE NOT NULL,
tier ENUM('free','premium') DEFAULT 'free',
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
INDEX idx_api_key (api_key)
);
Storage Layout
Each paste's content is stored in object storage (S3) with the key derived from the paste ID:
s3://pastebin-content/{shard}/{paste_id}
Examples:
s3://pastebin-content/aB/aB3kX9Qm → shard by first 2 chars
s3://pastebin-content/xY/xY7pR2Ln
s3://pastebin-content/Z9/Z9mNpQ4w
Sharding the S3 prefix by the first 2 characters prevents hot-prefix issues and distributes requests across partitions. With Base62, this gives us 62² = 3,844 top-level prefixes — excellent distribution.
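A minimal sketch of the key derivation, assuming the bucket name from the examples above; the shard prefix is simply the first two characters of the paste ID.

```python
BUCKET = "pastebin-content"   # bucket name from the examples above

def content_key(paste_id: str) -> str:
    """Derive the S3 object key: shard prefix = first 2 chars of the paste ID."""
    return f"{paste_id[:2]}/{paste_id}"

# content_key("aB3kX9Qm") == "aB/aB3kX9Qm"
# full location: s3://pastebin-content/aB/aB3kX9Qm
```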
Database Choice: SQL vs NoSQL
| Factor | SQL (PostgreSQL / MySQL) | NoSQL (DynamoDB / Cassandra) |
|---|---|---|
| Schema | Fixed schema, migrations required | Flexible schema, easy to evolve |
| Queries | Rich queries, JOINs, aggregations | Key-value lookups, limited queries |
| Write scaling | Vertical then sharding | Horizontal from day one |
| Consistency | Strong (ACID) | Tunable (eventual by default) |
| Verdict | Either works. SQL for admin queries and analytics; NoSQL for pure key-value at extreme scale. We'll use PostgreSQL initially — the metadata is small (~9 TB in 5 years), relational queries are useful, and we can shard later. | |
High-Level Architecture
The system has five major layers:
- CDN layer: CloudFront / Cloudflare caches popular pastes at the edge
- API server layer: Stateless application servers behind a load balancer
- Cache layer: Redis cluster caches hot paste metadata and content
- Metadata store: PostgreSQL with read replicas
- Content store: Amazon S3 (or equivalent object storage) for the actual paste text
┌─────────────┐
│ Clients │
│ (Web / API) │
└──────┬──────┘
│
┌──────▼──────┐
│ CDN │ ◄── Caches GET /paste/{id} responses
│ CloudFront │
└──────┬──────┘
│ cache miss
┌──────▼──────┐
│ Load │
│ Balancer │
└──────┬──────┘
│
┌─────────────┼─────────────┐
│ │ │
┌──────▼──────┐ ┌──────▼──────┐ ┌──────▼──────┐
│ API Server │ │ API Server │ │ API Server │
│ (N=3+) │ │ (N=3+) │ │ (N=3+) │
└──────┬──────┘ └──────┬──────┘ └──────┬──────┘
│ │ │
┌───────┼───────────────┼───────────────┼───────┐
│ │ │ │ │
┌──────▼──────┐│ ┌──────▼──────┐ │┌──────▼──────┐
│ Redis ││ │ PostgreSQL │ ││ Amazon │
│ Cluster ││ │ (Primary) │ ││ S3 │
│ (Cache) ││ │ │ ││ (Content) │
└─────────────┘│ └──────┬──────┘ │└─────────────┘
│ ┌──────▼──────┐ │
│ │ Read │ │
│ │ Replicas │ │
│ └─────────────┘ │
│ │
┌──────▼──────┐ ┌──────▼──────┐
│ Key Gen Svc │ │ Cleanup │
│ (pre-gen │ │ Worker │
│ unique IDs)│ │ (expiration)│
└─────────────┘ └─────────────┘
Create Paste Flow
When a user creates a paste, the request flows through multiple components. Let's trace the complete write path:
- Client sends a POST request with paste content, language, expiration, and visibility
- API Server validates the request — checks content size (≤ 10 MB), rate limits, and input sanitization
- Key Generation Service provides a unique 8-character Base62 key (pre-generated from a pool)
- S3 Upload: Content is stored in S3 at key s3://pastebin-content/{shard}/{paste_id}
- Metadata Insert: A row is inserted into PostgreSQL with the paste ID, user ID, language, expiry, S3 key, etc.
- Cache Warm: The paste metadata + content are written to Redis so the first read is fast
- Response: The API returns the paste URL to the client
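To make the write path concrete, here is a minimal sketch of steps 2 through 7 under a few assumptions: key_buffer is the pre-generated key pool described in the Key Generation section, and s3, db, and redis_client are injected boto3, database-cursor, and redis-py clients rather than real configuration.

```python
import gzip

MAX_SIZE = 10 * 1024 * 1024   # 10 MB limit from the requirements

def create_paste(req: dict, key_buffer, s3, db, redis_client) -> dict:
    """Write path sketch: validate -> claim key -> S3 put -> DB insert -> cache warm."""
    content = req["content"].encode("utf-8")
    if len(content) > MAX_SIZE:
        raise ValueError("paste exceeds 10 MB limit")

    paste_id = key_buffer.next_key()               # pre-generated Base62 key
    object_key = f"{paste_id[:2]}/{paste_id}"      # shard by first 2 characters
    body = gzip.compress(content)                  # compress before upload
    s3.put_object(Bucket="pastebin-content", Key=object_key, Body=body)

    db.execute(
        "INSERT INTO pastes (id, user_id, language, visibility, content_key, size_bytes, expires_at) "
        "VALUES (%s, %s, %s, %s, %s, %s, %s)",
        (paste_id, req.get("user_id"), req.get("language", "text"),
         req.get("visibility", "unlisted"), object_key, len(content), req.get("expires_at")))

    redis_client.setex(f"paste:content:{paste_id}", 3600, body)   # cache warm
    return {"id": paste_id, "url": f"https://pastebin.com/{paste_id}"}
```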
Read Paste Flow
Reads are the dominant traffic pattern (5× writes). We optimize the read path with multiple cache layers:
- CDN check: If the paste is popular and public, CloudFront serves it from the edge — no request reaches our servers
- Redis check: On CDN miss, the API server checks Redis for cached content
- Database check: On cache miss, fetch metadata from PostgreSQL (read replica)
- S3 fetch: Use the content_key from metadata to fetch the actual content from S3
- Cache populate: Store the result in Redis (with TTL) for subsequent reads
- Return response: Send the paste content back to the client (CDN may cache it for future requests)
In practice, the CDN and Redis cache will absorb 80–90% of all reads. A Zipfian distribution means a small fraction of pastes receive the majority of views — and those are precisely the ones sitting in cache.
Key Generation
Generating unique, short, URL-safe keys is the same challenge as a URL shortener. Let's evaluate three approaches:
Approach 1: Hash + Truncate
Hash the content (e.g., MD5 or SHA-256) and take the first 8 characters of the Base62-encoded hash.
key = base62_encode(md5(content + timestamp + user_id))[:8]
// Example: "aB3kX9Qm"
- Pros: Simple, deterministic for same content
- Cons: Collisions possible (birthday paradox — ~50% chance of collision after 62⁴ = 14.7M keys with 8 chars). Need collision handling loop
Approach 2: Pre-Generated Key Pool (Recommended)
A dedicated Key Generation Service (KGS) pre-generates unique 8-character Base62 keys and stores them in a separate database. When an API server needs a key, it takes one from the pool.
-- Key Generation Service
-- Pre-generate keys and store them in a two-table system:
CREATE TABLE unused_keys (
key_value VARCHAR(8) PRIMARY KEY
);
CREATE TABLE used_keys (
key_value VARCHAR(8) PRIMARY KEY
);
-- Batch fetch: an API server claims N keys at startup. Locking the rows
-- keeps concurrent servers from receiving the same keys:
SELECT key_value FROM unused_keys LIMIT 1000 FOR UPDATE SKIP LOCKED;
-- In the same transaction, insert the claimed keys into used_keys and delete them
-- from unused_keys; the API server keeps them in an in-memory buffer
- Pros: Zero collisions (keys are unique by construction), very fast (in-memory buffer), no runtime computation
- Cons: Extra service to manage, need to pre-populate keys
Approach 3: Snowflake-style ID
Generate a 64-bit unique ID (timestamp + machine ID + sequence) and Base62-encode it. Produces 11-character strings — longer than ideal for a short URL but guaranteed unique without coordination.
Our Choice: Pre-Generated Key Pool
The KGS approach gives us the best trade-off: short keys (8 chars), zero collision probability, no per-request computation, and clean separation of concerns. Each API server fetches a batch of keys on startup and refills when its buffer runs low.
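On the API-server side, the key pool can be as simple as the sketch below, assuming a hypothetical fetch_batch callable that runs the claim transaction shown above against the KGS database.

```python
import threading
from collections import deque

class KeyBuffer:
    """In-memory buffer of pre-generated paste IDs, refilled from the KGS."""

    def __init__(self, fetch_batch, batch_size: int = 1000, low_watermark: int = 100):
        self.fetch_batch = fetch_batch        # callable(batch_size) -> list of unused keys
        self.batch_size = batch_size
        self.low_watermark = low_watermark
        self.keys = deque()
        self.lock = threading.Lock()

    def next_key(self) -> str:
        with self.lock:
            if len(self.keys) <= self.low_watermark:
                # Refill before the buffer runs dry; the KGS marks these keys as used
                self.keys.extend(self.fetch_batch(self.batch_size))
            return self.keys.popleft()
```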
Handling Custom Aliases
When a user requests a custom alias like pastebin.com/my-config:
- Check if the alias is already taken in the pastes table
- If available, use it as the paste ID (skip the KGS)
- If taken, return HTTP 409 Conflict
- Custom aliases must be 3–30 characters, alphanumeric + hyphens only
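A small sketch of the alias checks above; the exists callable stands in for the lookup against the pastes table, and the regex simply encodes the 3-30 character rule.

```python
import re

ALIAS_RE = re.compile(r"^[A-Za-z0-9-]{3,30}$")   # 3-30 chars, alphanumeric + hyphens

class AliasTaken(Exception):
    """Surfaced to the client as HTTP 409 Conflict."""

def validate_alias(alias: str, exists) -> str:
    # `exists` is any callable that checks the pastes table for the alias
    if not ALIAS_RE.fullmatch(alias):
        raise ValueError("alias must be 3-30 characters, alphanumeric and hyphens only")
    if exists(alias):
        raise AliasTaken(alias)
    return alias
```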
Content Storage
The content store is the heart of Pastebin. Let's go deep on why object storage is the right choice and how to optimize it.
Why S3 / Blob Storage?
| Property | S3 | Database (BLOB column) |
|---|---|---|
| Durability | 99.999999999% (11 nines) | Depends on backup strategy |
| Cost per GB/month | $0.023 | $0.115+ (RDS gp3) |
| Max object size | 5 TB | 4 GB (MySQL LONGBLOB) |
| Throughput | 5,500 GET/s, 3,500 PUT/s per prefix | Limited by connections |
| Backups | Cross-region replication built-in | Manual snapshots |
| CDN integration | CloudFront origin natively | Requires app-layer proxy |
Storage Tiers for Cost Optimization
Not all pastes are accessed equally. We can use S3 lifecycle policies to move cold pastes to cheaper tiers:
// S3 Lifecycle Policy
{
"Rules": [
{
"ID": "Transition-to-IA",
"Status": "Enabled",
"Transitions": [
{ "Days": 30, "StorageClass": "STANDARD_IA" }, // $0.0125/GB
{ "Days": 180, "StorageClass": "GLACIER_IR" } // $0.004/GB
]
}
]
}
This reduces our 5-year storage cost from ~$50K/year (all Standard) to ~$12K/year (tiered). A 75% cost reduction with no impact on active pastes.
Compression
Text compresses extremely well. Using gzip or zstd before storing to S3:
- Average compression ratio for code/text: 3–5×
- 10 KB average paste → ~2.5 KB after compression
- 182 TB → ~45 TB effective storage over 5 years
- The API server compresses before upload and decompresses on read. The extra CPU is negligible compared to the storage savings.
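As an illustration of the compression step, the sketch below gzips paste text before the S3 upload and decompresses it on the read path; the 3-5x ratio quoted above depends on the content, so treat the printed number as an example rather than a guarantee.

```python
import gzip

def pack(content: str) -> bytes:
    """Compress paste text before uploading it to S3."""
    return gzip.compress(content.encode("utf-8"))

def unpack(blob: bytes) -> str:
    """Decompress on the read path (the cache can also store the compressed form)."""
    return gzip.decompress(blob).decode("utf-8")

sample = "def hello():\n    print('Hello, world!')\n" * 100
print(f"{len(sample)} bytes -> {len(pack(sample))} bytes")   # repetitive text compresses well
```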
Caching Strategy
With a 5:1 read:write ratio and Zipfian access patterns (a few pastes get most views), caching is critical.
Cache Layer 1: Redis
A Redis cluster sits between the API servers and the database/S3. We cache two things:
- Paste metadata: key → serialized metadata (small, ~500 bytes)
- Paste content: key → compressed content (avg ~2.5 KB after gzip)
// Redis key scheme
paste:meta:{paste_id} → JSON metadata TTL: 1 hour
paste:content:{paste_id} → gzipped content TTL: 1 hour
// Cache-aside pattern (read path):
content = redis.get("paste:content:" + pasteId)
if content == null:
metadata = db.query("SELECT * FROM pastes WHERE id = ?", pasteId)
content = s3.getObject(metadata.content_key)
redis.setex("paste:content:" + pasteId, 3600, content)
redis.setex("paste:meta:" + pasteId, 3600, metadata)
return content
Cache Sizing
How much Redis memory do we need? Following the 80/20 rule: 20% of pastes account for 80% of reads.
- Daily unique pastes read: ~50M / 5 (assume each hot paste is read 5 times) = ~10M unique pastes
- Cache the top 20%: 2M pastes × (500 bytes metadata + 2.5 KB content) = ~6 GB
- A single Redis node (64 GB) can hold 10× this easily
- We'll run a 3-node Redis cluster for high availability (primary + 2 replicas)
Cache Layer 2: CDN
For public, non-expiring pastes, the CDN (CloudFront/Cloudflare) caches the rendered HTML response at edge nodes:
// CDN cache headers for public pastes
Cache-Control: public, max-age=300, s-maxage=3600
Vary: Accept-Encoding
// CDN cache headers for private/unlisted pastes
Cache-Control: private, no-store
// (CDN will not cache these)
- Public pastes: cached for 1 hour at the CDN edge (s-maxage=3600)
- Unlisted pastes: cached briefly (max-age=300) — accessible by URL but not indexed
- Private pastes: never cached at the CDN (no-store)
Cache Invalidation
When a paste is deleted or updated:
- Delete from Redis: redis.del("paste:meta:" + id, "paste:content:" + id)
- Purge from CDN: cloudfront.createInvalidation("/" + pasteId)
- For expired pastes, the cache TTL naturally evicts them — no active invalidation needed
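A hedged sketch of those two invalidation steps using redis-py and boto3; the distribution ID is a placeholder and real AWS credentials are assumed.

```python
import time
import boto3
import redis

r = redis.Redis()
cloudfront = boto3.client("cloudfront")

def invalidate_paste(paste_id: str, distribution_id: str) -> None:
    """Evict a deleted or updated paste from Redis and the CDN edge caches."""
    r.delete(f"paste:meta:{paste_id}", f"paste:content:{paste_id}")
    cloudfront.create_invalidation(
        DistributionId=distribution_id,
        InvalidationBatch={
            "Paths": {"Quantity": 1, "Items": [f"/{paste_id}"]},
            "CallerReference": f"delete-{paste_id}-{int(time.time())}",
        },
    )
```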
Paste Expiration
Expiration is essential to prevent unbounded storage growth. Our 182 TB estimate assumes no expiration — with it, actual storage will be significantly less.
Lazy Expiration (Read-Time Check)
On every read, check if the paste has expired:
metadata = fetchPasteMetadata(pasteId)
if metadata.expires_at != null && metadata.expires_at < now():
return 404 "Paste expired"
// Serve the paste normally
This is simple and catches all expired pastes on access. But it doesn't reclaim storage — expired pastes still sit in S3 and the database until cleaned up.
Active Expiration (Background Cleanup)
A background Cleanup Worker runs periodically (every 5 minutes) to find and delete expired pastes:
-- Find expired pastes in batches
SELECT id, content_key FROM pastes
WHERE expires_at IS NOT NULL
AND expires_at < NOW()
AND deleted_at IS NULL
ORDER BY expires_at ASC
LIMIT 1000;
-- For each expired paste:
-- 1. Delete from S3: s3.deleteObject(content_key)
-- 2. Delete from Redis: redis.del("paste:meta:" + id, "paste:content:" + id)
-- 3. Soft-delete in DB: UPDATE pastes SET deleted_at = NOW() WHERE id = ?
Why Soft Delete?
We use soft deletes (deleted_at timestamp) instead of hard deletes because:
- Audit trail: We can track when and why pastes were removed
- Abuse investigation: Deleted pastes can be reviewed for policy violations
- Undo: Accidental deletions can be reversed within a grace period
- A separate purge job hard-deletes rows older than 30 days from the deleted_at timestamp
TTL in Redis
When caching a paste with an expiration, set the Redis TTL to min(1 hour, time_until_expiry). This ensures the cache never serves stale content past the paste's expiration.
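A small sketch of that TTL rule, assuming expires_at is a timezone-aware datetime read from the paste metadata.

```python
from datetime import datetime, timezone

DEFAULT_TTL = 3600   # 1 hour cache TTL

def cache_ttl(expires_at) -> int:
    """Never let the cache entry outlive the paste itself."""
    if expires_at is None:
        return DEFAULT_TTL
    remaining = int((expires_at - datetime.now(timezone.utc)).total_seconds())
    return max(1, min(DEFAULT_TTL, remaining))

# redis.setex(f"paste:content:{paste_id}", cache_ttl(meta_expires_at), content)
```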
Scaling
Let's scale each component for our target: 10M writes/day, 50M reads/day, with 3× peak bursts.
API Servers
- Stateless — scale horizontally behind a load balancer (ALB/NLB)
- Each server handles ~500 requests/s → need 4–5 servers at peak (1,740 reads + 350 writes = ~2,100 req/s)
- Auto-scaling group: min 3, max 10, scale on CPU (70%) or request count
Database Scaling
PostgreSQL with read replicas handles our workload comfortably:
- Write path: Single primary handles 350 writes/s easily (PostgreSQL can handle 10K+ writes/s)
- Read path: 3–5 read replicas with connection pooling (PgBouncer). Each replica handles ~1,000 reads/s
- Connection pooling: PgBouncer limits active connections to prevent overwhelming the database
Database Sharding (When Needed)
When the metadata table approaches ~1 TB (within the first year or two, depending on how much expiration trims growth), shard by paste ID:
// Shard assignment: hash the paste ID
shard_number = hash(paste_id) % num_shards
// With 16 shards:
// Each shard holds ~1.14B pastes (after 5 years)
// Each shard is ~562 GB — comfortable for a single PostgreSQL instance
// Alternatively, the first 2 characters of the paste ID give a natural shard key:
// map them to their Base62 value, then take the modulo
// shard = base62_value(first_2_chars(paste_id)) % num_shards
S3 Scaling
S3 scales automatically — it's one of the most scalable services in AWS. With our prefix-based sharding (first 2 characters), we distribute load across thousands of partitions. No action needed.
Redis Scaling
- Start with a single Redis primary + 2 replicas (asynchronous replication with automatic failover)
- At scale, use Redis Cluster with 6+ nodes and data sharding across slots
- Key distribution is naturally uniform (Base62 paste IDs)
CDN Scaling
The CDN absorbs the bulk of read traffic. For popular pastes (e.g., shared on social media), the CDN can serve millions of requests/second from edge nodes without any load on our origin servers.
Geographic Distribution
For global users, deploy in multiple regions:
| Region | Components | Purpose |
|---|---|---|
| us-east-1 | Full stack | Primary region |
| eu-west-1 | API + Redis + DB | European users |
| ap-southeast-1 | API + Redis + DB | Asian users |
- S3: Cross-region replication to all regions
- CDN: Global edge network (200+ PoPs)
Abuse Prevention
Pastebin services are notoriously abused for malware distribution, credential dumps, phishing pages, and spam. A robust abuse prevention system is critical.
Rate Limiting
Apply rate limits at multiple levels:
| Level | Limit | Implementation |
|---|---|---|
| IP-based | 10 creates/hour | Redis sliding window counter |
| User-based | 100 creates/hour | Token bucket per API key |
| Global | 50K creates/hour | Circuit breaker at load balancer |
| Read rate | 300 reads/min per IP | CDN-level WAF rule |
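The IP-based limit from the table can be implemented as a Redis sliding-window counter. The sketch below is one way to do it with redis-py, with limit and window taken from the anonymous tier above.

```python
import time
import redis

r = redis.Redis()

def allow_create(ip: str, limit: int = 10, window: int = 3600) -> bool:
    """Sliding-window check: at most `limit` paste creations per `window` seconds per IP."""
    key = f"ratelimit:create:{ip}"
    now = time.time()
    pipe = r.pipeline()
    pipe.zremrangebyscore(key, 0, now - window)    # drop events that left the window
    pipe.zadd(key, {str(now): now})                # record this attempt
    pipe.zcard(key)                                # count events still in the window
    pipe.expire(key, window)
    _, _, count, _ = pipe.execute()
    return count <= limit
```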
Spam Detection
- Content hashing: Hash each paste and check against known-spam hashes. Block duplicates of known malicious content
- Link density: Flag pastes with an unusually high ratio of URLs to text — a strong indicator of SEO spam
- Keyword filters: Block or flag pastes containing known phishing patterns, credential formats, or malware signatures
- ML classifier: At scale, train a classifier on reported-spam pastes to auto-flag new ones
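The link-density heuristic from the list above is straightforward to prototype; the threshold below is illustrative and would be tuned against real traffic.

```python
import re

URL_RE = re.compile(r"https?://\S+")

def looks_like_link_spam(content: str, max_ratio: float = 0.5) -> bool:
    """Flag pastes where URLs make up an unusually large share of the text."""
    urls = URL_RE.findall(content)
    if not urls:
        return False
    url_chars = sum(len(u) for u in urls)
    return url_chars / max(len(content), 1) > max_ratio
```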
Content Moderation
// Moderation pipeline
1. User creates paste
2. Paste is stored and available immediately (optimistic)
3. Async: Content is sent to moderation queue
4. Automated checks run:
a. Regex patterns for credentials, SSNs, credit cards
b. URL reputation check (Google Safe Browsing API)
c. Spam classifier score
5. If flagged → paste is hidden, creator notified
6. If score is borderline → human review queue
7. If clean → no action needed
CAPTCHA
Anonymous paste creation requires a CAPTCHA (reCAPTCHA v3 or hCaptcha) to prevent automated spam bots. Authenticated users with good history bypass CAPTCHA.
Reporting
Every paste page includes a "Report Abuse" button. Reports go into a moderation queue with priority based on the number of unique reporters and the paste's view count.
Additional Considerations
Syntax Highlighting
Syntax highlighting is done client-side using a library like Prism.js or highlight.js. The server stores raw text — the browser renders the highlighted version. This keeps the server stateless and avoids storing rendered HTML.
// Client-side rendering (HTML-escape the raw content before inserting it into the page)
<pre><code class="language-python">
{{ raw paste content }}
</code></pre>
<script src="prism.js"></script>
Analytics
Track per-paste view counts without hitting the database on every read:
- Increment view count in Redis: INCR paste:views:{id}
- Flush to database in batches (every 5 minutes or every 100 increments)
- This decouples the hot read path from database writes
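A sketch of that counter-and-flush pattern with redis-py; db stands in for an injected database cursor, and the read-and-reset keeps views from being counted twice.

```python
import redis

r = redis.Redis()

def record_view(paste_id: str) -> None:
    """Hot path: bump a Redis counter, no database write per view."""
    r.incr(f"paste:views:{paste_id}")

def flush_view_counts(db) -> None:
    """Background job (e.g. every 5 minutes): move counters into the database."""
    for key in r.scan_iter(match="paste:views:*"):
        paste_id = key.decode().rsplit(":", 1)[-1]
        count = int(r.getset(key, 0) or 0)   # read-and-reset so views aren't double-counted
        if count:
            db.execute(
                "UPDATE pastes SET view_count = view_count + %s WHERE id = %s",
                (count, paste_id),
            )
```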
Search
For public paste search, use Elasticsearch:
- Index paste metadata (title, language, tags) and optionally the first 1 KB of content
- Full-text search with language-aware analyzers
- Only public pastes are indexed — unlisted and private are excluded
Monitoring & Alerting
Key metrics to monitor:
- Create latency P99: Alert if > 500ms (target: < 200ms)
- Read latency P99: Alert if > 300ms (target: < 100ms for cached)
- Cache hit rate: Alert if < 70% (target: > 85%)
- S3 error rate: Alert if > 0.1%
- Expired pastes queue depth: Alert if cleanup worker falls behind by > 100K pastes
- Abuse reports/hour: Alert if spike > 3× normal rate
Complete Architecture
Putting it all together, here's the complete architecture with all components and their interactions:
┌──────────────────────────────────────────────┐
│ CLIENTS │
│ Web Browser │ CLI Tool │ API Client │
└──────────┬─────────────────┬───────────────┘
│ │
┌──────────▼─────────────────▼───────────────┐
│ CDN (CloudFront) │
│ • Caches public paste responses │
│ • WAF rules for rate limiting │
│ • SSL termination │
│ • 200+ global edge locations │
└──────────────────┬─────────────────────────┘
│ cache miss
┌──────────────────▼─────────────────────────┐
│ Load Balancer (ALB) │
│ • Health checks on API servers │
│ • SSL offloading │
│ • Sticky sessions (optional) │
└────┬──────────┬──────────┬─────────────────┘
│ │ │
┌───────▼──┐ ┌─────▼────┐ ┌──▼───────┐
│ API Srv 1│ │ API Srv 2│ │ API Srv N│
│(stateless│ │(stateless│ │(stateless│
└───┬──┬───┘ └──┬──┬────┘ └──┬──┬────┘
│ │ │ │ │ │
┌─────────────┘ │ │ │ │ └────────────┐
│ │ │ │ │ │
┌──────▼──────┐ ┌──────▼────────▼──▼─────────▼──────┐ ┌─────▼──────┐
│ Key Gen Svc │ │ Redis Cluster │ │ Amazon S3 │
│ │ │ • paste:meta:{id} │ │ │
│ Pre-gen pool│ │ • paste:content:{id} │ │ /shard/id │
│ of 8-char │ │ • paste:views:{id} │ │ gzipped │
│ Base62 keys │ │ • ratelimit:{ip} │ │ content │
└─────────────┘ └──────────────┬─────────────────────┘ └────────────┘
│ cache miss
┌──────────────▼─────────────────────┐
│ PostgreSQL (Primary) │
│ • pastes table │
│ • users table │
└──────────┬─────────────────────────┘
┌──────────▼─────────────────────────┐
│ Read Replicas (3-5) │
│ • Serve all read queries │
│ • Async replication from primary │
└────────────────────────────────────┘
┌──────────────────────────────────────────────────────────────────────┐
│ BACKGROUND WORKERS │
│ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ ┌────────────┐ │
│ │ Cleanup │ │ View Count │ │ Content │ │ Analytics │ │
│ │ Worker │ │ Flusher │ │ Moderator │ │ Aggregator │ │
│ │ (expire │ │ (Redis → │ │ (spam/abuse │ │ (metrics │ │
│ │ pastes) │ │ Postgres) │ │ detection) │ │ pipeline) │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ └────────────┘ │
└──────────────────────────────────────────────────────────────────────┘
Data Flow Summary
| Operation | Components Involved | Latency Target |
|---|---|---|
| Create paste | API → KGS → S3 → PostgreSQL → Redis | P99 < 200ms |
| Read (CDN hit) | CDN edge node only | P99 < 20ms |
| Read (cache hit) | API → Redis | P99 < 50ms |
| Read (cache miss) | API → Redis (miss) → PostgreSQL → S3 → Redis | P99 < 300ms |
| Delete paste | API → PostgreSQL → Redis → CDN purge | P99 < 100ms |
Trade-Offs & Alternatives
SQL vs DynamoDB for Metadata
We chose PostgreSQL for its rich query capabilities (admin dashboards, analytics, complex filters). DynamoDB would give us automatic sharding and single-digit-ms latency but sacrifices ad-hoc queries. At our scale (116 writes/s average), PostgreSQL is more than adequate.
S3 vs Cassandra for Content
Some designs store paste content in Cassandra (key → blob). This gives sub-10ms reads (faster than S3's 50–100ms) but at much higher operational cost. With Redis caching absorbing 85%+ of reads, the S3 latency is invisible to most users.
Separate Content vs Inline
For very small pastes (< 1 KB), storing content directly in the database row (inline) would be faster and simpler. A hybrid approach — inline for < 1 KB, S3 for everything else — optimizes both cases:
if paste.size < 1024:
metadata.inline_content = paste.content // store in DB
metadata.content_key = null
else:
s3.putObject(content_key, paste.content) // store in S3
metadata.inline_content = null
metadata.content_key = content_key
Encryption at Rest
All S3 content should be encrypted (SSE-S3 or SSE-KMS). For password-protected pastes, consider additional application-layer encryption using the user's password as a key derivation input (AES-256-GCM).
Eventual Consistency Implications
With read replicas and cache layers, a newly created paste might not be immediately visible:
- Mitigation 1: After creating a paste, return the full paste data in the response — the client can display it immediately
- Mitigation 2: Write to Redis on create (cache warming) — subsequent reads from any server hit the cache
- Mitigation 3: For the creator's own requests, read from the primary database (read-your-writes consistency)
Summary
| Component | Technology | Purpose |
|---|---|---|
| CDN | CloudFront / Cloudflare | Cache public pastes at the edge |
| Load Balancer | AWS ALB | Distribute traffic to API servers |
| API Servers | Go / Java / Node.js | Stateless request handling |
| Key Generation | Custom service + DB | Pre-generated unique 8-char keys |
| Cache | Redis Cluster | Hot paste metadata + content |
| Metadata DB | PostgreSQL + replicas | Paste metadata, user accounts |
| Content Store | Amazon S3 | Paste content (gzipped) |
| Background Workers | Custom services | Expiration cleanup, moderation, analytics |