System Design Interview Framework & Cheat Sheet
This is it — the final post in our 70-part High Level Design series. Over the course of this journey we have covered foundations, building blocks, data storage engines, distributed systems theory, architecture patterns, and dozens of real-world case studies. This post ties everything together into one actionable reference you can use the night before — or morning of — your system design interview.
1 · The 4-Step Framework
Every system design interview — from FAANG to startups — follows roughly the same shape. The biggest differentiator between candidates who pass and those who don't is structure. Having a repeatable framework keeps you on track, ensures you cover all the bases, and signals to the interviewer that you think methodically about complex problems.
Step 1: Understand Requirements
- Ask clarifying questions — never assume
- Define functional requirements (what the system does)
- Define non-functional requirements (latency, availability, consistency, scale)
- Establish scope boundaries — what is out of scope?
- Confirm with the interviewer: "Does this scope sound right?"
Step 2: Back-of-Envelope Estimation
- Estimate QPS (queries per second) — read vs. write
- Calculate storage needs over 5 years
- Estimate bandwidth (ingress / egress)
- Estimate memory for caching (80/20 rule)
- Use round numbers — precision doesn't matter, order of magnitude does
Step 3: High-Level Design
- Draw the architecture diagram — clients, LB, services, DB, cache, queue
- Define API contracts (REST/gRPC endpoints)
- Design the data model — tables, schemas, key choices
- Walk through the core user flows end-to-end
- Get interviewer buy-in before the deep dive
Step 4: Deep Dive
- Pick 2–3 components to go deep on (the interviewer may guide you)
- Discuss trade-offs for each decision (SQL vs. NoSQL, push vs. pull, etc.)
- Handle edge cases — failures, hot spots, race conditions
- Add scaling strategies — sharding, replication, caching layers
- Mention monitoring & observability
Step 1: Understand Requirements — In Depth
The first five minutes of a system design interview determine its trajectory. Candidates who jump straight to drawing boxes fail because they solve the wrong problem. Here is a structured approach to requirement gathering:
Functional requirements define what the system does — the user-visible features. Ask:
- Who are the users? (consumers, businesses, internal teams)
- What are the core use cases? (e.g., "Users can post a tweet and see a feed of tweets")
- Any features explicitly out of scope? (e.g., "Don't worry about DMs for now")
Non-functional requirements define how well the system performs — the quality attributes:
| Attribute | Key Question | Example Answer |
|---|---|---|
| Scale | How many users / requests? | 500M DAU, 10K writes/sec |
| Latency | How fast must responses be? | p99 < 200ms for reads |
| Availability | What uptime is required? | 99.99% (≈52 min downtime/year) |
| Consistency | Eventual or strong? | Eventual for feed, strong for payments |
| Durability | Can we lose data? | Zero data loss for messages |
Step 2: Back-of-Envelope Estimation — In Depth
Estimations ground your design in reality. You don't need exact numbers — you need the right order of magnitude so you can justify architectural choices (e.g., "We need sharding because a single MySQL can't handle 50K writes/sec").
Template for estimation:
```
// Given: 500M DAU, each user posts 2 tweets/day and reads 100 tweets/day
Write QPS = 500M × 2 / 86,400 ≈ 12,000 writes/sec
Read QPS  = 500M × 100 / 86,400 ≈ 580,000 reads/sec
Read:Write ≈ 50:1 (read-heavy → optimize for reads)

Storage (5 years):
  Each tweet ≈ 250 bytes (text) + 500 bytes (metadata) = 750 bytes
  500M × 2 tweets/day × 365 × 5 years × 750 B ≈ 1.37 PB

Bandwidth:
  Write: 12,000 × 750 B ≈ 9 MB/s ingress
  Read:  580,000 × 750 B ≈ 435 MB/s egress

Cache (80/20 rule — cache 20% of daily reads):
  Daily reads = 500M × 100 × 750 B = 37.5 TB/day
  Cache 20% ≈ 7.5 TB → need a distributed cache cluster (Redis)
```
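The arithmetic in the template above can be sanity-checked with a short script. The workload numbers (500M DAU, 2 writes and 100 reads per user per day, ~750 bytes per tweet) are the same assumptions as in the template:

```python
# Back-of-envelope numbers for the Twitter-like workload above.
DAU = 500_000_000
SECONDS_PER_DAY = 86_400
TWEET_BYTES = 750  # ~250 B text + ~500 B metadata

write_qps = DAU * 2 / SECONDS_PER_DAY          # ~12K writes/sec
read_qps = DAU * 100 / SECONDS_PER_DAY         # ~580K reads/sec
storage_5y = DAU * 2 * 365 * 5 * TWEET_BYTES   # ~1.37 PB over 5 years
ingress = write_qps * TWEET_BYTES              # ~9 MB/s in
egress = read_qps * TWEET_BYTES                # ~435 MB/s out

print(f"write QPS ≈ {write_qps:,.0f}")
print(f"read QPS  ≈ {read_qps:,.0f}")
print(f"5-year storage ≈ {storage_5y / 1e15:.2f} PB")
print(f"ingress ≈ {ingress / 1e6:.1f} MB/s, egress ≈ {egress / 1e6:.0f} MB/s")
```

In the interview you do this on a whiteboard with round numbers; the point is the orders of magnitude, not the decimals.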
Step 3: High-Level Design — In Depth
This is the core of the interview. Draw a clean architecture diagram with these standard building blocks:
- Clients — mobile, web, API consumers
- Load Balancer — distributes traffic across application servers
- Application Servers — stateless service tier (horizontally scalable)
- Cache Layer — Redis/Memcached for hot data
- Database — primary data store (SQL or NoSQL)
- Message Queue — async processing (Kafka, RabbitMQ)
- CDN — static assets, media delivery
- Object Storage — S3 for blobs, images, videos
Define API contracts clearly — even brief REST endpoints show you think about interfaces:
```
POST /api/v1/tweets
Body:     { "text": "Hello world", "media_ids": ["abc123"] }
Response: 201 { "tweet_id": "xyz789", "created_at": "..." }

GET /api/v1/feed?user_id=123&cursor=abc&limit=20
Response: 200 { "tweets": [...], "next_cursor": "def" }
```
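The feed endpoint above is cursor-paginated. Here is a minimal sketch of one way a server might implement an opaque cursor; the encoding scheme and helper names are illustrative, not prescribed by the series:

```python
import base64
import json

def encode_cursor(last_id: int) -> str:
    """Wrap the last-seen tweet ID in an opaque base64 token."""
    return base64.urlsafe_b64encode(
        json.dumps({"last_id": last_id}).encode()).decode()

def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor))["last_id"]

def get_feed_page(tweets, cursor=None, limit=20):
    """Return one page of tweets (newest first) plus a cursor for the next page."""
    start = 0
    if cursor is not None:
        last_id = decode_cursor(cursor)
        # Skip everything up to and including the last tweet the client saw.
        start = next(i for i, t in enumerate(tweets) if t["id"] == last_id) + 1
    page = tweets[start:start + limit]
    next_cursor = encode_cursor(page[-1]["id"]) if len(page) == limit else None
    return {"tweets": page, "next_cursor": next_cursor}
```

Making the cursor opaque (rather than exposing a raw offset) lets the server change the pagination scheme later without breaking clients.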
Step 4: Deep Dive — In Depth
This is where senior candidates separate themselves. Pick the most interesting or challenging components and go deep:
- Database choice & schema design — Why SQL over NoSQL (or vice versa)? How will you shard? What indexes?
- Caching strategy — Write-through vs. write-behind vs. cache-aside? Eviction policy? Cache invalidation?
- Consistency model — Where is eventual consistency acceptable? Where do you need strong consistency?
- Failure handling — What if a service goes down? Circuit breakers? Retries with exponential backoff? Saga pattern for distributed transactions?
- Scaling bottlenecks — Where will the system hit limits first? How do you scale past them?
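Of the failure-handling techniques above, retries with exponential backoff come up in almost every deep dive. A minimal sketch with full jitter (the random spread that keeps a burst of failing clients from retrying in lockstep); the function and parameter names are illustrative:

```python
import random
import time

def retry(op, max_attempts=5, base_delay=0.1, max_delay=10.0, sleep=time.sleep):
    """Call op(); on exception, back off exponentially (with jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            # Full jitter: sleep a random amount in [0, min(cap, base * 2^attempt)].
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
```

In the interview, also mention what you would NOT retry (non-idempotent writes without idempotency keys) and how a circuit breaker stops retrying a dependency that is clearly down.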
2 · Estimation Cheat Sheet
Powers of 2
Memorize these — they are the building blocks of every back-of-envelope calculation:
| Power | Exact Value | Approx | Name |
|---|---|---|---|
| 2^10 | 1,024 | ~1 Thousand | 1 KB |
| 2^16 | 65,536 | ~65 Thousand | 64 KB |
| 2^20 | 1,048,576 | ~1 Million | 1 MB |
| 2^30 | 1,073,741,824 | ~1 Billion | 1 GB |
| 2^32 | 4,294,967,296 | ~4 Billion | 4 GB (max 32-bit addr) |
| 2^40 | 1,099,511,627,776 | ~1 Trillion | 1 TB |
| 2^50 | 1,125,899,906,842,624 | ~1 Quadrillion | 1 PB |
Latency Numbers Every Engineer Should Know
These numbers, originally compiled by Jeff Dean, help you reason about where time is spent in a system:
| Operation | Latency | Notes |
|---|---|---|
| L1 cache reference | 0.5 ns | Fastest memory access |
| Branch mispredict | 5 ns | CPU pipeline stall |
| L2 cache reference | 7 ns | ~14× L1 |
| Mutex lock/unlock | 25 ns | Thread synchronization |
| Main memory reference (RAM) | 100 ns | ~200× L1 |
| Compress 1 KB (Snappy) | 3 μs | 3,000 ns |
| Send 1 KB over 1 Gbps network | 10 μs | Network I/O starts here |
| Read 4 KB randomly from SSD | 150 μs | ~1,500× RAM |
| Read 1 MB sequentially from memory | 250 μs | Fast bulk read |
| Round trip within same datacenter | 500 μs | 0.5 ms |
| Read 1 MB sequentially from SSD | 1 ms | SSD sequential is fast |
| HDD seek | 10 ms | Avoid random HDD reads |
| Read 1 MB sequentially from HDD | 20 ms | ~20× slower than SSD |
| Send packet CA → Netherlands → CA | 150 ms | Speed of light limit |
Common Conversions
| What | Conversion |
|---|---|
| 1 day | ≈ 86,400 seconds (round to 100K for estimation) |
| 1 million requests/day | ≈ 12 requests/second |
| 1 billion requests/day | ≈ 12,000 requests/second |
| Peak QPS | Assume 2–3× the average QPS |
| 100M users × 1 KB each | ≈ 100 GB |
| Availability: 99.9% | ≈ 8.76 hours downtime/year |
| Availability: 99.99% | ≈ 52.6 minutes downtime/year |
| Availability: 99.999% | ≈ 5.26 minutes downtime/year |
| 1 character (UTF-8) | ≈ 1–4 bytes (assume 1 for English) |
| UUID / GUID | ≈ 128 bits = 16 bytes |
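The availability rows in the table come straight from the downtime budget formula: allowed downtime = (1 − availability) × time period. A two-line check:

```python
# Downtime budget for each "number of nines" in the table above.
HOURS_PER_YEAR = 365.25 * 24

def downtime_minutes_per_year(availability: float) -> float:
    return (1 - availability) * HOURS_PER_YEAR * 60

for a in (0.999, 0.9999, 0.99999):
    print(f"{a:.3%} -> {downtime_minutes_per_year(a):.1f} min/year")
```

Each extra nine cuts the budget by 10×, which is why 99.999% is so expensive: you have about five minutes a year for every failure, deploy, and migration combined.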
3 · Component Cheat Sheet
One of the most common questions in a design interview is "Why did you choose this component?" Here is a quick reference for when to use each building block:
🔀 Load Balancer
When: Multiple app server instances. Options: L4 (TCP) for raw throughput, L7 (HTTP) for content-based routing. Use round-robin, least-connections, or consistent hashing.
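Of the balancing strategies above, consistent hashing is the one interviewers most often ask you to explain. A minimal ring sketch (class and parameter names are illustrative); virtual nodes smooth the key distribution, and removing a server only remaps the keys that hashed to it:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, servers, vnodes=100):
        # Each server appears `vnodes` times on the ring under derived names.
        self._ring = []  # sorted list of (hash, server)
        for s in servers:
            for v in range(vnodes):
                self._ring.append((self._hash(f"{s}#{v}"), s))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get(self, key: str) -> str:
        """Route key to the first server clockwise from its hash."""
        h = self._hash(key)
        i = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[i][1]
```

With plain modulo hashing, adding one server remaps almost every key; with the ring, only ~1/N of keys move.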
🌐 CDN
When: Serving static assets (images, JS, CSS) or media to global users. Reduces latency via edge caching. Push CDN for predictable content, pull CDN for dynamic.
⚡ Cache (Redis / Memcached)
When: Read-heavy workloads, expensive DB queries, session storage. Cache-aside for general use, write-through for consistency, write-behind for write-heavy.
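The cache-aside pattern mentioned above fits in a few lines. In this sketch `cache` and `db` are plain dicts standing in for Redis and the database, and `stats` is just for illustration:

```python
def get_user(user_id, cache, db, stats):
    """Cache-aside read: check the cache first, fall back to the DB on a miss."""
    user = cache.get(user_id)
    if user is not None:
        stats["hits"] += 1
        return user
    stats["misses"] += 1
    user = db[user_id]       # the expensive query, only on a miss
    cache[user_id] = user    # populate so the next read is a hit
    return user
```

The classic follow-up is invalidation: on writes, delete (don't update) the cache entry, and set a TTL as a safety net against stale data.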
📨 Message Queue (Kafka)
When: Async processing, decoupling services, event-driven architectures, buffering spikes. Kafka for event streaming, RabbitMQ for task queues.
🗄️ SQL Database
When: ACID transactions, complex queries/joins, structured relational data. PostgreSQL/MySQL. Good up to ~10K writes/sec per node with read replicas.
📦 NoSQL Database
When: Massive scale, flexible schema, high write throughput. Key-value for simple lookups, document for nested data, wide-column for time-series.
🔍 Search (Elasticsearch)
When: Full-text search, autocomplete, log analytics. Inverted index for fast text lookups. Elasticsearch/Solr as secondary index alongside primary DB.
🪣 Object Storage (S3)
When: Storing blobs — images, videos, files, backups. Virtually unlimited scale, pay-per-use. Combine with CDN for fast delivery.
🛡️ Rate Limiter
When: Protecting APIs from abuse, DDoS mitigation, fair usage enforcement. Token bucket for bursty traffic, sliding window for precision.
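A token bucket, the bursty-traffic option named above, is small enough to sketch in full. Tokens refill continuously at `rate` per second up to `capacity`; each request spends one token, so short bursts up to `capacity` are allowed (the clock is injectable here purely for testing):

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity       # maximum burst size
        self.tokens = float(capacity)  # start full
        self.now = now
        self.last = now()

    def allow(self) -> bool:
        """Spend one token if available; refill based on elapsed time first."""
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a distributed deployment the counter moves into Redis (often as a Lua script so refill-and-spend is atomic), which is the "distributed counter" mentioned in the pattern table below.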
🚪 API Gateway
When: Single entry point for microservices. Handles auth, rate limiting, routing, SSL termination, request transformation. Often combined with service discovery.
🔐 Distributed Lock
When: Mutual exclusion across services — leader election, preventing double-processing. Redis Redlock or ZooKeeper. Use sparingly — locks reduce throughput.
📊 Monitoring & Logging
When: Always. Every design should mention metrics, logs, and traces. Prometheus + Grafana for metrics, ELK for logs, Jaeger for distributed tracing.
4 · Pattern-to-Problem Mapping
This table maps common system design problems to the key patterns and concepts they test. Use it to identify which patterns are relevant when you hear a problem statement:
| System | Key Patterns & Concepts | Difficulty |
|---|---|---|
| URL Shortener | Base62 encoding, Key Generation Service (KGS), consistent hashing, read-heavy caching | Easy |
| Pastebin | Object storage (S3) for content, metadata in SQL, unique ID generation, TTL-based expiry | Easy |
| Rate Limiter | Token bucket / sliding window, distributed counter (Redis), API gateway integration | Easy |
| Chat System | WebSocket connections, fan-out, presence service, message queues, last-seen tracking, delivery receipts | Medium |
| News Feed | Fan-out-on-write (push) vs. fan-out-on-read (pull), hybrid approach, ranked feed cache, celebrity problem | Medium |
| Notification System | Priority queues, event-driven architecture, rate limiting per user, multi-channel delivery (push/SMS/email) | Medium |
| Twitter / Social Media | Feed generation (fan-out hybrid), sharding by user_id, celebrity handling, search index, trending topics | Medium |
| Instagram / Photo Sharing | Object storage + CDN for images, timeline generation, image processing pipeline, Bloom filters for duplicate detection | Medium |
| YouTube / Video Streaming | Adaptive bitrate streaming (HLS/DASH), transcoding pipeline, CDN distribution, recommendation engine, chunked object storage | Hard |
| Uber / Ride Sharing | Geospatial indexing (QuadTree / S2), real-time matching, Kafka for event streaming, ETA calculation, surge pricing | Hard |
| Google Maps | Graph algorithms (Dijkstra/A*), map tiling, geospatial storage, CDN for tiles, real-time traffic aggregation | Hard |
| Distributed Key-Value Store | Consistent hashing, vector clocks, gossip protocol, quorum reads/writes, anti-entropy | Hard |
| Stock Exchange | Order matching engine, event sourcing / CQRS, ultra-low latency, sequencer, deterministic replay | Expert |
| Payment System | Saga pattern, idempotency keys, 2PC, reconciliation, double-entry ledger, PCI compliance | Expert |
| Search Engine | Inverted index, web crawler, PageRank, distributed indexing, query parsing, spell correction | Expert |
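The base62 encoding in the URL Shortener row is a favorite warm-up: map an auto-incrementing numeric ID onto [0-9a-z-A-Z] to get a short slug. A self-contained sketch:

```python
import string

# 62-character alphabet: digits, lowercase, uppercase.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID to its base62 slug."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def decode_base62(s: str) -> int:
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven base62 characters cover 62^7 ≈ 3.5 trillion IDs, which is why shortener designs settle on 6–7 character slugs.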
5 · Common Mistakes
Across hundreds of mock interviews, these are the mistakes we have seen candidates make most often — and how to avoid each one:
- ❌ Not asking clarifying questions. Jumping to design immediately signals you don't think critically. Always spend the first 3–5 minutes understanding the problem. Even if you think you know what "Design Twitter" means, ask: "Should we focus on the feed, the posting, search, or DMs?"
- ❌ Jumping to the solution. Starting with "Let's use Kafka and Redis" before defining what the system needs to do. Always requirements first, then design. The interviewer wants to see your thought process, not just the answer.
- ❌ Over-engineering. Adding microservices, CQRS, event sourcing, and a service mesh for a system that handles 100 requests/day. Match the complexity of your architecture to the scale of the problem. A monolith is a valid answer for many scales.
- ❌ Ignoring non-functional requirements. Designing a beautiful architecture that doesn't meet the latency, availability, or consistency requirements. Non-functionals drive the architecture — not the other way around.
- ❌ Not discussing trade-offs. Every decision has pros and cons. Saying "I'll use NoSQL" without explaining why (e.g., "because we need flexible schema and high write throughput at the cost of weaker consistency") is a missed opportunity to show depth.
- ❌ Single point of failure. Every component in your design should be redundant. If your load balancer, database, or cache has a single instance, call it out and explain how you'd add redundancy.
- ❌ Not mentioning monitoring. A production system without monitoring, alerting, and logging is incomplete. Even a one-sentence mention ("We'd add Prometheus metrics and distributed tracing with Jaeger") shows production awareness.
6 · Red Flags Interviewers Watch For
Beyond common mistakes, these are the behaviors that actively lower your interview score. Interviewers are trained to spot them:
- 🚩 No requirements gathering at all. Immediately drawing boxes without a single question. This suggests you memorized a solution rather than thinking from first principles. Interviewers often intentionally leave the problem vague to test this.
- 🚩 Monolithic thinking. Putting everything in one giant server. Even if you start with a monolith (which is fine!), you should articulate how you'd scale it: which parts you'd extract, and at what trigger points.
- 🚩 Ignoring failure modes. A design that only works when everything is up. The interviewer will ask "What happens when X goes down?" and you should have thought about it. Mention circuit breakers, retries, fallbacks, and replication.
- 🚩 Magic numbers without estimation. Saying "We need 100 servers" without showing the math. Even a rough calculation shows rigor: "At 12K writes/sec and ~1K writes/sec per server, we need ~12 servers plus buffer."
- 🚩 No trade-off discussion. Presenting a design as though every choice is obvious. Real engineering is about trade-offs. The interviewer wants to hear: "I chose X over Y because…" with clear reasoning.
- 🚩 Buzzword dropping. Saying "blockchain," "machine learning," or "serverless" without understanding how they apply. It's better to use fewer technologies well than to name-drop technologies you can't defend.
- 🚩 Not adapting to feedback. If the interviewer hints that a component won't scale or suggests a different approach, don't stubbornly stick to your original design. Adaptability is a key signal.
- 🚩 No API or data model discussion. Talking only in abstract boxes without defining what data flows between them. Even simple REST endpoints and table schemas show you think concretely about interfaces.
7 · Topic Difficulty Tiers
Not all system design problems are equal. Here's how they break down by difficulty, so you can prioritize your study time and know what to expect at different interview levels:
Easy — Warm-up & Phone Screens
These problems have well-known solutions and test whether you can articulate basic system design concepts clearly. Expected in phone screens and junior/mid-level interviews.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| URL Shortener | Hashing, base62, KGS, read-heavy caching, redirection | 2–3 hours |
| Pastebin | Object storage, metadata DB, TTL, unique IDs | 2–3 hours |
| Rate Limiter | Token bucket, sliding window, distributed counters | 2–3 hours |
| Key-Value Store (single node) | Hash table, write-ahead log, compaction, in-memory index | 3–4 hours |
Medium — Standard On-site
These are the most common on-site interview questions. They require combining multiple building blocks and making meaningful trade-off decisions.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| Chat System (WhatsApp) | WebSockets, fan-out, presence, message queues, ordering | 4–6 hours |
| News Feed (Facebook) | Fan-out strategies, ranking, caching, celebrity problem | 4–6 hours |
| Notification System | Event-driven architecture, priority queues, multi-channel | 3–5 hours |
| Twitter | Feed generation, sharding, trending, search | 5–6 hours |
| Instagram | Image pipeline, CDN, timeline, explore/recommendations | 4–6 hours |
| Web Crawler | BFS/DFS, URL frontier, politeness, deduplication, robots.txt | 4–5 hours |
| Typeahead / Autocomplete | Trie data structure, top-K, distributed tries, ranking | 3–5 hours |
Hard — Senior & Staff-Level
These require deep understanding of distributed systems, data-intensive architectures, and nuanced trade-offs. Expected in senior and staff-level loops.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| YouTube / Netflix | Video transcoding, adaptive streaming, CDN, recommendations | 6–8 hours |
| Uber / Lyft | Geospatial index, real-time matching, ETA, surge pricing | 6–8 hours |
| Google Maps | Graph algorithms, map tiling, traffic, ETA computation | 6–8 hours |
| Distributed KV Store | Consistent hashing, vector clocks, quorum, gossip | 8–10 hours |
| Slack / Discord | Real-time messaging at scale, presence, channels, search | 6–8 hours |
| Dropbox / Google Drive | File sync, chunking, deduplication, conflict resolution | 6–8 hours |
Expert — Principal & Architect
These problems have no single "right" answer and require deep domain knowledge. Expected in principal/architect interviews or as follow-up deep dives.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| Payment System | Saga pattern, idempotency, double-entry ledger, PCI | 8–10 hours |
| Stock Exchange | Order matching, event sourcing, ultra-low latency, CQRS | 10–12 hours |
| Search Engine | Web crawling, inverted index, PageRank, distributed indexing | 10–12 hours |
| Ad System (Google Ads) | Real-time bidding, auction, ad serving, click prediction | 8–10 hours |
8 · 8-Week Study Plan
This plan covers the entire 70-post series in a structured 8-week roadmap. Each week builds on the previous one. Aim for 1–2 hours of focused study per day.
📅 Week 1 — Foundations
- System Design Introduction
- Horizontal & Vertical Scaling
- Load Balancing
- Caching Strategies
- Content Delivery Networks
- Database Fundamentals
- API Design
- Practice: Design a URL Shortener
📅 Week 2 — Data & Consistency
- ACID vs. BASE
- CAP Theorem
- Consistent Hashing
- SQL Internals
- NoSQL: Key-Value Stores
- NoSQL: Document Stores
- Practice: Design a Pastebin
📅 Week 3 — Storage & Search
- NoSQL: Wide-Column Stores
- NoSQL: Graph Databases
- Object Storage (S3)
- Time-Series Databases
- Search Engines (Elasticsearch)
- Practice: Design a Rate Limiter
📅 Week 4 — Building Blocks
- Database Sharding
- Replication
- Message Queues & Kafka
- Rate Limiting Algorithms
- Proxies & Reverse Proxies
- Bloom Filters
- Practice: Design a Chat System
📅 Week 5 — Infrastructure
- Service Discovery
- Heartbeat & Health Checks
- Logging & Monitoring
- Distributed Locking
- Practice: Design a News Feed System
- Practice: Design a Notification System
📅 Week 6 — Distributed Systems
- Consensus (Paxos / Raft)
- Leader Election
- Vector Clocks
- Gossip Protocol
- Two-Phase Commit
- Saga Pattern
- Circuit Breaker
- Practice: Design a Distributed KV Store
📅 Week 7 — Architecture Patterns
- Event-Driven Architecture
- CQRS & Event Sourcing
- Domain-Driven Design
- Data Pipelines
- Practice: Design YouTube / Uber
📅 Week 8 — Case Studies & Review
- Review all case studies in the series
- Practice: Payment System (Expert)
- Practice: Stock Exchange (Expert)
- Mock interviews with a partner
- Review this cheat sheet one final time
- 🎯 You're ready!