System Design Interview Framework & Cheat Sheet
This is it — the final post in our 70-part High Level Design series. Over the course of this journey we have covered foundations, building blocks, data storage engines, distributed systems theory, architecture patterns, and dozens of real-world case studies. This post ties everything together into one actionable reference you can use the night before — or morning of — your system design interview.
1 · The 4-Step Framework
Every system design interview — from FAANG to startups — follows roughly the same shape. The biggest differentiator between candidates who pass and those who don't is structure. Having a repeatable framework keeps you on track, ensures you cover all the bases, and signals to the interviewer that you think methodically about complex problems.
Step 1: Understand Requirements
- Ask clarifying questions — never assume
- Define functional requirements (what the system does)
- Define non-functional requirements (latency, availability, consistency, scale)
- Establish scope boundaries — what is out of scope?
- Confirm with the interviewer: "Does this scope sound right?"
Step 2: Back-of-Envelope Estimation
- Estimate QPS (queries per second) — read vs. write
- Calculate storage needs over 5 years
- Estimate bandwidth (ingress / egress)
- Estimate memory for caching (80/20 rule)
- Use round numbers — precision doesn't matter, order of magnitude does
Step 3: High-Level Design
- Draw the architecture diagram — clients, LB, services, DB, cache, queue
- Define API contracts (REST/gRPC endpoints)
- Design the data model — tables, schemas, key choices
- Walk through the core user flows end-to-end
- Get interviewer buy-in before the deep dive
Step 4: Deep Dive
- Pick 2–3 components to go deep on (the interviewer may guide you)
- Discuss trade-offs for each decision (SQL vs. NoSQL, push vs. pull, etc.)
- Handle edge cases — failures, hot spots, race conditions
- Add scaling strategies — sharding, replication, caching layers
- Mention monitoring & observability
Step 1: Understand Requirements — In Depth
The first five minutes of a system design interview determine its trajectory. Candidates who jump straight to drawing boxes fail because they solve the wrong problem. Here is a structured approach to requirement gathering:
Functional requirements define what the system does — the user-visible features. Ask:
- Who are the users? (consumers, businesses, internal teams)
- What are the core use cases? (e.g., "Users can post a tweet and see a feed of tweets")
- Any features explicitly out of scope? (e.g., "Don't worry about DMs for now")
Non-functional requirements define how well the system performs — the quality attributes:
| Attribute | Key Question | Example Answer |
|---|---|---|
| Scale | How many users / requests? | 500M DAU, 10K writes/sec |
| Latency | How fast must responses be? | p99 < 200ms for reads |
| Availability | What uptime is required? | 99.99% (≈52 min downtime/year) |
| Consistency | Eventual or strong? | Eventual for feed, strong for payments |
| Durability | Can we lose data? | Zero data loss for messages |
Step 2: Back-of-Envelope Estimation — In Depth
Estimations ground your design in reality. You don't need exact numbers — you need the right order of magnitude so you can justify architectural choices (e.g., "We need sharding because a single MySQL can't handle 50K writes/sec").
Template for estimation:
```
// Given: 500M DAU, each user posts 2 tweets/day and reads 100 tweets/day
Write QPS = 500M × 2 / 86,400 ≈ 12,000 writes/sec
Read QPS  = 500M × 100 / 86,400 ≈ 580,000 reads/sec
Read:Write ≈ 50:1 (read-heavy → optimize for reads)

Storage (5 years):
  Each tweet ≈ 250 bytes (text) + 500 bytes (metadata) = 750 bytes
  500M × 2 tweets/day × 365 × 5 years × 750 B ≈ 1.37 PB

Bandwidth:
  Write: 12,000 × 750 B ≈ 9 MB/s ingress
  Read:  580,000 × 750 B ≈ 435 MB/s egress

Cache (80/20 rule — cache 20% of daily reads):
  Daily reads = 500M × 100 × 750 B = 37.5 TB/day
  Cache 20% ≈ 7.5 TB → need a distributed cache cluster (Redis)
```
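The arithmetic in the template above can be sanity-checked with a short script. The workload numbers (500M DAU, 2 writes and 100 reads per user per day, ~750 bytes per tweet) are the same assumptions as in the template:

```python
# Back-of-envelope numbers for the Twitter-like workload above.
DAU = 500_000_000
SECONDS_PER_DAY = 86_400
TWEET_BYTES = 750  # ~250 B text + ~500 B metadata

write_qps = DAU * 2 / SECONDS_PER_DAY          # ~12K writes/sec
read_qps = DAU * 100 / SECONDS_PER_DAY         # ~580K reads/sec
storage_5y = DAU * 2 * 365 * 5 * TWEET_BYTES   # ~1.37 PB over 5 years
ingress = write_qps * TWEET_BYTES              # ~9 MB/s in
egress = read_qps * TWEET_BYTES                # ~435 MB/s out

print(f"write QPS ≈ {write_qps:,.0f}")
print(f"read QPS  ≈ {read_qps:,.0f}")
print(f"5-year storage ≈ {storage_5y / 1e15:.2f} PB")
print(f"ingress ≈ {ingress / 1e6:.1f} MB/s, egress ≈ {egress / 1e6:.0f} MB/s")
```

In the interview you do this on a whiteboard with round numbers; the point is the orders of magnitude, not the decimals.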
Step 3: High-Level Design — In Depth
This is the core of the interview. Draw a clean architecture diagram with these standard building blocks:
- Clients — mobile, web, API consumers
- Load Balancer — distributes traffic across application servers
- Application Servers — stateless service tier (horizontally scalable)
- Cache Layer — Redis/Memcached for hot data
- Database — primary data store (SQL or NoSQL)
- Message Queue — async processing (Kafka, RabbitMQ)
- CDN — static assets, media delivery
- Object Storage — S3 for blobs, images, videos
Define API contracts clearly — even brief REST endpoints show you think about interfaces:
```
POST /api/v1/tweets
Body:     { "text": "Hello world", "media_ids": ["abc123"] }
Response: 201 { "tweet_id": "xyz789", "created_at": "..." }

GET /api/v1/feed?user_id=123&cursor=abc&limit=20
Response: 200 { "tweets": [...], "next_cursor": "def" }
```
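The feed endpoint above is cursor-paginated. Here is a minimal sketch of one way a server might implement an opaque cursor; the encoding scheme and helper names are illustrative, not prescribed by the series:

```python
import base64
import json

def encode_cursor(last_id: int) -> str:
    """Wrap the last-seen tweet ID in an opaque base64 token."""
    return base64.urlsafe_b64encode(
        json.dumps({"last_id": last_id}).encode()).decode()

def decode_cursor(cursor: str) -> int:
    return json.loads(base64.urlsafe_b64decode(cursor))["last_id"]

def get_feed_page(tweets, cursor=None, limit=20):
    """Return one page of tweets (newest first) plus a cursor for the next page."""
    start = 0
    if cursor is not None:
        last_id = decode_cursor(cursor)
        # Skip everything up to and including the last tweet the client saw.
        start = next(i for i, t in enumerate(tweets) if t["id"] == last_id) + 1
    page = tweets[start:start + limit]
    next_cursor = encode_cursor(page[-1]["id"]) if len(page) == limit else None
    return {"tweets": page, "next_cursor": next_cursor}
```

Making the cursor opaque (rather than exposing a raw offset) lets the server change the pagination scheme later without breaking clients.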
Step 4: Deep Dive — In Depth
This is where senior candidates separate themselves. Pick the most interesting or challenging components and go deep:
- Database choice & schema design — Why SQL over NoSQL (or vice versa)? How will you shard? What indexes?
- Caching strategy — Write-through vs. write-behind vs. cache-aside? Eviction policy? Cache invalidation?
- Consistency model — Where is eventual consistency acceptable? Where do you need strong consistency?
- Failure handling — What if a service goes down? Circuit breakers? Retries with exponential backoff? Saga pattern for distributed transactions?
- Scaling bottlenecks — Where will the system hit limits first? How do you scale past them?
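Of the failure-handling techniques above, retries with exponential backoff come up in almost every deep dive. A minimal sketch with full jitter (the random spread that keeps a burst of failing clients from retrying in lockstep); the function and parameter names are illustrative:

```python
import random
import time

def retry(op, max_attempts=5, base_delay=0.1, max_delay=10.0, sleep=time.sleep):
    """Call op(); on exception, back off exponentially (with jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return op()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure
            # Full jitter: sleep a random amount in [0, min(cap, base * 2^attempt)].
            delay = random.uniform(0, min(max_delay, base_delay * 2 ** attempt))
            sleep(delay)
```

In the interview, also mention what you would NOT retry (non-idempotent writes without idempotency keys) and how a circuit breaker stops retrying a dependency that is clearly down.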
2 · Estimation Cheat Sheet
Powers of 2
Memorize these — they are the building blocks of every back-of-envelope calculation:
| Power | Exact Value | Approx | Name |
|---|---|---|---|
| 2^10 | 1,024 | ~1 Thousand | 1 KB |
| 2^16 | 65,536 | ~65 Thousand | 64 KB |
| 2^20 | 1,048,576 | ~1 Million | 1 MB |
| 2^30 | 1,073,741,824 | ~1 Billion | 1 GB |
| 2^32 | 4,294,967,296 | ~4 Billion | 4 GB (max 32-bit addr) |
| 2^40 | 1,099,511,627,776 | ~1 Trillion | 1 TB |
| 2^50 | 1,125,899,906,842,624 | ~1 Quadrillion | 1 PB |
Latency Numbers Every Engineer Should Know
These numbers, originally compiled by Jeff Dean, help you reason about where time is spent in a system:
| Operation | Latency | Notes |
|---|---|---|
| L1 cache reference | 0.5 ns | Fastest memory access |
| Branch mispredict | 5 ns | CPU pipeline stall |
| L2 cache reference | 7 ns | ~14× L1 |
| Mutex lock/unlock | 25 ns | Thread synchronization |
| Main memory reference (RAM) | 100 ns | ~200× L1 |
| Compress 1 KB (Snappy) | 3 μs | 3,000 ns |
| Send 1 KB over 1 Gbps network | 10 μs | Network I/O starts here |
| Read 4 KB randomly from SSD | 150 μs | ~1,500× RAM |
| Read 1 MB sequentially from memory | 250 μs | Fast bulk read |
| Round trip within same datacenter | 500 μs | 0.5 ms |
| Read 1 MB sequentially from SSD | 1 ms | SSD sequential is fast |
| HDD seek | 10 ms | Avoid random HDD reads |
| Read 1 MB sequentially from HDD | 20 ms | ~20× slower than SSD |
| Send packet CA → Netherlands → CA | 150 ms | Speed of light limit |
Common Conversions
| What | Conversion |
|---|---|
| 1 day | ≈ 86,400 seconds (round to 100K for estimation) |
| 1 million requests/day | ≈ 12 requests/second |
| 1 billion requests/day | ≈ 12,000 requests/second |
| Peak QPS | Assume 2–3× the average QPS |
| 100M users × 1 KB each | ≈ 100 GB |
| Availability: 99.9% | ≈ 8.76 hours downtime/year |
| Availability: 99.99% | ≈ 52.6 minutes downtime/year |
| Availability: 99.999% | ≈ 5.26 minutes downtime/year |
| 1 character (UTF-8) | ≈ 1–4 bytes (assume 1 for English) |
| UUID / GUID | ≈ 128 bits = 16 bytes |
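The availability rows in the table come straight from the downtime budget formula: allowed downtime = (1 − availability) × time period. A two-line check:

```python
# Downtime budget for each "number of nines" in the table above.
HOURS_PER_YEAR = 365.25 * 24

def downtime_minutes_per_year(availability: float) -> float:
    return (1 - availability) * HOURS_PER_YEAR * 60

for a in (0.999, 0.9999, 0.99999):
    print(f"{a:.3%} -> {downtime_minutes_per_year(a):.1f} min/year")
```

Each extra nine cuts the budget by 10×, which is why 99.999% is so expensive: you have about five minutes a year for every failure, deploy, and migration combined.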
3 · Component Cheat Sheet
One of the most common questions in a design interview is "Why did you choose this component?" Here is a quick reference for when to use each building block:
🔀 Load Balancer
When: Multiple app server instances. Options: L4 (TCP) for raw throughput, L7 (HTTP) for content-based routing. Use round-robin, least-connections, or consistent hashing.
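Of the balancing strategies above, consistent hashing is the one interviewers most often ask you to explain. A minimal ring sketch (class and parameter names are illustrative); virtual nodes smooth the key distribution, and removing a server only remaps the keys that hashed to it:

```python
import bisect
import hashlib

class ConsistentHashRing:
    def __init__(self, servers, vnodes=100):
        # Each server appears `vnodes` times on the ring under derived names.
        self._ring = []  # sorted list of (hash, server)
        for s in servers:
            for v in range(vnodes):
                self._ring.append((self._hash(f"{s}#{v}"), s))
        self._ring.sort()

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def get(self, key: str) -> str:
        """Route key to the first server clockwise from its hash."""
        h = self._hash(key)
        i = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[i][1]
```

With plain modulo hashing, adding one server remaps almost every key; with the ring, only ~1/N of keys move.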
🌐 CDN
When: Serving static assets (images, JS, CSS) or media to global users. Reduces latency via edge caching. Push CDN for predictable content, pull CDN for dynamic.
⚡ Cache (Redis / Memcached)
When: Read-heavy workloads, expensive DB queries, session storage. Cache-aside for general use, write-through for consistency, write-behind for write-heavy.
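The cache-aside pattern mentioned above fits in a few lines. In this sketch `cache` and `db` are plain dicts standing in for Redis and the database, and `stats` is just for illustration:

```python
def get_user(user_id, cache, db, stats):
    """Cache-aside read: check the cache first, fall back to the DB on a miss."""
    user = cache.get(user_id)
    if user is not None:
        stats["hits"] += 1
        return user
    stats["misses"] += 1
    user = db[user_id]       # the expensive query, only on a miss
    cache[user_id] = user    # populate so the next read is a hit
    return user
```

The classic follow-up is invalidation: on writes, delete (don't update) the cache entry, and set a TTL as a safety net against stale data.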
📨 Message Queue (Kafka)
When: Async processing, decoupling services, event-driven architectures, buffering spikes. Kafka for event streaming, RabbitMQ for task queues.
🗄️ SQL Database
When: ACID transactions, complex queries/joins, structured relational data. PostgreSQL/MySQL. Good up to ~10K writes/sec per node with read replicas.
📦 NoSQL Database
When: Massive scale, flexible schema, high write throughput. Key-value for simple lookups, document for nested data, wide-column for time-series.
🔍 Search (Elasticsearch)
When: Full-text search, autocomplete, log analytics. Inverted index for fast text lookups. Elasticsearch/Solr as secondary index alongside primary DB.
🪣 Object Storage (S3)
When: Storing blobs — images, videos, files, backups. Virtually unlimited scale, pay-per-use. Combine with CDN for fast delivery.
🛡️ Rate Limiter
When: Protecting APIs from abuse, DDoS mitigation, fair usage enforcement. Token bucket for bursty traffic, sliding window for precision.
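A token bucket, the bursty-traffic option named above, is small enough to sketch in full. Tokens refill continuously at `rate` per second up to `capacity`; each request spends one token, so short bursts up to `capacity` are allowed (the clock is injectable here purely for testing):

```python
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: int, now=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity       # maximum burst size
        self.tokens = float(capacity)  # start full
        self.now = now
        self.last = now()

    def allow(self) -> bool:
        """Spend one token if available; refill based on elapsed time first."""
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a distributed deployment the counter moves into Redis (often as a Lua script so refill-and-spend is atomic), which is the "distributed counter" mentioned in the pattern table below.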
🚪 API Gateway
When: Single entry point for microservices. Handles auth, rate limiting, routing, SSL termination, request transformation. Often combined with service discovery.
🔐 Distributed Lock
When: Mutual exclusion across services — leader election, preventing double-processing. Redis Redlock or ZooKeeper. Use sparingly — locks reduce throughput.
📊 Monitoring & Logging
When: Always. Every design should mention metrics, logs, and traces. Prometheus + Grafana for metrics, ELK for logs, Jaeger for distributed tracing.
4 · Pattern-to-Problem Mapping
This table maps common system design problems to the key patterns and concepts they test. Use it to identify which patterns are relevant when you hear a problem statement:
| System | Key Patterns & Concepts | Difficulty |
|---|---|---|
| URL Shortener | Base62 encoding, Key Generation Service (KGS), consistent hashing, read-heavy caching | Easy |
| Pastebin | Object storage (S3) for content, metadata in SQL, unique ID generation, TTL-based expiry | Easy |
| Rate Limiter | Token bucket / sliding window, distributed counter (Redis), API gateway integration | Easy |
| Chat System | WebSocket connections, fan-out, presence service, message queues, last-seen tracking, delivery receipts | Medium |
| News Feed | Fan-out-on-write (push) vs. fan-out-on-read (pull), hybrid approach, ranked feed cache, celebrity problem | Medium |
| Notification System | Priority queues, event-driven architecture, rate limiting per user, multi-channel delivery (push/SMS/email) | Medium |
| Twitter / Social Media | Feed generation (fan-out hybrid), sharding by user_id, celebrity handling, search index, trending topics | Medium |
| Instagram / Photo Sharing | Object storage + CDN for images, timeline generation, image processing pipeline, Bloom filters for duplicate detection | Medium |
| YouTube / Video Streaming | Adaptive bitrate streaming (HLS/DASH), transcoding pipeline, CDN distribution, recommendation engine, chunked object storage | Hard |
| Uber / Ride Sharing | Geospatial indexing (QuadTree / S2), real-time matching, Kafka for event streaming, ETA calculation, surge pricing | Hard |
| Google Maps | Graph algorithms (Dijkstra/A*), map tiling, geospatial storage, CDN for tiles, real-time traffic aggregation | Hard |
| Distributed Key-Value Store | Consistent hashing, vector clocks, gossip protocol, quorum reads/writes, anti-entropy | Hard |
| Stock Exchange | Order matching engine, event sourcing / CQRS, ultra-low latency, sequencer, deterministic replay | Expert |
| Payment System | Saga pattern, idempotency keys, 2PC, reconciliation, double-entry ledger, PCI compliance | Expert |
| Search Engine | Inverted index, web crawler, PageRank, distributed indexing, query parsing, spell correction | Expert |
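The base62 encoding in the URL Shortener row is a favorite warm-up: map an auto-incrementing numeric ID onto [0-9a-z-A-Z] to get a short slug. A self-contained sketch:

```python
import string

# 62-character alphabet: digits, lowercase, uppercase.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(n: int) -> str:
    """Convert a non-negative integer ID to its base62 slug."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n:
        n, r = divmod(n, 62)
        out.append(ALPHABET[r])
    return "".join(reversed(out))

def decode_base62(s: str) -> int:
    n = 0
    for ch in s:
        n = n * 62 + ALPHABET.index(ch)
    return n
```

Seven base62 characters cover 62^7 ≈ 3.5 trillion IDs, which is why shortener designs settle on 6–7 character slugs.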
5 · Common Mistakes
Across hundreds of mock interviews, these are the mistakes we have seen candidates make most often — and how to avoid each one:
- ❌ Not asking clarifying questions. Jumping to design immediately signals you don't think critically. Always spend the first 3–5 minutes understanding the problem. Even if you think you know what "Design Twitter" means, ask: "Should we focus on the feed, the posting, search, or DMs?"
- ❌ Jumping to the solution. Starting with "Let's use Kafka and Redis" before defining what the system needs to do. Always requirements first, then design. The interviewer wants to see your thought process, not just the answer.
- ❌ Over-engineering. Adding microservices, CQRS, event sourcing, and a service mesh for a system that handles 100 requests/day. Match the complexity of your architecture to the scale of the problem. A monolith is a valid answer for many scales.
- ❌ Ignoring non-functional requirements. Designing a beautiful architecture that doesn't meet the latency, availability, or consistency requirements. Non-functionals drive the architecture — not the other way around.
- ❌ Not discussing trade-offs. Every decision has pros and cons. Saying "I'll use NoSQL" without explaining why (e.g., "because we need flexible schema and high write throughput at the cost of weaker consistency") is a missed opportunity to show depth.
- ❌ Single point of failure. Every component in your design should be redundant. If your load balancer, database, or cache has a single instance, call it out and explain how you'd add redundancy.
- ❌ Not mentioning monitoring. A production system without monitoring, alerting, and logging is incomplete. Even a one-sentence mention ("We'd add Prometheus metrics and distributed tracing with Jaeger") shows production awareness.
6 · Red Flags Interviewers Watch For
Beyond common mistakes, these are the behaviors that actively lower your interview score. Interviewers are trained to spot them:
- 🚩 No requirements gathering at all. Immediately drawing boxes without a single question. This suggests you memorized a solution rather than thinking from first principles. Interviewers often intentionally leave the problem vague to test this.
- 🚩 Monolithic thinking. Putting everything in one giant server. Even if you start with a monolith (which is fine!), you should articulate how you'd scale it: which parts you'd extract, and at what trigger points.
- 🚩 Ignoring failure modes. A design that only works when everything is up. The interviewer will ask "What happens when X goes down?" and you should have thought about it. Mention circuit breakers, retries, fallbacks, and replication.
- 🚩 Magic numbers without estimation. Saying "We need 100 servers" without showing the math. Even a rough calculation shows rigor: "At 12K writes/sec and ~1K writes/sec per server, we need ~12 servers plus buffer."
- 🚩 No trade-off discussion. Presenting a design as though every choice is obvious. Real engineering is about trade-offs. The interviewer wants to hear: "I chose X over Y because…" with clear reasoning.
- 🚩 Buzzword dropping. Saying "blockchain," "machine learning," or "serverless" without understanding how they apply. It's better to use fewer technologies well than to name-drop technologies you can't defend.
- 🚩 Not adapting to feedback. If the interviewer hints that a component won't scale or suggests a different approach, don't stubbornly stick to your original design. Adaptability is a key signal.
- 🚩 No API or data model discussion. Talking only in abstract boxes without defining what data flows between them. Even simple REST endpoints and table schemas show you think concretely about interfaces.
7 · Topic Difficulty Tiers
Not all system design problems are equal. Here's how they break down by difficulty, so you can prioritize your study time and know what to expect at different interview levels:
Easy — Warm-up & Phone Screens
These problems have well-known solutions and test whether you can articulate basic system design concepts clearly. Expected in phone screens and junior/mid-level interviews.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| URL Shortener | Hashing, base62, KGS, read-heavy caching, redirection | 2–3 hours |
| Pastebin | Object storage, metadata DB, TTL, unique IDs | 2–3 hours |
| Rate Limiter | Token bucket, sliding window, distributed counters | 2–3 hours |
| Key-Value Store (single node) | Hash table, write-ahead log, compaction, in-memory index | 3–4 hours |
Medium — Standard On-site
These are the most common on-site interview questions. They require combining multiple building blocks and making meaningful trade-off decisions.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| Chat System (WhatsApp) | WebSockets, fan-out, presence, message queues, ordering | 4–6 hours |
| News Feed (Facebook) | Fan-out strategies, ranking, caching, celebrity problem | 4–6 hours |
| Notification System | Event-driven architecture, priority queues, multi-channel | 3–5 hours |
| Twitter | Feed generation, sharding, trending, search | 5–6 hours |
| Instagram | Image pipeline, CDN, timeline, explore/recommendations | 4–6 hours |
| Web Crawler | BFS/DFS, URL frontier, politeness, deduplication, robots.txt | 4–5 hours |
| Typeahead / Autocomplete | Trie data structure, top-K, distributed tries, ranking | 3–5 hours |
Hard — Senior & Staff-Level
These require deep understanding of distributed systems, data-intensive architectures, and nuanced trade-offs. Expected in senior and staff-level loops.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| YouTube / Netflix | Video transcoding, adaptive streaming, CDN, recommendations | 6–8 hours |
| Uber / Lyft | Geospatial index, real-time matching, ETA, surge pricing | 6–8 hours |
| Google Maps | Graph algorithms, map tiling, traffic, ETA computation | 6–8 hours |
| Distributed KV Store | Consistent hashing, vector clocks, quorum, gossip | 8–10 hours |
| Slack / Discord | Real-time messaging at scale, presence, channels, search | 6–8 hours |
| Dropbox / Google Drive | File sync, chunking, deduplication, conflict resolution | 6–8 hours |
Expert — Principal & Architect
These problems have no single "right" answer and require deep domain knowledge. Expected in principal/architect interviews or as follow-up deep dives.
| Problem | Core Concepts Tested | Time to Prep |
|---|---|---|
| Payment System | Saga pattern, idempotency, double-entry ledger, PCI | 8–10 hours |
| Stock Exchange | Order matching, event sourcing, ultra-low latency, CQRS | 10–12 hours |
| Search Engine | Web crawling, inverted index, PageRank, distributed indexing | 10–12 hours |
| Ad System (Google Ads) | Real-time bidding, auction, ad serving, click prediction | 8–10 hours |
8 · 8-Week Study Plan
This plan covers the entire 70-post series in a structured 8-week roadmap. Each week builds on the previous one. Aim for 1–2 hours of focused study per day.
📅 Week 1 — Foundations
- System Design Introduction
- Horizontal & Vertical Scaling
- Load Balancing
- Caching Strategies
- Content Delivery Networks
- Database Fundamentals
- API Design
- Practice: Design a URL Shortener
📅 Week 2 — Data & Consistency
- ACID vs. BASE
- CAP Theorem
- Consistent Hashing
- SQL Internals
- NoSQL: Key-Value Stores
- NoSQL: Document Stores
- Practice: Design a Pastebin
📅 Week 3 — Storage & Search
- NoSQL: Wide-Column Stores
- NoSQL: Graph Databases
- Object Storage (S3)
- Time-Series Databases
- Search Engines (Elasticsearch)
- Practice: Design a Rate Limiter
📅 Week 4 — Building Blocks
- Database Sharding
- Replication
- Message Queues & Kafka
- Rate Limiting Algorithms
- Proxies & Reverse Proxies
- Bloom Filters
- Practice: Design a Chat System
📅 Week 5 — Infrastructure
- Service Discovery
- Heartbeat & Health Checks
- Logging & Monitoring
- Distributed Locking
- Practice: Design a News Feed System
- Practice: Design a Notification System
📅 Week 6 — Distributed Systems
- Consensus (Paxos / Raft)
- Leader Election
- Vector Clocks
- Gossip Protocol
- Two-Phase Commit
- Saga Pattern
- Circuit Breaker
- Practice: Design a Distributed KV Store
📅 Week 7 — Architecture Patterns
- Event-Driven Architecture
- CQRS & Event Sourcing
- Domain-Driven Design
- Data Pipelines
- Practice: Design YouTube / Uber
📅 Week 8 — Case Studies & Review
- Review all case studies in the series
- Practice: Payment System (Expert)
- Practice: Stock Exchange (Expert)
- Mock interviews with a partner
- Review this cheat sheet one final time
- 🎯 You're ready!