Design: Twitter
Requirements & Scale
Twitter (now X) is one of the most demanding real-time platforms on the internet. It combines the write-heavy nature of a messaging system with the read-heavy nature of a news feed, all under strict latency constraints. Before we design anything, let’s crystallize the requirements.
Functional Requirements
- Post tweets: Users create text posts up to 280 characters, optionally with images, videos, or links.
- Follow users: Users follow/unfollow other users. The follow relationship is asymmetric — following someone doesn’t mean they follow you back.
- Home timeline: Display a reverse-chronological (with ranking) feed of tweets from users you follow.
- User timeline: Display all tweets by a specific user.
- Search: Full-text real-time search across all tweets.
- Trending topics: Surface the most popular hashtags and topics in real time, globally and regionally.
- Engagement: Like, retweet, reply, quote-tweet.
- Notifications: Mentions, retweets, likes, and new follower alerts.
Non-Functional Requirements
- Availability: 99.99% uptime — Twitter is a global communications platform.
- Latency: Home timeline loads in <200ms. Tweet posting confirms in <500ms.
- Consistency: Eventual consistency is acceptable for timelines (a few seconds delay is fine). Strong consistency for tweet creation and follow/unfollow.
- Durability: Zero tweet loss — tweets are permanent records.
Scale Estimation
The numbers that drive every design decision:
| Metric | Value | Implications |
|---|---|---|
| Daily Active Users (DAU) | 400M | Massive read load on timelines |
| New tweets per day | 500M | ~5,800 tweets/sec average, 15K+ peak |
| Average follows per user | 200 | Each tweet fans out to ~200 timelines |
| Timeline reads per day | 100B | ~1.2M reads/sec — read:write ratio of 200:1 |
| Average tweet size | ~300 bytes (text + metadata) | ~150 GB/day of tweet text alone |
| Media attachments | ~30% of tweets have media | ~15 TB/day of images and video |
| Search queries per day | 1B | ~12K queries/sec, real-time indexing needed |
| Celebrity accounts (>1M followers) | ~50K | Fan-out on write is infeasible for these |
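A quick back-of-envelope pass (illustrative Python, using the table's own numbers) confirms the derived rates:

SECONDS_PER_DAY = 86_400

tweets_per_sec       = 500e6 / SECONDS_PER_DAY   # ~5,800 average; peaks roughly 3x higher
timeline_reads_sec   = 100e9 / SECONDS_PER_DAY   # ~1.16M reads/sec
read_write_ratio     = 100e9 / 500e6             # 200:1
tweet_text_per_day   = 500e6 * 300               # ~150 GB/day of text + metadata
media_tweets_per_day = 500e6 * 0.30              # ~150M tweets/day carry media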
High-Level Architecture
The system decomposes into several core services, each scaling independently:
┌─────────────────────┐
│ API Gateway / │
Mobile / Web Clients ──▶ │ Load Balancer │
└──────┬──────────────┘
│
┌──────────────────────┼──────────────────────┐
▼ ▼ ▼
┌──────────────┐ ┌──────────────────┐ ┌──────────────┐
│ Tweet Service │ │ Timeline Service │ │ User Service │
│ (write path) │ │ (read path) │ │ (profiles, │
│ │ │ │ │ auth, follow) │
└──────┬───────┘ └────────┬─────────┘ └──────┬───────┘
│ │ │
▼ ▼ ▼
┌──────────────┐ ┌──────────────────┐ ┌──────────────┐
│ Fan-out │ │ Timeline Cache │ │ Social Graph │
│ Service │ │ (Redis Cluster) │ │ (Graph DB / │
└──────┬───────┘ └──────────────────┘ │ Redis) │
│ └──────────────┘
▼
┌──────────────┐ ┌──────────────────┐ ┌──────────────┐
│ Tweet Store │ │ Search Service │ │ Media Service │
│ (MySQL/ │ │ (Elasticsearch) │ │ (S3 + CDN) │
│ Cassandra) │ └──────────────────┘ └──────────────┘
└──────────────┘
│ ┌──────────────────┐ ┌──────────────┐
└──────────▶ │ Trending Service │ │ Notification │
│ (Storm/Flink) │ │ Service │
└──────────────────┘ └──────────────┘
Each service is independently deployable, horizontally scalable, and communicates via a combination of synchronous RPCs (for user-facing paths) and asynchronous message queues (for fan-out, search indexing, and analytics).
Tweet Service
The tweet service handles the critical write path — accepting a new tweet, persisting it, and kicking off the downstream pipeline (fan-out, search indexing, trending, notifications).
Tweet Creation Flow
1. Client sends POST /api/tweets with text (max 280 chars), optional media IDs (pre-uploaded), and metadata (reply_to, quote_tweet_id).
2. API Gateway authenticates the request (OAuth 2.0 bearer token), checks rate limits (300 tweets/3 hours per user), and routes to the Tweet Service.
3. Tweet Service validates the payload:
   - Text length ≤ 280 Unicode characters (not bytes — CJK characters count as one).
   - Media IDs reference valid, uploaded media objects.
   - Spam/abuse check via ML classifier (synchronous, <50ms).
4. Generate Snowflake ID (more on this below) — this becomes the tweet’s globally unique, time-sortable identifier.
5. Write to Tweet Store — synchronous write to the primary datastore.
6. Publish to message queue (Kafka) — a single event triggers multiple downstream consumers:
   - Fan-out Service (timeline delivery)
   - Search Indexing Pipeline
   - Trending Topics Pipeline
   - Notification Service
   - Analytics Pipeline
7. Return 201 Created with the tweet object to the client.
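Putting those steps together, a simplified handler might look like the sketch below. The snowflake generator is defined in the next section; tweet_store, kafka_producer, media_service, and spam_classifier are hypothetical clients standing in for the real dependencies.

MAX_TWEET_CHARS = 280

def create_tweet(user_id, text, media_ids=None, reply_to_id=None):
    # 1. Validate: length in characters (not bytes), media readiness, spam check
    if not (0 < len(text) <= MAX_TWEET_CHARS):
        raise ValueError("tweet must be 1-280 characters")
    if media_ids and not media_service.all_ready(media_ids):    # hypothetical helper
        raise ValueError("media not uploaded")
    if spam_classifier.is_spam(user_id, text):                  # synchronous, <50ms budget
        raise PermissionError("rejected by spam filter")

    # 2. Assign a globally unique, time-sortable ID
    tweet = {
        "tweet_id": snowflake.next_id(),
        "user_id": user_id,
        "text": text,
        "media_ids": media_ids or [],
        "reply_to_id": reply_to_id,
    }

    # 3. Durable write first, then the async pipeline via Kafka
    tweet_store.insert(tweet)
    kafka_producer.send("new-tweets", key=str(user_id), value=tweet)
    return tweet   # the API layer serializes this as 201 Created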
Tweet Data Model
The tweets table is the heart of the system. It needs to support two access patterns: lookup by tweet_id (O(1)) and range scans by user_id + time (user timeline).
| Column | Type | Notes |
|---|---|---|
| tweet_id | BIGINT (Snowflake ID) | Primary key, time-sortable |
| user_id | BIGINT | Author, indexed for user timeline |
| text | VARCHAR(1120) | 280 chars × 4 bytes (UTF-8 max) |
| media_ids | JSON / ARRAY | References to media objects |
| reply_to_id | BIGINT, nullable | Parent tweet for threading |
| retweet_of_id | BIGINT, nullable | Original tweet for retweets |
| quote_tweet_id | BIGINT, nullable | Quoted tweet reference |
| like_count | INT | Denormalized counter |
| retweet_count | INT | Denormalized counter |
| reply_count | INT | Denormalized counter |
| hashtags | JSON / ARRAY | Extracted hashtags for trending |
| language | CHAR(2) | Detected language code |
| geo | POINT, nullable | Optional geolocation |
| created_at | TIMESTAMP | Redundant with Snowflake ID, but useful |
Partitioning Strategy
At 500M tweets/day, a single database is impossible. The tweet store is partitioned by tweet_id (specifically, by the Snowflake ID):
- Why not partition by user_id? Because celebrity users (like @barackobama with 130M followers) would create massive hot partitions. Partitioning by (a hash of) tweet_id distributes writes uniformly across shards, since every tweet gets a fresh ID regardless of who posted it.
- Shard count: ~4,096 logical shards, mapped to ~256 physical database nodes via consistent hashing. Each shard handles ~120K tweets/day — well within a single MySQL/Cassandra node’s write capacity.
- User timeline queries (SELECT * FROM tweets WHERE user_id = ? ORDER BY tweet_id DESC LIMIT 20) require a scatter-gather across shards. This is acceptable because:
  - User timeline is a secondary access pattern (home timeline is primary).
  - We maintain a secondary index table partitioned by user_id that maps to tweet_ids, avoiding full scatter-gather.
-- Secondary index table (partitioned by user_id)
CREATE TABLE user_tweets (
user_id BIGINT,
tweet_id BIGINT, -- Snowflake ID (also encodes timestamp)
PRIMARY KEY (user_id, tweet_id)
) WITH CLUSTERING ORDER BY (tweet_id DESC);
-- Query user timeline: single partition read
SELECT tweet_id FROM user_tweets
WHERE user_id = 12345
LIMIT 20;
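Routing a write to its shard is then a cheap, local computation. The sketch below uses a hash of the tweet_id and a static shard-to-node map as a simplification of the consistent-hashing scheme described above (in practice the map would live in a config service):

import hashlib

LOGICAL_SHARDS = 4096
PHYSICAL_NODES = 256

# Simplified static assignment of logical shards to physical nodes.
SHARD_TO_NODE = {shard: shard % PHYSICAL_NODES for shard in range(LOGICAL_SHARDS)}

def logical_shard(tweet_id):
    # Hash the ID rather than taking it modulo directly: the low bits of a
    # Snowflake ID are the sequence counter and are heavily skewed toward 0.
    digest = hashlib.md5(str(tweet_id).encode()).digest()
    return int.from_bytes(digest[:4], "big") % LOGICAL_SHARDS

def route_tweet(tweet_id):
    shard = logical_shard(tweet_id)
    return shard, SHARD_TO_NODE[shard]

Resharding only changes the shard-to-node map; the tweet_id-to-shard mapping itself never moves.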
Snowflake ID Generation
Twitter invented the Snowflake ID scheme to generate globally unique, roughly time-ordered 64-bit IDs without coordination between servers. This is critical because:
- Time-ordering: Tweets in a timeline are ordered by ID, which is equivalent to ordering by creation time. No need for a separate timestamp sort.
- No coordination: Each worker generates IDs independently — no central authority, no distributed locks, no database sequences.
- Compact: 64-bit integers are cache-friendly and fit in a single CPU register.
Bit Layout
┌─────────────────────────────────────────────────────────────────┐
│ 64-bit Snowflake ID │
├──────┬────────────────────────┬──────────┬──────────┬───────────┤
│ Sign │ Timestamp (ms) │ Datacenter│ Worker │ Sequence │
│ 1 bit│ 41 bits │ 5 bits │ 5 bits │ 12 bits │
├──────┼────────────────────────┼──────────┼──────────┼───────────┤
│ 0 │ ms since custom epoch │ 0-31 │ 0-31 │ 0-4095 │
│ │ (2010-11-04 01:42:54) │ │ │ │
└──────┴────────────────────────┴──────────┴──────────┴───────────┘
Key properties of this layout:
| Component | Bits | Range | Implication |
|---|---|---|---|
| Sign | 1 | Always 0 | Keeps IDs positive in signed 64-bit integers |
| Timestamp | 41 | ~69.7 years | IDs are time-sortable; epoch runs until ~2079 |
| Datacenter ID | 5 | 0–31 | Supports 32 datacenters |
| Worker ID | 5 | 0–31 | 32 workers per datacenter = 1,024 total workers |
| Sequence | 12 | 0–4095 | 4,096 IDs per millisecond per worker |
Total throughput: 1,024 workers × 4,096 IDs/ms = 4,194,304 IDs per millisecond — far exceeding Twitter’s peak of ~15K tweets/sec. The sequence counter resets every millisecond; if a worker exhausts 4,096 sequences within 1ms, it busy-waits until the next millisecond.
Implementation Sketch
import time

def current_time_ms():
    return int(time.time() * 1000)

class SnowflakeGenerator:
    EPOCH = 1288834974657  # Custom epoch: 2010-11-04T01:42:54.657Z

    def __init__(self, datacenter_id, worker_id):
        assert 0 <= datacenter_id < 32
        assert 0 <= worker_id < 32
        self.datacenter_id = datacenter_id
        self.worker_id = worker_id
        self.sequence = 0
        self.last_timestamp = -1

    def next_id(self):
        timestamp = current_time_ms()
        if timestamp < self.last_timestamp:
            # Clock moved backwards: refuse to generate IDs until it catches up
            raise RuntimeError("clock moved backwards; refusing to generate ID")
        if timestamp == self.last_timestamp:
            self.sequence = (self.sequence + 1) & 0xFFF  # 12-bit mask
            if self.sequence == 0:
                # Exhausted 4,096 IDs in this millisecond; busy-wait for the next one
                timestamp = self._wait_next_ms(timestamp)
        else:
            self.sequence = 0
        self.last_timestamp = timestamp
        return ((timestamp - self.EPOCH) << 22 |
                (self.datacenter_id << 17) |
                (self.worker_id << 12) |
                self.sequence)

    def _wait_next_ms(self, last_ts):
        ts = current_time_ms()
        while ts <= last_ts:
            ts = current_time_ms()
        return ts
Clock discipline matters here: if a worker's clock steps backward, it could emit duplicate or out-of-order IDs, so the generator refuses to issue IDs until the clock catches up (the guard above). On these hosts, ntpd is configured with tinker panic 0 to prevent large clock jumps from disrupting ID generation.
Timeline Service — Hybrid Fan-Out
The home timeline is the single most important and most complex feature. Every time a user opens Twitter, they expect to see a fresh, personalized feed of tweets from people they follow — in under 200ms. This is where the hybrid fan-out approach comes in.
Three Approaches to Timeline Generation
Approach 1: Fan-Out on Read (Pull Model)
GET /home_timeline?user_id=123
1. Look up user 123's following list: [user_A, user_B, user_C, ...]
2. For each followed user, query their recent tweets
3. Merge and sort all tweets by timestamp
4. Return top N tweets
-- SQL equivalent (conceptual):
SELECT * FROM tweets
WHERE user_id IN (SELECT followee_id FROM follows WHERE follower_id = 123)
ORDER BY tweet_id DESC
LIMIT 20;
Problem: If a user follows 500 people, this is 500 database queries per timeline load. At 1.2M timeline reads/sec, that’s 600M database queries/sec — completely infeasible. Latency would be seconds, not milliseconds.
Approach 2: Fan-Out on Write (Push Model)
User A posts a tweet:
1. Look up A's followers: [user_123, user_456, user_789, ...]
2. For EACH follower, append tweet_id to their timeline cache
3. When user_123 requests home timeline, just read from their pre-built cache
-- Redis operation:
LPUSH timeline:123 <tweet_id>
LPUSH timeline:456 <tweet_id>
LPUSH timeline:789 <tweet_id>
LTRIM timeline:123 0 799 -- Keep only last 800 tweets
Problem: When a celebrity with 50M followers tweets, the fan-out service must write to 50 million timeline caches. That is 50 million Redis writes (gigabytes of cache churn) for a single tweet. With celebrities tweeting 5–20 times per day, the write amplification is catastrophic.
Approach 3: Hybrid (Twitter’s Actual Approach)
Twitter uses a hybrid model that combines the best of both approaches:
- Normal users (<10K followers): Fan-out on write. When they tweet, push tweet_id into each follower’s timeline cache. This covers ~99% of all tweets.
- Celebrity users (≥10K followers; the Celebrity Tweet Handling section later refines this into tiers): Fan-out on read. Their tweets are not pushed to follower caches. Instead, when a user loads their timeline, the system merges:
- Pre-built cache (normal user tweets, pushed)
- Recent tweets from followed celebrities (pulled in real time)
Timeline Cache Design in Redis
The timeline cache is a Redis Cluster storing pre-computed timelines. This is the most latency-sensitive component in the entire system.
Data Structure
Key: timeline:{user_id}
Type: Sorted Set (ZSET)
Score: Snowflake ID (which is time-sortable)
Member: tweet_id (as string)
-- Fan-out write: celebrity tweets handled separately
ZADD timeline:123 1780012345678901234 "1780012345678901234"
-- Read home timeline (newest 20 tweets)
ZREVRANGEBYSCORE timeline:123 +inf -inf LIMIT 0 20
-- Trim to keep cache bounded (800 entries max)
ZREMRANGEBYRANK timeline:123 0 -801
Key design decisions for the Redis timeline cache:
| Decision | Choice | Rationale |
|---|---|---|
| Data structure | Sorted Set (ZSET) | O(log N) insert and range queries by score. Score = Snowflake ID gives time ordering for free. |
| What we store | Only tweet_ids, not full tweets | Keeps cache compact. Full tweet data fetched from Tweet Cache on read (batch MGET). A timeline of 800 tweet_ids is ~6.4KB vs ~240KB for full tweet objects. |
| Cache size per user | 800 tweet_ids | Covers ~2–3 days of content for an average user following 200 accounts. Beyond this, we fall back to database queries. |
| Eviction | Manual trim after each insert | ZREMRANGEBYRANK keeps exactly 800 entries. No reliance on Redis eviction policies. |
| Cluster topology | Redis Cluster, 1000+ nodes | 400M users × 6.4KB = ~2.5TB minimum. With replication factor 3 and headroom: ~10TB cluster. |
| Persistence | None (pure cache) | If a node dies, timelines are rebuilt from the Tweet Store. Cold-start penalty is acceptable for rare events. |
Timeline Read Path (Merged)
def get_home_timeline(user_id, count=20):
# Step 1: Get pre-built timeline from cache (normal user tweets)
cached_tweet_ids = redis.zrevrangebyscore(
f"timeline:{user_id}", "+inf", "-inf",
start=0, num=count * 2 # Over-fetch for merge headroom
)
# Step 2: Get followed celebrities
celebrity_followees = get_celebrity_followees(user_id)
# Step 3: Pull recent tweets from each celebrity (fan-out on read)
celebrity_tweets = []
for celeb_id in celebrity_followees:
tweets = redis.zrevrangebyscore(
f"user_tweets:{celeb_id}", "+inf", "-inf",
start=0, num=count
)
celebrity_tweets.extend(tweets)
    # Step 4: Merge, sort by Snowflake ID (= chronological), take top N
    # Cast to int so ordering is numeric, not lexicographic on strings from Redis
    all_tweet_ids = [int(t) for t in cached_tweet_ids + celebrity_tweets]
    all_tweet_ids.sort(reverse=True)  # Descending = newest first
    top_ids = all_tweet_ids[:count]
# Step 5: Hydrate tweet objects (batch fetch from Tweet Cache)
tweets = tweet_cache.mget(top_ids) # Redis MGET, <1ms
# Step 6: Hydrate user profiles (batch fetch)
user_ids = {t.user_id for t in tweets}
users = user_cache.mget(user_ids)
return enrich(tweets, users)
The average user follows 3–5 celebrities. So step 3 adds only 3–5 Redis lookups to the timeline read — negligible latency overhead.
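On the write side, the fan-out consumer is where the normal/celebrity split is enforced. A minimal sketch, assuming a redis-py client and hypothetical is_celebrity and get_follower_ids helpers (the latter paging through the social graph):

TIMELINE_MAX_ENTRIES = 800

def handle_new_tweet(event, redis):
    """Consume one 'new-tweets' event and deliver it to follower timelines."""
    author_id, tweet_id = event["user_id"], event["tweet_id"]

    if is_celebrity(author_id):
        # Celebrity path: a single write to the author's own ZSET; followers pull on read
        redis.zadd(f"celebrity_tweets:{author_id}", {str(tweet_id): tweet_id})
        return

    # Normal path: push the tweet_id into every follower's pre-built timeline
    for batch in get_follower_ids(author_id, batch_size=10_000):
        pipe = redis.pipeline(transaction=False)
        for follower_id in batch:
            key = f"timeline:{follower_id}"
            pipe.zadd(key, {str(tweet_id): tweet_id})
            pipe.zremrangebyrank(key, 0, -(TIMELINE_MAX_ENTRIES + 1))  # keep newest 800
        pipe.execute()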
Social Graph Service
The social graph manages the follow/unfollow relationships between users. It answers three critical questions:
- Who does user X follow? (Following list — needed for fan-out on read of celebrity tweets)
- Who follows user X? (Follower list — needed for fan-out on write)
- Does user X follow user Y? (Follow check — needed for UI rendering and access control)
Data Model Options
Option A: Adjacency List in Redis
-- Who does user 123 follow?
Key: following:123 Type: SET
Members: {456, 789, 101, ...}
-- Who follows user 456?
Key: followers:456 Type: SET
Members: {123, 555, 888, ...}
-- Follow user
SADD following:123 456
SADD followers:456 123
-- Unfollow
SREM following:123 456
SREM followers:456 123
-- Check if following
SISMEMBER following:123 456 → 1 (yes) or 0 (no)
-- Follower count
SCARD followers:456 → 1,340,201
Pros: O(1) follow check, O(1) count, blazing fast. Cons: Memory-intensive. A user with 50M followers has a 50M-member set (~400MB). For celebrities, we store only the count in Redis and page through followers from the database for fan-out operations.
Option B: Graph Database (e.g., FlockDB, TAO)
Twitter originally built FlockDB, a distributed graph database optimized for adjacency-list queries. Facebook uses TAO for the same purpose. Key characteristics:
- Stores edges as (source_id, edge_type, destination_id, timestamp)
- Sharded by source_id for fast “who do I follow?” queries
- Reverse index sharded by destination_id for “who follows me?”
- Supports intersection queries: “which of my friends also follow user X?”
Recommended: Hybrid
Use Redis for hot-path queries (follow check, counts, following list for timeline merge) and a persistent store (MySQL or FlockDB-style) as the source of truth. Redis is populated on follow/unfollow and warmed on user login.
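A sketch of the hot-path follow check under that hybrid: Redis first, with a read-through to the persistent follows table on a cache miss (db.query is a hypothetical helper over MySQL):

def is_following(follower_id, followee_id, redis, db):
    key = f"following:{follower_id}"
    if redis.exists(key):
        return bool(redis.sismember(key, followee_id))

    # Cache miss (cold node, evicted key): read the source of truth and warm Redis
    rows = db.query("SELECT followee_id FROM follows WHERE follower_id = %s", (follower_id,))
    followee_ids = [row[0] for row in rows]
    if followee_ids:
        redis.sadd(key, *followee_ids)
        redis.expire(key, 86_400)   # refresh daily; MySQL remains authoritative
    return followee_id in set(followee_ids)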
Follow/Unfollow Flow
POST /api/users/123/follow body: { target_user_id: 456 }
1. Validate: user 123 exists, user 456 exists, not already following
2. Write to persistent store:
INSERT INTO follows (follower_id, followee_id, created_at) VALUES (123, 456, NOW())
3. Update Redis (async via message queue for consistency):
SADD following:123 456
SADD followers:456 123
INCR follower_count:456
INCR following_count:123
4. If user 456 is a normal user (<10K followers):
→ Backfill: push 456's recent tweets into 123's timeline cache
5. If user 456 is a celebrity:
→ Add 456 to 123's celebrity-followee set (for fan-out on read)
6. Send notification to user 456: "user 123 started following you"
Search Service
Twitter search must index 500M tweets per day and make them searchable within seconds of posting. This is one of the most demanding real-time search systems in the world.
Architecture: Earlybird + Elasticsearch
Twitter uses a search architecture called Earlybird, which is a custom real-time inverted index optimized for recency. Let’s build something similar:
Two-Tier Search
┌──────────────────────────────────────────────────┐
│ Search Query Flow │
│ │
│ Query ──▶ Blender (merges results from both) │
│ │ │ │
│ ▼ ▼ │
│ ┌─────────────────┐ ┌─────────────────────┐ │
│ │ Earlybird │ │ Full Index │ │
│ │ (Real-Time) │ │ (Elasticsearch) │ │
│ │ │ │ │ │
│ │ Last ~10 sec of │ │ Historical tweets │ │
│ │ tweets, in-memory│ │ (days/weeks/months) │ │
│ │ inverted index │ │ sharded by time │ │
│ └─────────────────┘ └─────────────────────┘ │
└──────────────────────────────────────────────────┘
Earlybird: Real-Time Inverted Index
Earlybird is an in-memory, append-only inverted index that processes the most recent tweets (last ~10 seconds). It’s designed for the common case: users searching for breaking news and trending topics want the absolute freshest results.
class EarlybirdIndex:
def __init__(self):
self.inverted_index = {} # token → sorted list of tweet_ids
self.tweet_store = {} # tweet_id → tweet metadata
def index_tweet(self, tweet):
tokens = tokenize(tweet.text) # Lowercase, remove stopwords, stem
for token in tokens:
if token not in self.inverted_index:
self.inverted_index[token] = []
self.inverted_index[token].append(tweet.tweet_id)
self.tweet_store[tweet.tweet_id] = tweet
    def search(self, query, limit=20):
        tokens = tokenize(query)
        if len(tokens) == 1:
            # Posting lists are append-only (ascending tweet_id), so reverse for recency
            posting_list = list(reversed(self.inverted_index.get(tokens[0], [])))
        else:
            # Intersect posting lists for multi-term queries
            sets = [set(self.inverted_index.get(t, [])) for t in tokens]
            posting_list = sorted(set.intersection(*sets), reverse=True)
        return posting_list[:limit]

    def gc(self, max_age_seconds=10):
        """Evict tweets older than max_age_seconds to keep memory bounded."""
        cutoff_id = snowflake_id_from_timestamp(now() - max_age_seconds)
        # Drop old entries from every posting list, then from the tweet store
        for token, posting_list in self.inverted_index.items():
            self.inverted_index[token] = [tid for tid in posting_list if tid >= cutoff_id]
        self.tweet_store = {tid: t for tid, t in self.tweet_store.items()
                            if tid >= cutoff_id}
Earlybird runs on multiple shards (partitioned by hash of tweet_id) with each shard handling ~500 tweets/sec of indexing throughput. A query fans out to all Earlybird shards in parallel, and results are merged by the Blender service.
Full Index: Elasticsearch Cluster
Tweets older than ~10 seconds flow into Elasticsearch for long-term searchability:
- Index sharding: Time-based indices (e.g., tweets-2026-04-15) — one index per day. Older indices are merged and moved to cheaper storage.
- Mapping: Text field analyzed with a custom Twitter tokenizer (handles @mentions, #hashtags, URLs, emojis). Keyword fields for exact-match filters (language, user_id).
- Replication: 2 replicas per shard for read throughput and fault tolerance.
- Cluster size: ~2,000 nodes for the full tweet index. Hot nodes (SSDs) for recent indices, warm nodes (HDDs) for historical data.
Indexing Pipeline
Tweet posted
→ Kafka topic: "new-tweets"
→ Earlybird Consumer: index in-memory (within 1 second)
→ Elasticsearch Consumer: bulk index (within 5-10 seconds)
- Tokenize text (custom analyzer: @mentions, #hashtags, URLs)
- Extract entities (people, places, events)
- Detect language
- Compute engagement signals (initial = 0, updated periodically)
- Write to daily ES index with tweet_id as document ID
Trending Topics
Trending topics surface what the world is talking about right now. The algorithm must be real-time (detect trends within minutes), noise-resistant (filter spam and evergreen topics), and geographically aware.
Trending Algorithm: Sliding Window Frequency Count
The core algorithm uses a sliding window approach over hashtag frequencies, detecting acceleration rather than absolute volume:
For each hashtag H in the last 5 minutes:
current_count = count(H, window=last_5_min)
baseline_count = count(H, window=last_24_hours) / 288 # Normalized to 5-min
trend_score = (current_count - baseline_count) / max(baseline_count, 1)
Trending if:
trend_score > THRESHOLD (e.g., 3.0) AND
current_count > MIN_VOLUME (e.g., 500) AND
NOT in spam_blocklist AND
NOT evergreen (e.g., "love", "good morning")
This means a hashtag trending isn’t about being the most used — it’s about having the highest acceleration. #Love might appear in 100K tweets/5min, but if that’s its normal rate, it doesn’t trend. #SuperBowl going from 100 to 50K in 5 minutes is a trend.
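A minimal in-memory sketch of that scoring loop, keeping one Counter per 5-minute bucket (a production version would hold this state inside Flink/Storm or Redis, and layer on the spam filters described below):

from collections import Counter, deque

BASELINE_BUCKETS = 288   # 24 hours of 5-minute windows

class TrendDetector:
    def __init__(self, threshold=3.0, min_volume=500):
        self.threshold = threshold
        self.min_volume = min_volume
        self.buckets = deque([Counter()], maxlen=BASELINE_BUCKETS)

    def record(self, hashtag):
        self.buckets[-1][hashtag] += 1      # count into the current 5-minute bucket

    def roll_bucket(self):
        self.buckets.append(Counter())      # called every 5 minutes

    def trending(self, blocklist=frozenset()):
        current = self.buckets[-1]
        totals = Counter()
        for bucket in self.buckets:
            totals.update(bucket)
        results = []
        for tag, count in current.items():
            baseline = totals[tag] / len(self.buckets)   # average per 5-minute window
            score = (count - baseline) / max(baseline, 1)
            if score > self.threshold and count > self.min_volume and tag not in blocklist:
                results.append((tag, score))
        return sorted(results, key=lambda x: x[1], reverse=True)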
Implementation with Stream Processing
┌──────────┐ ┌──────────────┐ ┌──────────────────┐ ┌────────────┐
│ Kafka │ ──▶ │ Flink/Storm │ ──▶ │ Trend Detector │ ──▶ │ Trending │
│ new- │ │ Hashtag │ │ (sliding window │ │ Cache │
│ tweets │ │ Extractor │ │ + scoring) │ │ (Redis) │
└──────────┘ └──────────────┘ └──────────────────┘ └────────────┘
│
▼
┌──────────────┐
│ Spam Filter │
│ (ML model + │
│ blocklist) │
└──────────────┘
Multi-Level Trending
- Global trends: Aggregated across all tweets worldwide.
- Country trends: Filtered by user’s country (from profile or IP geolocation).
- City trends: Further localized for major metros. Requires geo-tagged tweets or location inference from user profiles.
- Personalized trends: Weighted by the user’s interests and following graph. If you follow many sports accounts, sports trends rank higher.
Noise & Spam Filtering
Without filtering, trending topics would be dominated by spam bots and coordinated manipulation campaigns. Multiple layers of defense:
- Velocity check: If a hashtag goes from 0 to 10K in under 60 seconds, it’s likely bot-driven. Legitimate trends build over minutes, not seconds.
- Account diversity: Count unique accounts, not just tweet volume. 10K tweets from 100 accounts is suspicious; 10K tweets from 8K accounts is organic.
- Account age & quality: Weigh tweets from established accounts higher. New accounts (<7 days) and accounts with spam signals get reduced weight.
- Content analysis: ML classifier detects coordinated inauthentic behavior — similar text patterns, same timing, network graph clustering.
- Evergreen blocklist: Common everyday phrases (#love, #goodmorning, #tbt) are excluded from trending even if they spike.
Media Service
~30% of tweets contain images or video. The media pipeline must handle high-throughput uploads, transcoding, and low-latency delivery.
Upload Flow
1. Client requests upload URL:
POST /api/media/upload/init → returns { media_id, upload_url (S3 presigned) }
2. Client uploads directly to S3 (bypasses our servers entirely):
PUT <upload_url> body: <binary image/video data>
3. S3 triggers Lambda/worker for post-processing:
- Images: resize to standard dimensions (150x150 thumb, 680px medium, original)
- Video: transcode to multiple bitrates (360p, 720p, 1080p) via FFmpeg
- Generate blurhash placeholder for progressive loading
- Store all variants back in S3
4. Media service marks media_id as "ready"
5. Client includes media_id in tweet creation:
POST /api/tweets body: { text: "...", media_ids: ["abc123"] }
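Step 1 is essentially a thin wrapper around S3 presigned URLs. A sketch using boto3, with an assumed bucket name and a hypothetical new_media_id() generator:

import boto3

s3 = boto3.client("s3")
MEDIA_BUCKET = "twitter-media-uploads"   # assumed bucket name

def init_media_upload(user_id):
    """Return a media_id plus a presigned PUT URL the client uploads to directly."""
    media_id = new_media_id()            # hypothetical (e.g., another Snowflake ID)
    upload_url = s3.generate_presigned_url(
        ClientMethod="put_object",
        Params={"Bucket": MEDIA_BUCKET, "Key": f"raw/{user_id}/{media_id}"},
        ExpiresIn=900,                   # presigned URL valid for 15 minutes
    )
    return {"media_id": media_id, "upload_url": upload_url}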
Delivery via CDN
- CDN: CloudFront / Akamai with edge caches in 200+ cities. Popular images are served from edge with <10ms latency.
- URL structure: https://pbs.twimg.com/media/{media_id}?format=jpg&name=medium
- Cache hit ratio: ~95% for images (popular tweets get millions of views). Video caching is lower (~70%) due to size.
- Cost optimization: Cold media (>30 days, low access) moved to S3 Infrequent Access tier. Glacier for media older than 1 year.
Notification Service
Notifications keep users engaged by alerting them to interactions: mentions, retweets, likes, replies, and new followers.
Notification Types & Priorities
| Type | Trigger | Priority | Delivery |
|---|---|---|---|
| Mention (@user) | Tweet text parsing | High | Push + In-app |
| Reply | Tweet with reply_to_id | High | Push + In-app |
| Retweet | Retweet action | Medium | Push + In-app |
| Like | Like action | Low | In-app only (batched) |
| New follower | Follow action | Low | In-app (batched) |
| Trending topic | Personalized trend match | Low | Push (opt-in) |
Architecture
Event Sources (Kafka topics)
│
▼
Notification Router
├── Priority queue (high/medium/low)
├── Deduplication (don't notify same event twice)
├── User preference check (muted? notification settings?)
└── Rate limiting (max 50 push notifications/hour per user)
│
├──▶ Push Notification Service (APNs / FCM)
├──▶ In-App Notification Store (Cassandra)
└──▶ Email Digest Service (batched, daily/weekly)
Celebrity notification problem: When a tweet by @KatyPerry (100M+ followers) gets liked, we don’t send 100M notifications. Instead, likes are aggregated: “@user1, @user2, and 47,382 others liked your tweet.” The aggregation happens in the notification store, not in delivery.
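A sketch of that aggregation step, using Redis for brevity (the design above puts the in-app store in Cassandra); keys and field names here are illustrative:

def record_like_notification(redis, author_id, tweet_id, liker_id):
    """Fold each like into one aggregate notification instead of one row per liker."""
    key = f"notif:likes:{author_id}:{tweet_id}"
    count = redis.hincrby(key, "count", 1)
    if count <= 3:
        # Keep a few sample actors for the "@a, @b and N others liked..." rendering
        redis.rpush(f"{key}:actors", liker_id)
    redis.expire(key, 7 * 86_400)
    return count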
Rate Limiting
Rate limiting protects the system from abuse, runaway clients, and denial-of-service attacks. Twitter applies limits at multiple layers:
Rate Limit Tiers
| Endpoint | Limit | Window | Rationale |
|---|---|---|---|
| POST /tweets | 300 tweets | 3 hours | Prevents spam bots from flooding |
| POST /likes | 1,000 likes | 24 hours | Prevents artificial engagement inflation |
| POST /follows | 400 follows | 24 hours | Prevents mass follow-unfollow schemes |
| GET /home_timeline | 150 requests | 15 minutes | Prevents scraping, protects Redis cache |
| GET /search | 180 requests | 15 minutes | Protects Elasticsearch cluster |
| App-level (OAuth) | 300 requests | 15 minutes | Per-app rate limit across all users |
Implementation
Rate limiting is implemented at the API Gateway using a sliding window counter backed by Redis:
-- Sliding window rate limit check (Lua script for atomicity)
local key = KEYS[1] -- e.g., "ratelimit:tweets:user:123"
local window = tonumber(ARGV[1]) -- e.g., 10800 (3 hours in seconds)
local max_requests = tonumber(ARGV[2]) -- e.g., 300
local now = tonumber(ARGV[3]) -- current timestamp
-- Remove entries outside the window
redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
-- Count requests in current window
local count = redis.call('ZCARD', key)
if count < max_requests then
-- Allow: add this request
redis.call('ZADD', key, now, now .. ':' .. math.random(1000000))
redis.call('EXPIRE', key, window)
return {1, max_requests - count - 1} -- allowed, remaining
else
return {0, 0} -- denied, 0 remaining
end
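Calling the script from the gateway could look like this with redis-py, assuming the Lua body above is stored in a RATE_LIMIT_LUA string:

import time
import redis

r = redis.Redis()
rate_limiter = r.register_script(RATE_LIMIT_LUA)   # RATE_LIMIT_LUA = the Lua script above

def check_tweet_rate_limit(user_id, window_s=10_800, max_requests=300):
    allowed, remaining = rate_limiter(
        keys=[f"ratelimit:tweets:user:{user_id}"],
        args=[window_s, max_requests, int(time.time())],
    )
    return bool(allowed), int(remaining)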
Rate limit responses include standard headers:
HTTP/1.1 429 Too Many Requests
X-Rate-Limit-Limit: 300
X-Rate-Limit-Remaining: 0
X-Rate-Limit-Reset: 1713456789
Retry-After: 1823
Celebrity Tweet Handling
Celebrities are the hardest scaling problem on Twitter. An account with 50M followers breaks the fan-out model. Here’s how the system handles it end-to-end:
Dynamic Celebrity Classification
User Classification Logic:
if follower_count < 10,000:
user_tier = "normal" → Fan-out on write
elif follower_count < 1,000,000:
user_tier = "popular" → Fan-out on write with async batching
else:
user_tier = "celebrity" → Fan-out on read only
-- Classification is updated hourly via a background job
-- Tier changes trigger timeline cache migration:
-- normal → celebrity: stop pushing, add to celebrity index
-- celebrity → normal: start pushing, backfill timelines
Celebrity Tweet Cache
Celebrity tweets are stored in a dedicated cache for fast retrieval during timeline merge:
Key: celebrity_tweets:{user_id}
Type: Sorted Set (ZSET)
Score: Snowflake ID
Member: tweet_id
-- Elon Musk tweets → stored here, NOT pushed to 150M timeline caches
ZADD celebrity_tweets:44196397 1780099887766554433 "1780099887766554433"
-- When follower 123 loads timeline:
-- 1. Read their pre-built cache (normal tweets)
-- 2. For each celebrity they follow, ZREVRANGE celebrity_tweets:{celeb_id}
-- 3. Merge all results by Snowflake ID
This trades a small read-time cost (~3–5 extra Redis calls per timeline read) for eliminating millions of cache writes per celebrity tweet.
Engagement Counts at Scale
A viral celebrity tweet can accumulate millions of likes/retweets per minute. Updating the like_count column directly would create a hot row. Instead:
- Count buffering: Likes are accumulated in Redis (INCR tweet_likes:<tweet_id>) and flushed to the database in batches every 30 seconds.
- Approximate display: The UI shows “1.2M likes” (approximate) rather than “1,234,567 likes” (exact), allowing eventual consistency.
- Sharded counters: For extremely hot tweets, the counter is sharded across multiple Redis keys (tweet_likes:<tweet_id>:shard_0 through :shard_15) and summed on read (see the sketch below).
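A sketch of the sharded-counter pattern with 16 shards and a redis-py client; increments scatter, reads sum:

import random

LIKE_SHARDS = 16

def increment_like(redis, tweet_id):
    # Spread increments across shards so no single Redis key becomes hot
    shard = random.randrange(LIKE_SHARDS)
    redis.incr(f"tweet_likes:{tweet_id}:shard_{shard}")

def get_like_count(redis, tweet_id):
    # Sum all shards on read; one MGET keeps it a single round trip
    keys = [f"tweet_likes:{tweet_id}:shard_{i}" for i in range(LIKE_SHARDS)]
    return sum(int(v) for v in redis.mget(keys) if v is not None)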
User Timeline
Unlike the home timeline, the user timeline is straightforward: it shows all tweets by a single user, in reverse chronological order. This is a simpler problem:
GET /api/users/456/tweets?cursor=<last_tweet_id>&count=20
Option A: Query the secondary index table
SELECT tweet_id FROM user_tweets
WHERE user_id = 456
AND tweet_id < <cursor>
ORDER BY tweet_id DESC
LIMIT 20;
Option B: Redis sorted set (for hot profiles)
ZREVRANGEBYSCORE user_tweets:456 (<cursor> -inf LIMIT 0 20
For frequently accessed profiles (celebrities, trending accounts), the user timeline is cached in a Redis sorted set identical to the celebrity tweet cache. For long-tail profiles, the database secondary index is sufficient.
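For the Redis path, cursor pagination falls out of the exclusive range syntax. A sketch with redis-py (the "(" prefix makes the cursor bound exclusive):

def get_user_timeline(redis, user_id, cursor=None, count=20):
    max_score = f"({cursor}" if cursor else "+inf"
    return redis.zrevrangebyscore(
        f"user_tweets:{user_id}", max_score, "-inf", start=0, num=count
    )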
Infrastructure & Data Flow Summary
Kafka Topic Architecture
Kafka is the nervous system that connects all services. Every significant event flows through Kafka for decoupled, asynchronous processing:
Topic: new-tweets (partitioned by user_id, 256 partitions)
Consumers: fan-out-service, search-indexer, trending-pipeline,
notification-router, analytics-pipeline
Topic: engagements (partitioned by tweet_id, 128 partitions)
Consumers: counter-service, notification-router, ranking-pipeline
Topic: social-graph (partitioned by user_id, 64 partitions)
Consumers: timeline-backfill, notification-router, recommendation-engine
Topic: user-activity (partitioned by user_id, 128 partitions)
Consumers: analytics-pipeline, ad-targeting, personalization
Complete Write Path: Posting a Tweet
User taps "Tweet" on mobile app
│
▼
API Gateway (rate limit check, auth)
│
▼
Tweet Service
├── Generate Snowflake ID
├── Validate text (280 char limit, spam check)
├── Write to Tweet Store (MySQL/Cassandra)
├── Write to user_tweets secondary index
├── Update user's tweet_count
└── Publish to Kafka "new-tweets" topic
│
├──▶ Fan-out Service
│ ├── Check user tier (normal/popular/celebrity)
│ ├── If normal: get follower list, ZADD to each timeline cache
│ ├── If celebrity: ZADD to celebrity_tweets:{user_id} only
│ └── Async, batched, with backpressure
│
├──▶ Search Indexer
│ ├── Earlybird: index in-memory (real-time, <1 sec)
│ └── Elasticsearch: bulk index (near-real-time, <10 sec)
│
├──▶ Trending Pipeline (Flink)
│ ├── Extract hashtags
│ ├── Update sliding window counters
│ └── Recompute trend scores every 30 seconds
│
├──▶ Notification Router
│ ├── Parse @mentions → notify mentioned users
│ ├── If reply → notify parent tweet author
│ └── Push via APNs/FCM
│
└──▶ Analytics Pipeline
├── Tweet volume metrics
├── User engagement tracking
└── Ad impression attribution
Complete Read Path: Loading Home Timeline
User opens Twitter app
│
▼
API Gateway → Timeline Service
│
├── Step 1: Read pre-built timeline from Redis
│ ZREVRANGEBYSCORE timeline:123 +inf -inf LIMIT 0 40
│ Result: [tweet_id_1, tweet_id_2, ..., tweet_id_40]
│
├── Step 2: Get celebrity followees
│ SMEMBERS celebrity_followees:123
│ Result: [celeb_A, celeb_B, celeb_C]
│
├── Step 3: Pull celebrity tweets
│ For each celeb: ZREVRANGEBYSCORE celebrity_tweets:{celeb} +inf -inf LIMIT 0 20
│ Result: [celeb_tweet_1, celeb_tweet_2, ...]
│
├── Step 4: Merge & sort all tweet_ids by Snowflake ID (descending)
│ Take top 20
│
├── Step 5: Hydrate tweets (batch fetch)
│ MGET tweet:id1 tweet:id2 ... tweet:id20
│ Cache miss? → Query Tweet Store → populate cache
│
├── Step 6: Hydrate user profiles (batch fetch)
│ MGET user:uid1 user:uid2 ...
│
├── Step 7: (Optional) Apply ranking model
│ ML model scores tweets by predicted engagement
│ Re-sort by score instead of chronological
│
└── Return JSON response with tweets, user objects, cursor
Total latency target: <200ms (p99)
Failure Handling & Reliability
Key Failure Scenarios
| Failure | Impact | Mitigation |
|---|---|---|
| Redis timeline cache node dies | Affected users see empty/stale timeline | Replica takes over. Cold-start rebuild from Tweet Store + Fan-out replay from Kafka. |
| Kafka broker goes down | Fan-out, search indexing delayed | Kafka replication factor 3. Consumers resume from last committed offset. No data loss. |
| Tweet Store shard unreachable | Writes fail for affected partition | Circuit breaker returns 503. Retries with exponential backoff. Cross-region failover. |
| Elasticsearch cluster degraded | Search latency increases or partial results | Reduce replica count, shed load to Earlybird-only results. Graceful degradation. |
| Fan-out service backlog | Timelines are stale (tweets appear late) | Backpressure via Kafka lag monitoring. Scale fan-out workers. Celebrity threshold lowered temporarily. |
| Snowflake worker clock skew | Potential ID collision or ordering issues | Worker refuses to generate IDs if clock goes backward. Alert + NTP resync. |
Graceful Degradation Strategy
When the system is under extreme load (e.g., Super Bowl, breaking news), degrade gracefully instead of failing entirely:
- Level 1 — Disable non-essential features: Turn off “Who to follow” suggestions, personalized trends, and ad injection.
- Level 2 — Reduce timeline freshness: Serve cached timelines even if slightly stale. Skip celebrity tweet merge.
- Level 3 — Switch to pull-only: Temporarily disable fan-out on write entirely. All timelines served via fan-out on read (higher latency but lower write load).
- Level 4 — Read-only mode: Stop accepting new tweets. Serve existing content only. Nuclear option for survival.
Summary & Key Takeaways
Designing Twitter at scale reveals several fundamental distributed systems patterns:
| Design Decision | Pattern | Why It Matters |
|---|---|---|
| Hybrid fan-out | Push for normal, pull for celebrities | Balances write amplification vs read latency |
| Snowflake IDs | Distributed ID generation without coordination | Time-sortable, globally unique, no bottleneck |
| Timeline cache in Redis | Pre-computed materialized views | Trades storage for sub-200ms reads |
| Two-tier search | Earlybird (real-time) + ES (historical) | Freshness for trending, depth for historical |
| Trending via acceleration | Sliding window with baseline comparison | Detects spikes, not just high volume |
| Kafka as event bus | Async, decoupled event processing | One write triggers fan-out, search, trending, notifications |
| Sharded counters | Distributed counting for hot keys | Avoids hot rows for viral content |
The core lesson: there is no single “best” approach. Fan-out on write is perfect for 99% of users but catastrophic for celebrities. Fan-out on read works universally but can’t meet latency targets at scale. The hybrid approach acknowledges that different users have fundamentally different characteristics and optimizes accordingly.