High Level Design Series · Real-World Designs · Post 47 of 70

Design: Unique ID Generator

The Problem

Every distributed system needs a way to uniquely identify entities — users, orders, messages, transactions, events. In a single-server world you'd just use an auto-incrementing integer. But the moment you add a second server, that approach breaks: two servers could issue the same ID at the same time.

The unique ID generator is one of the most deceptively simple system design problems. The naive solutions are easy to spot, but the optimal solution — Twitter's Snowflake ID — is a masterclass in using bit-level encoding to satisfy multiple requirements simultaneously with zero coordination between servers.

Interview tip: This problem is a favorite in system design interviews because it tests your understanding of distributed systems, bit manipulation, clock synchronization, and the trade-offs between coordination and independence.

Requirements

Functional Requirements

  - IDs must be globally unique across all servers
  - IDs must be numeric only
  - IDs must fit in 64 bits
  - IDs must be sortable by creation time

Non-Functional Requirements

  - Throughput: at least 10,000 IDs per second per server
  - High availability: ID generation must not be a single point of failure
  - Low latency: no network round trip (i.e. no coordination) on the generation path

Back-of-the-Envelope Estimation

With 10,000 IDs/sec per server and, say, 100 servers:

  - System-wide rate: 10,000 × 100 = 1,000,000 IDs/sec
  - Per day: 1,000,000 × 86,400 ≈ 86.4 billion IDs/day
  - A 64-bit ID space (~9.2 × 10^18 positive values) absorbs this rate for roughly 290,000 years

Approach 1: UUID

The most obvious answer: use a Universally Unique Identifier (UUID). A UUID is a 128-bit number, typically represented as a 36-character hex string:

550e8400-e29b-41d4-a716-446655440000
   │       │    │    │       │
time_lo   mid   hi   clk    node    (for v1)
random  random random random random (for v4)

UUIDv4 (the most common variant) is generated from 122 random bits. The probability of a collision among 10^9 (one billion) UUIDs is approximately 10^-19 — essentially zero.
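Most languages can generate v4 UUIDs out of the box. In Python, for example, the standard library's uuid module:

```python
import uuid

# Generate a v4 (random) UUID and inspect its properties
u = uuid.uuid4()
print(u)                          # e.g. '3f9c5e1e-...' (random each run)
print(u.version)                  # 4
print(len(str(u)))                # 36 characters: 32 hex digits + 4 dashes
print(u.int.bit_length() <= 128)  # the underlying integer fits in 128 bits
```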

Why UUID Doesn't Meet Our Requirements

| Requirement | UUID |
| --- | --- |
| Globally unique | ✅ Yes (practically zero collision probability) |
| 64-bit numeric | ❌ 128 bits, not 64 |
| Sortable by time | ❌ UUIDv4 is completely random |
| Numeric only | ❌ Contains hex characters and dashes |
| No coordination | ✅ Each server generates independently |

UUIDv4 is a fine general-purpose identifier, but it fails our requirements on three counts. Also, 128 bits wastes storage and makes indexing slower — B-tree index nodes hold fewer keys, leading to more disk I/O for range scans.

UUIDv7 (2022 standard): The newer UUIDv7 embeds a Unix timestamp in the most significant bits, making it time-sortable. However, it's still 128 bits, so it doesn't fit our 64-bit constraint. It's worth mentioning in an interview, though.

Approach 2: Database Auto-Increment

The simplest distributed approach: use a single database server with an auto-incrementing primary key.

CREATE TABLE id_generator (
    id BIGINT AUTO_INCREMENT PRIMARY KEY,
    stub CHAR(1) NOT NULL DEFAULT 'x',
    UNIQUE KEY stub (stub)
);

-- Generate a new ID:
REPLACE INTO id_generator (stub) VALUES ('x');
SELECT LAST_INSERT_ID();

This is essentially what Flickr used in their early days (the "ticket server" approach). The REPLACE INTO trick avoids ever growing the table beyond one row while still incrementing the counter.

Problems

| Requirement | DB Auto-Increment |
| --- | --- |
| Globally unique | ✅ Yes (serial counter) |
| 64-bit numeric | ✅ Yes (BIGINT) |
| Sortable by time | ✅ Perfectly sortable |
| No coordination | ❌ All servers must coordinate through one DB |
| High availability | ❌ Single point of failure |

Approach 3: Multi-Master Replication

To address the single-point-of-failure problem, use N database servers, each incrementing by N instead of by 1.

-- Server 1 (of 3): generates 1, 4, 7, 10, 13, ...
SET @@auto_increment_increment = 3;
SET @@auto_increment_offset = 1;

-- Server 2 (of 3): generates 2, 5, 8, 11, 14, ...
SET @@auto_increment_increment = 3;
SET @@auto_increment_offset = 2;

-- Server 3 (of 3): generates 3, 6, 9, 12, 15, ...
SET @@auto_increment_increment = 3;
SET @@auto_increment_offset = 3;

With N=3 servers, IDs from server k are: k, k+N, k+2N, k+3N, …
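The offset scheme is easy to simulate. This sketch (plain Python, no database) shows why uniqueness holds and why cross-server time-ordering does not:

```python
N = 3  # number of "servers" (auto_increment_increment)

def server_ids(offset, count):
    # IDs issued by the server configured with auto_increment_offset = offset
    return [offset + i * N for i in range(count)]

all_ids = [i for k in range(1, N + 1) for i in server_ids(k, 5)]

# Uniqueness: the offsets partition the integers mod N, so no collisions
assert len(set(all_ids)) == len(all_ids)

# But an ID's magnitude reflects its server's issue count, not wall-clock
# time, so IDs are only ordered per-server, not globally by time.
print(sorted(all_ids))  # [1, 2, 3, ..., 15]
```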

Problems

  - Scaling is rigid: adding or removing a server means reconfiguring the increment and offset on every server
  - IDs are only ordered per server; across servers they do not reflect creation time
  - Throughput is still bounded by database write performance

The fundamental insight is that all three approaches above share a common flaw: they require coordination, either through a central authority or through shared configuration. Snowflake's breakthrough is eliminating coordination entirely by encoding identity and time directly into the ID bits.

Approach 4: Twitter Snowflake (The Solution)

In 2010, Twitter engineers faced this exact problem: they needed to generate hundreds of thousands of unique IDs per second across a distributed fleet, with IDs that were 64-bit, time-sortable, and generated with zero coordination. Their solution — Snowflake — became the gold standard for distributed ID generation.

The genius of Snowflake is its bit-packing strategy: it divides a 64-bit integer into four fields, each encoding a different piece of information:

┌─────┬───────────────────────────────────────────┬──────────┬──────────┬────────────────┐
│  0  │            41 bits: timestamp              │  5 bits  │  5 bits  │   12 bits      │
│sign │        (ms since custom epoch)             │datacenter│ machine  │   sequence     │
└─────┴───────────────────────────────────────────┴──────────┴──────────┴────────────────┘
 bit 63                                                                              bit 0

Bit-by-Bit Breakdown

Bit 63: Sign Bit (1 bit)

Always 0. This ensures the ID is always a positive signed 64-bit integer. Languages like Java use signed longs, so this avoids negative-number confusion.

Bits 62–22: Timestamp (41 bits)

Milliseconds since a custom epoch. Twitter's epoch is November 4, 2010 01:42:54 UTC (Unix timestamp 1288834974657).

Maximum value: 2^41 - 1 = 2,199,023,255,551 ms
Convert to years: 2,199,023,255,551 / (1000 × 60 × 60 × 24 × 365.25) ≈ 69.7 years

Twitter's epoch: Nov 4, 2010
Expiry:          ~July 2080

If you use Unix epoch (Jan 1, 1970):
Expiry:          ~September 2039  ← much sooner!

Why a custom epoch? Using the Unix epoch wastes 40 years of timestamp space (1970–2010). By choosing an epoch close to your system's launch date, you maximize the usable range: you get the same ~69 years of headroom, but starting from when your system went live.
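The epoch arithmetic above is easy to verify. A small sketch (the expiry helper is ours, not part of any library):

```python
from datetime import datetime, timedelta, timezone

LIFETIME_MS = (1 << 41) - 1  # 2,199,023,255,551 ms of 41-bit headroom

def expiry(epoch_ms):
    # When does a 41-bit millisecond counter starting at epoch_ms overflow?
    start = datetime.fromtimestamp(epoch_ms / 1000, tz=timezone.utc)
    return start + timedelta(milliseconds=LIFETIME_MS)

print(expiry(0).date())              # Unix epoch: runs out in 2039
print(expiry(1288834974657).date())  # Twitter epoch: runs out in 2080
```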

Bits 21–17: Datacenter ID (5 bits)

Supports 2^5 = 32 data centers. This is assigned at deployment time and never changes.

Bits 16–12: Machine ID (5 bits)

Supports 2^5 = 32 machines per data center. Combined with datacenter ID: 32 × 32 = 1,024 unique worker instances.

Bits 11–0: Sequence Number (12 bits)

A per-machine counter that increments for each ID generated within the same millisecond. Supports 2^12 = 4,096 IDs per millisecond per machine.

4,096 IDs/ms × 1,000 ms/s = 4,096,000 IDs per second per machine
With 1,024 machines: 4,096,000 × 1,024 ≈ 4.19 billion IDs per second (total)

Generation Algorithm

import time

def current_time_ms():
    return int(time.time() * 1000)

class ClockMovedBackwardsError(Exception):
    pass

class SnowflakeGenerator:
    EPOCH = 1288834974657  # Twitter epoch: Nov 4, 2010

    DATACENTER_BITS = 5
    MACHINE_BITS    = 5
    SEQUENCE_BITS   = 12

    MAX_DATACENTER  = (1 << DATACENTER_BITS) - 1  # 31
    MAX_MACHINE     = (1 << MACHINE_BITS) - 1     # 31
    MAX_SEQUENCE    = (1 << SEQUENCE_BITS) - 1    # 4095

    # Shift amounts fall out of the layout: sequence is lowest,
    # then machine, datacenter, timestamp
    MACHINE_SHIFT    = SEQUENCE_BITS                                   # 12
    DATACENTER_SHIFT = SEQUENCE_BITS + MACHINE_BITS                    # 17
    TIMESTAMP_SHIFT  = SEQUENCE_BITS + MACHINE_BITS + DATACENTER_BITS  # 22

    def __init__(self, datacenter_id, machine_id):
        assert 0 <= datacenter_id <= self.MAX_DATACENTER
        assert 0 <= machine_id <= self.MAX_MACHINE
        self.datacenter_id  = datacenter_id
        self.machine_id     = machine_id
        self.sequence       = 0
        self.last_timestamp = -1

    def generate(self):
        timestamp = current_time_ms() - self.EPOCH

        # Detect a backwards clock BEFORE touching any state; otherwise a
        # reset sequence could collide once the clock catches up
        if timestamp < self.last_timestamp:
            raise ClockMovedBackwardsError(
                f"Clock moved backwards by "
                f"{self.last_timestamp - timestamp}ms"
            )

        if timestamp == self.last_timestamp:
            # Same millisecond: increment sequence
            self.sequence = (self.sequence + 1) & self.MAX_SEQUENCE
            if self.sequence == 0:
                # Sequence exhausted — wait for next millisecond
                # (wait_next_ms is defined under "Sequence Exhaustion" below)
                timestamp = wait_next_ms(self.last_timestamp)
        else:
            # New millisecond: reset sequence
            self.sequence = 0

        self.last_timestamp = timestamp

        # Bit-pack the ID
        return ((timestamp          << self.TIMESTAMP_SHIFT)  |
                (self.datacenter_id << self.DATACENTER_SHIFT) |
                (self.machine_id    << self.MACHINE_SHIFT)    |
                self.sequence)

Concrete Example

Let's trace through a real ID generation:

Given:
  current_time  = 1713456000000 ms (April 18, 2024 16:00:00 UTC)
  custom_epoch  = 1288834974657 ms (Twitter epoch)
  datacenter_id = 7  (binary: 00111)
  machine_id    = 13 (binary: 01101)
  sequence      = 42 (binary: 000000101010)

Step 1: Compute timestamp offset
  timestamp = 1713456000000 - 1288834974657 = 424621025343 ms
  binary: 00110001011011101011000101010100000111111 (41 bits)

Step 2: Bit-pack
  [0][00110001011011101011000101010100000111111][00111][01101][000000101010]
   ↑                     ↑                         ↑      ↑         ↑
  sign            timestamp (41)                  DC(5) Mach(5)  Seq(12)

Step 3: As decimal
  id = (424621025343 << 22)
     | (7 << 17)
     | (13 << 12)
     | 42
     = 1,780,989,665,081,217,066

Verify sortability: any ID generated in the NEXT millisecond will have
timestamp = 424621025344, making it strictly larger regardless of
datacenter, machine, or sequence values.
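You can sanity-check this arithmetic in a few lines of Python. The pack and unpack helpers below are illustrative names, but the shifts and masks follow the 1+41+5+5+12 layout exactly:

```python
EPOCH = 1288834974657  # Twitter epoch, in ms

def pack(timestamp, dc, machine, seq):
    return (timestamp << 22) | (dc << 17) | (machine << 12) | seq

def unpack(snowflake_id):
    return (snowflake_id >> 22,            # timestamp
            (snowflake_id >> 17) & 0x1F,   # datacenter
            (snowflake_id >> 12) & 0x1F,   # machine
            snowflake_id & 0xFFF)          # sequence

ts = 1713456000000 - EPOCH                 # 424621025343
sid = pack(ts, 7, 13, 42)
assert unpack(sid) == (ts, 7, 13, 42)      # lossless round-trip
assert sid.bit_length() <= 63              # fits in a positive signed 64-bit int
# Sortability: any ID from the next millisecond is strictly larger
assert pack(ts + 1, 0, 0, 0) > pack(ts, 31, 31, 4095)
```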

▶ Snowflake ID Bit Layout

Step through each segment of a 64-bit Snowflake ID. Watch how timestamp, datacenter, machine, and sequence bits are packed, then see how IDs from different machines remain time-sortable.

Clock Synchronization & NTP

Snowflake's correctness depends on monotonically increasing timestamps. But clocks in distributed systems are notoriously unreliable. This is where NTP (Network Time Protocol) comes in.

How NTP Works

NTP synchronizes machine clocks with authoritative time servers in a hierarchy (strata):

  - Stratum 0: reference clocks (atomic clocks, GPS receivers)
  - Stratum 1: servers directly attached to stratum-0 hardware
  - Stratum 2 and below: servers that sync over the network from the stratum above

NTP achieves accuracy of 1-10ms in typical LAN environments and 10-100ms over the internet. For Snowflake, this is usually sufficient since the timestamp granularity is 1ms.

Clock Skew: The Danger

The nightmare scenario: NTP adjusts your clock backwards. If the current timestamp is suddenly less than last_timestamp, Snowflake would either:

  1. Generate duplicate IDs (if it reuses a past timestamp + sequence combination)
  2. Throw an error and stop (what Twitter's Snowflake actually does)

// From Twitter's Snowflake source (Scala):
if (timestamp < lastTimestamp) {
  log.error("clock is moving backwards. Rejecting requests "
    + "until %d.", lastTimestamp)
  throw new InvalidSystemClock(
    "Clock moved backwards. Refusing to generate id for "
    + "%d milliseconds".format(lastTimestamp - timestamp))
}

Mitigating Clock Skew

  - Run ntpd/chrony in slew mode, so adjustments are gradual rather than backwards step changes
  - On a small backwards jump, block until the clock catches up to last_timestamp
  - On a large jump, refuse to generate IDs and alert (Twitter's choice)
  - Monitor clock offset across the fleet and alarm on drift

Google Spanner's approach: Instead of a single timestamp, TrueTime returns an interval [earliest, latest]. Spanner waits out the uncertainty window before committing — guaranteeing external consistency. This is overkill for ID generation but fascinating for distributed transactions.

Instagram's Approach

Instagram needed time-sortable 64-bit IDs but wanted to use PostgreSQL rather than a separate Snowflake service. Their solution is a clever stored function that runs inside the database:

-- Instagram's ID generation (simplified):
-- 41 bits: timestamp (ms since Jan 1, 2011)
-- 13 bits: shard ID (logical shard, from 0 to 8191)
-- 10 bits: auto-incrementing sequence per shard per ms

CREATE OR REPLACE FUNCTION next_id(
    shard_id INT DEFAULT 0,
    OUT result BIGINT
) AS $$
DECLARE
    epoch BIGINT := 1293840000000;  -- Jan 1, 2011 UTC
    seq_id BIGINT;
    now_ms BIGINT;
BEGIN
    SELECT nextval('table_id_seq') % 1024 INTO seq_id;
    now_ms := (EXTRACT(EPOCH FROM clock_timestamp()) * 1000)::BIGINT - epoch;
    result := (now_ms << 23)
            | (shard_id << 10)
            | seq_id;
END;
$$ LANGUAGE plpgsql;

Key Differences from Snowflake

| Aspect | Twitter Snowflake | Instagram ID |
| --- | --- | --- |
| Where it runs | Standalone service (Thrift RPC) | Inside PostgreSQL (PL/pgSQL) |
| Timestamp bits | 41 | 41 |
| Location encoding | 5 DC + 5 machine = 1,024 workers | 13 shard bits = 8,192 shards |
| Sequence bits | 12 (4,096/ms/machine) | 10 (1,024/ms/shard) |
| Dependency | Separate Thrift service | PostgreSQL (already there) |
| Network call | Yes (to Snowflake service) | No (runs inside the DB) |

Instagram's approach is elegant for PostgreSQL-heavy architectures: you don't need a separate service, and the ID is generated atomically with the row insertion. The trade-off is that it's tightly coupled to PostgreSQL.
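Decoding an Instagram-style ID is the same shift-and-mask exercise as Snowflake, just with different field widths. A sketch (decode_instagram_id is an illustrative helper, not Instagram's actual code):

```python
INSTAGRAM_EPOCH = 1293840000000  # Jan 1, 2011 UTC, in ms

def decode_instagram_id(ig_id):
    # Layout: 41 bits timestamp | 13 bits shard | 10 bits sequence
    seq   = ig_id & 0x3FF            # low 10 bits
    shard = (ig_id >> 10) & 0x1FFF   # next 13 bits
    ts_ms = ig_id >> 23              # top 41 bits
    return ts_ms + INSTAGRAM_EPOCH, shard, seq

# Round-trip check with made-up field values
ig_id = (100000 << 23) | (5 << 10) | 7
assert decode_instagram_id(ig_id) == (100000 + INSTAGRAM_EPOCH, 5, 7)
```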

Sonyflake

Sony's Sonyflake is an alternative inspired by Snowflake, optimized for longer lifetimes at the cost of fewer machines and lower per-machine throughput:

┌───────────────────────────────────────────┬──────────────────┬────────────┐
│       39 bits: timestamp                  │   8 bits         │ 16 bits    │
│    (10ms units since custom epoch)        │   sequence       │ machine ID │
└───────────────────────────────────────────┴──────────────────┴────────────┘

Key Design Decisions

  - 39-bit timestamp in 10ms units: 2^39 × 10ms ≈ 174 years of lifetime
  - 16-bit machine ID: 65,536 machines (vs Snowflake's 1,024)
  - 8-bit sequence: 256 IDs per 10ms tick, i.e. 25.6K IDs/sec per machine

Sonyflake trades per-machine throughput for longer lifetime and more machines. It's ideal for IoT or microservice architectures with many small instances.
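Sonyflake's headline numbers follow directly from its bit widths:

```python
# Sonyflake capacity math (39-bit 10ms timestamp, 8-bit sequence, 16-bit machine ID)
lifetime_years = ((1 << 39) * 10) / (1000 * 60 * 60 * 24 * 365.25)
machines       = 1 << 16          # machine-ID space
ids_per_sec    = (1 << 8) * 100   # 256 IDs per 10ms tick, 100 ticks/sec

print(round(lifetime_years, 1))   # 174.2
print(machines)                   # 65536
print(ids_per_sec)                # 25600
```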

ULID (Universally Unique Lexicographically Sortable Identifier)

ULID is a 128-bit identifier (like UUID) that's time-sortable and string-sortable. While it doesn't meet our 64-bit requirement, it's widely used and worth understanding:

01ARZ3NDEK TSV4RRFFQ69G5FAV
└───┬────┘ └───────┬───────┘
 10 chars:      16 chars:
 timestamp      randomness
 (48 bits, ms)  (80 bits)

Structure: 128 bits total
  ┌────────────────────────────┬─────────────────────────────────┐
  │  48 bits: Unix timestamp   │      80 bits: randomness        │
  │  (ms, ~8,919 years)        │   (cryptographically secure)    │
  └────────────────────────────┴─────────────────────────────────┘
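Generating a ULID is simple enough to sketch by hand. This illustrative (not production-grade) implementation packs a 48-bit millisecond timestamp and 80 random bits, then encodes the 128 bits as 26 characters of Crockford's Base32:

```python
import os, time

# Crockford's Base32 alphabet (no I, L, O, U), as used by the ULID spec
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"

def ulid(timestamp_ms=None):
    ts = int(time.time() * 1000) if timestamp_ms is None else timestamp_ms
    value = (ts << 80) | int.from_bytes(os.urandom(10), "big")  # 48 + 80 bits
    # 26 chars × 5 bits = 130 bits; the top 2 bits are implicitly zero
    chars = []
    for shift in range(125, -1, -5):
        chars.append(CROCKFORD[(value >> shift) & 0x1F])
    return "".join(chars)

a = ulid(timestamp_ms=1)
b = ulid(timestamp_ms=2)
assert len(a) == 26
assert a < b   # later timestamp sorts lexicographically larger
```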

ULID Advantages

  - Lexicographically sortable as a string, with no decoding needed
  - Compact: 26 characters vs UUID's 36
  - Crockford Base32 is case-insensitive, URL-safe, and has no special characters
  - 80 random bits per millisecond (~1.2 × 10^24 possible IDs per ms)

ULID vs UUID

| Feature | UUIDv4 | UUIDv7 | ULID |
| --- | --- | --- | --- |
| Size | 128 bits | 128 bits | 128 bits |
| Time-sortable | ❌ | ✅ | ✅ |
| String representation | 36 chars (hex + dash) | 36 chars (hex + dash) | 26 chars (Base32) |
| String-sortable | ❌ | ✅ | ✅ |
| Collision resistance | 122 random bits | 62 random bits | 80 random bits |

Comprehensive Comparison

Here is the definitive comparison of all approaches discussed:

| Approach | Bits | Time-Sortable | Coordination | Throughput/Machine | Lifetime |
| --- | --- | --- | --- | --- | --- |
| UUIDv4 | 128 | ❌ Random | None | Unlimited | Infinite |
| DB Auto-Inc | 64 | ✅ Perfect | Central DB | ~10K/sec (DB limited) | 292 years |
| Multi-Master | 64 | ⚠️ Per-server only | Increment config | ~10K/sec (DB limited) | 292 years / N |
| Snowflake | 64 | ✅ ms-precision | None | 4.096M/sec | ~69 years |
| Instagram | 64 | ✅ ms-precision | None (inside PG) | 1,024/ms/shard | ~69 years |
| Sonyflake | 63 | ✅ 10ms-precision | None | 25.6K/sec | ~174 years |
| ULID | 128 | ✅ ms-precision | None | Unlimited (random) | ~8,919 years |

The winner for our requirements is Snowflake: 64-bit ✅, time-sortable ✅, no coordination ✅, ~4M IDs/sec per machine ✅. If you don't need 64-bit, ULID or UUIDv7 are excellent modern alternatives.

▶ ID Generation Flow: Multiple Servers

Watch three servers in different data centers independently generate unique Snowflake IDs with no coordination. Notice how IDs remain time-sortable across all machines.

Bit Manipulation Deep Dive

Let's walk through the exact bitwise operations used to construct and deconstruct a Snowflake ID.

Constructing an ID

// Given values:
timestamp     = 424621025343  // ms since epoch
datacenter_id = 7             // 0b00111
machine_id    = 13            // 0b01101
sequence      = 42            // 0b000000101010

// Step 1: Shift timestamp left by 22 bits
//   (to make room for 5+5+12 = 22 bits of DC+machine+seq)
timestamp << 22
= 424621025343 << 22
= 424621025343 × 4194304
= 1780989665080246272

In binary:
  0|00110001011011101011000101010100000111111|0000000000000000000000
                 41-bit timestamp              22 zero-bits (room for rest)

// Step 2: Shift datacenter left by 17 bits (room for 5+12)
datacenter_id << 17
= 7 << 17
= 917504

In binary:
  000000000000000000000000000000000000000000|00111|00000000000000000
                                              DC    17 zero-bits

// Step 3: Shift machine left by 12 bits (room for 12)
machine_id << 12
= 13 << 12
= 53248

In binary:
  00000000000000000000000000000000000000000000000|01101|000000000000
                                                   Mach  12 zero-bits

// Step 4: OR everything together
id = (timestamp << 22) | (datacenter_id << 17) | (machine_id << 12) | sequence
   = 1780989665080246272 | 917504 | 53248 | 42
   = 1780989665081217066

Binary result:
  0|00110001011011101011000101010100000111111|00111|01101|000000101010
  S                timestamp                    DC   Mach    Sequence

Deconstructing an ID

// Given: id = 1780989665081217066

// Extract sequence (lowest 12 bits)
sequence = id & 0xFFF              // = id & 4095 = 42

// Extract machine ID (bits 12-16)
machine = (id >> 12) & 0x1F        // = (id >> 12) & 31 = 13

// Extract datacenter ID (bits 17-21)
datacenter = (id >> 17) & 0x1F     // = (id >> 17) & 31 = 7

// Extract timestamp (bits 22-62)
timestamp = (id >> 22) & 0x1FFFFFFFFFF  // = 424621025343

// Convert back to absolute time
absolute_time = timestamp + EPOCH
              = 424621025343 + 1288834974657
              = 1713456000000
              // = April 18, 2024 16:00:00 UTC ✓

This ability to extract metadata from the ID itself is incredibly powerful. Given any Snowflake ID, you can determine:

  - When it was created, to the millisecond
  - Which datacenter and machine generated it
  - How loaded that machine was, via the sequence number within the millisecond

Bitmask Reference

Component     Bits    Mask (hex)            Mask (decimal)
──────────────────────────────────────────────────────────
Sequence      0–11    0x0000000000000FFF    4,095
Machine ID    12–16   0x000000000001F000    126,976    (after >>12: 0x1F = 31)
Datacenter    17–21   0x00000000003E0000    4,063,232  (after >>17: 0x1F = 31)
Timestamp     22–62   0x7FFFFFFFFFC00000               (after >>22: 0x1FFFFFFFFFF)
Sign bit      63      0x8000000000000000

Production Considerations

Machine ID Assignment

How do you assign unique machine IDs to each server? Several strategies:

  - Static configuration: set the ID in a config file or environment variable at deploy time (simple, but error-prone at scale)
  - Coordination service: claim an ID from ZooKeeper or etcd at startup, e.g. via an ephemeral sequential node
  - Network-derived: hash the host's IP or MAC address into the available bits (watch for collisions)

Sequence Exhaustion

When 4,096 IDs are generated in the same millisecond (sequence wraps to 0), Snowflake blocks until the next millisecond. This is a spin-wait:

def wait_next_ms(last_ts):
    ts = current_time_ms() - EPOCH
    while ts <= last_ts:
        ts = current_time_ms() - EPOCH
    return ts

# Worst case: wait up to 1ms (typically microseconds)
# At 4,096 IDs/ms, this means sustained throughput
# of 4,096,000 IDs/sec per machine

Epoch Selection

Choose your custom epoch carefully:

  - Set it near your system's launch date to maximize the ~69-year lifetime
  - Never change it once IDs have been issued; existing IDs would decode to the wrong times and sort incorrectly against new ones
  - Document it prominently, since anyone decoding IDs needs it

ID as a Debugging Tool

Because Snowflake IDs encode metadata, they're invaluable for debugging:

# "When was this order created?"
order_id = 1780989665081217066
created_at = extract_timestamp(order_id) + EPOCH
# → April 18, 2024 16:00:00 UTC

# "Which server processed it?"
dc = extract_datacenter(order_id)    # → 7
machine = extract_machine(order_id)  # → 13
# → Datacenter 7, Machine 13

# "Was it generated under high load?"
seq = extract_sequence(order_id)     # → 42
# → 43rd ID in that millisecond (0-indexed). If seq > 3000,
#   this machine was under heavy load.
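The extract_* helpers used above aren't library functions; a minimal sketch, consistent with the bitmask reference earlier:

```python
TIMESTAMP_MASK = (1 << 41) - 1  # 0x1FFFFFFFFFF

def extract_timestamp(snowflake_id):
    return (snowflake_id >> 22) & TIMESTAMP_MASK

def extract_datacenter(snowflake_id):
    return (snowflake_id >> 17) & 0x1F

def extract_machine(snowflake_id):
    return (snowflake_id >> 12) & 0x1F

def extract_sequence(snowflake_id):
    return snowflake_id & 0xFFF

# Check against the worked example from earlier in the post
order_id = 1780989665081217066
assert extract_datacenter(order_id) == 7
assert extract_machine(order_id) == 13
assert extract_sequence(order_id) == 42
```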

Customizing the Bit Layout

Snowflake's 1+41+5+5+12 layout is not sacred. You can reallocate bits based on your needs:

| Scenario | Layout | Trade-off |
| --- | --- | --- |
| More machines | 1+41+13+9 | 8,192 machines but only 512 IDs/ms/machine |
| Longer lifetime | 1+45+8+10 | ~1,114 years but only 256 machines, 1,024/ms |
| Higher throughput | 1+41+6+16 | 65,536 IDs/ms/machine but only 64 machines |
| Single datacenter | 1+41+10+12 | 1,024 machines, 4,096/ms — no DC bits needed |

The constraint is always: 1 + timestamp + location + sequence = 64 bits. Every bit you give to one field is taken from another.
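That budget is easy to explore programmatically. The layout_capacity function below is a hypothetical helper, with "location" covering whatever mix of datacenter/machine/shard bits you choose:

```python
def layout_capacity(ts_bits, location_bits, seq_bits):
    # Sign bit plus the three fields must fill exactly 64 bits
    assert 1 + ts_bits + location_bits + seq_bits == 64, "must total 64 bits"
    return {
        "lifetime_years": ((1 << ts_bits) - 1) / (1000 * 60 * 60 * 24 * 365.25),
        "machines": 1 << location_bits,
        "ids_per_ms_per_machine": 1 << seq_bits,
    }

print(layout_capacity(41, 10, 12))  # classic Snowflake (5 DC + 5 machine bits)
print(layout_capacity(45, 8, 10))   # longer lifetime: ~1,114 years, 256 machines
```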

When to Use What

Use Snowflake when:

  - You need 64-bit, time-sortable, numeric IDs at high throughput
  - You can assign datacenter/machine IDs and run NTP across the fleet

Use UUIDv4 when:

  - You only need uniqueness and don't care about size or sort order
  - You want zero infrastructure and zero configuration

Use ULID when:

  - 128 bits is acceptable and you want string-sortable, URL-friendly IDs

Use Instagram's approach when:

  - Your stack is already PostgreSQL-centric and sharded at the database layer
  - You'd rather not run a separate ID service

Use Sonyflake when:

  - You have many small instances (IoT, microservices) with modest per-machine throughput needs
  - You want a longer lifetime (~174 years) out of the box

Interview Walkthrough

Here's the optimal structure for answering this problem in a 35-minute system design interview:

Minutes 0–5: Clarify Requirements

  - Confirm the constraints: 64-bit? Numeric only? Time-sortable? Target throughput?
  - State a scale estimate (e.g. 10K IDs/sec per server)

Minutes 5–10: Enumerate Approaches

  - UUID, DB auto-increment, multi-master; explain which requirement each one fails

Minutes 10–25: Deep Dive into Snowflake

  - Draw the 1+41+5+5+12 bit layout and justify each field's width
  - Walk through the generation algorithm, including sequence overflow and clock checks

Minutes 25–35: Production Considerations

  - Clock skew and NTP, machine ID assignment, custom epoch choice, tuning the bit layout

Interviewer bonus points: Extract metadata from a raw ID to prove you understand the bit manipulation. "Given ID 1780989665081217066, this was generated on April 18, 2024 at datacenter 7, machine 13, with sequence number 42." This demonstrates real fluency.

Key Takeaways

  1. Bit-packing eliminates coordination. By encoding time, location, and sequence into the ID itself, each machine can generate IDs independently with zero network calls
  2. Time in the high-order bits ensures sortability. Since the timestamp occupies the most significant bits, IDs are naturally ordered by time regardless of which machine generated them
  3. Custom epochs maximize lifetime. Don't waste bits representing years before your system existed — start counting from your launch date
  4. The bit layout is configurable. Snowflake's 1+41+5+5+12 is a starting point. Adjust based on your number of machines, required throughput, and desired lifetime
  5. Clock reliability is the Achilles' heel. NTP is "good enough" for most systems, but monitor clock skew aggressively and handle backwards jumps gracefully
  6. IDs contain metadata. A well-designed ID is also a debugging tool — extract when, where, and in what order it was created