High Level Design Series · Architecture Patterns · Part 5 · Post 41 of 70

Serverless Architecture

What Is Serverless?

"Serverless" doesn't mean no servers — it means you don't manage them. The cloud provider handles provisioning, scaling, patching, and capacity planning. You write code; they run it. Two broad categories fall under the serverless umbrella:

| Category | What It Means | Examples |
| --- | --- | --- |
| FaaS (Function as a Service) | Deploy individual functions that execute in response to events. No long-running process. | AWS Lambda, Google Cloud Functions, Azure Functions, Cloudflare Workers |
| BaaS (Backend as a Service) | Fully managed backend components — auth, database, file storage, push notifications — exposed via APIs. | Firebase (Firestore, Auth, Storage), Auth0, Supabase, AWS Amplify |

Modern serverless applications usually combine both: FaaS for custom business logic and BaaS for commodity services like authentication and storage.

FaaS — Function as a Service

Major Providers at a Glance

| Feature | AWS Lambda | Google Cloud Functions | Azure Functions |
| --- | --- | --- | --- |
| Max timeout | 15 min | 60 min (2nd gen) | Unlimited (Premium plan) |
| Max memory | 10,240 MB | 32,768 MB | 14,336 MB |
| Max package size | 50 MB zipped / 250 MB unzipped (10 GB with container images) | 100 MB source / container images | No hard limit (Consumption: ~1.5 GB) |
| Languages | Python, Node.js, Java, Go, .NET, Ruby, Rust (custom runtime) | Node.js, Python, Go, Java, .NET, Ruby, PHP | C#, JavaScript, Python, Java, PowerShell, TypeScript |
| Concurrency | 1,000 default (can raise to 10K+) | Up to 1,000 per function (2nd gen) | 200 per instance (Premium) |
| Free tier | 1M requests + 400K GB-s/mo | 2M invocations + 400K GB-s/mo | 1M requests + 400K GB-s/mo |

AWS Lambda Configuration Deep Dive

A real-world Lambda function definition using the Serverless Framework (serverless.yml):

service: image-processor

provider:
  name: aws
  runtime: python3.12
  region: us-east-1
  memorySize: 1024          # MB — also determines CPU allocation
  timeout: 30               # seconds (max 900)
  architecture: arm64       # Graviton2 — 20% cheaper, often faster
  environment:
    BUCKET_NAME: ${self:custom.bucketName}
    TABLE_NAME: ${self:custom.tableName}
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - s3:GetObject
            - s3:PutObject
          Resource: arn:aws:s3:::${self:custom.bucketName}/*
        - Effect: Allow
          Action:
            - dynamodb:PutItem
            - dynamodb:GetItem
          Resource: arn:aws:dynamodb:us-east-1:*:table/${self:custom.tableName}

functions:
  processImage:
    handler: handler.process_image
    memorySize: 2048         # override provider default
    timeout: 60
    reservedConcurrency: 100 # max concurrent executions
    provisionedConcurrency: 5 # keep 5 warm instances
    events:
      - s3:
          bucket: ${self:custom.bucketName}
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/
            - suffix: .jpg
    layers:
      - arn:aws:lambda:us-east-1:770693421928:layer:Klayers-p312-Pillow:1

  getImage:
    handler: handler.get_image
    memorySize: 256
    timeout: 10
    events:
      - httpApi:
          path: /images/{id}
          method: GET

custom:
  bucketName: my-image-bucket-${sls:stage}
  tableName: image-metadata-${sls:stage}

The corresponding Python handler:

import json
import boto3
import os
from PIL import Image
from io import BytesIO
from datetime import datetime

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

# This code runs once per container lifecycle (init phase)
print("Cold start: initializing clients and dependencies")

def process_image(event, context):
    """Triggered when a .jpg is uploaded to uploads/ prefix."""
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    size = record['s3']['object']['size']

    # Download original image
    response = s3.get_object(Bucket=bucket, Key=key)
    img = Image.open(BytesIO(response['Body'].read()))

    # Generate thumbnail (320x320 max)
    img.thumbnail((320, 320), Image.LANCZOS)
    buffer = BytesIO()
    img.save(buffer, 'JPEG', quality=85)
    buffer.seek(0)

    # Upload thumbnail
    thumb_key = key.replace('uploads/', 'thumbnails/')
    s3.put_object(
        Bucket=bucket, Key=thumb_key,
        Body=buffer.getvalue(),
        ContentType='image/jpeg'
    )

    # Store metadata
    table.put_item(Item={
        'image_id': key.split('/')[-1].split('.')[0],
        'original_key': key,
        'thumbnail_key': thumb_key,
        'original_size': size,
        'width': img.width,    # dimensions after thumbnailing (img was resized in place)
        'height': img.height,
        'processed_at': datetime.utcnow().isoformat(),
        'remaining_ms': context.get_remaining_time_in_millis()
    })

    return {
        'statusCode': 200,
        'body': json.dumps({
            'message': 'Thumbnail created',
            'thumbnail': thumb_key
        })
    }

def get_image(event, context):
    """GET /images/{id} — return metadata from DynamoDB."""
    image_id = event['pathParameters']['id']
    result = table.get_item(Key={'image_id': image_id})
    if 'Item' not in result:
        return {'statusCode': 404, 'body': '{"error":"not found"}'}
    return {
        'statusCode': 200,
        'body': json.dumps(result['Item'], default=str)
    }

Memory = CPU allocation in Lambda. Lambda ties CPU to memory linearly: at 1,769 MB you get 1 full vCPU; at 10,240 MB you get 6 vCPUs. If your function is CPU-bound (image processing, compression, ML inference), increasing memory also increases CPU — and often reduces total cost because the function finishes faster.
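The memory/CPU coupling makes cost tuning a small arithmetic exercise. A minimal sketch with illustrative durations, using the ARM per-GB-second rate quoted later in this post, showing that doubling memory for a CPU-bound task can leave the duration bill unchanged while halving latency:

```python
# Lambda bills GB-seconds = memory (GB) x duration (s). CPU scales with
# memory, so a CPU-bound task often runs ~2x faster at 2x the memory.
ARM_RATE = 0.0000133334  # USD per GB-second (ARM/Graviton list price)

def invocation_cost(memory_mb: float, duration_s: float) -> float:
    """Duration cost of a single invocation in USD."""
    return (memory_mb / 1024) * duration_s * ARM_RATE

# Hypothetical CPU-bound resize: 4 s at 512 MB, ~2 s at 1024 MB.
low = invocation_cost(512, 4.0)    # 2.0 GB-s
high = invocation_cost(1024, 2.0)  # 2.0 GB-s: same cost, half the latency
```

Same GB-seconds, same bill, but the higher-memory configuration returns twice as fast, which is why benchmarking a few memory sizes is usually worth the effort.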

BaaS — Backend as a Service

BaaS eliminates entire backend components by providing them as managed APIs:

| Service | What It Replaces | Key Features |
| --- | --- | --- |
| Firebase Firestore | Database + real-time sync | NoSQL document DB, real-time listeners, offline persistence, security rules |
| Firebase Auth | Auth server + session mgmt | Email/password, OAuth (Google, GitHub, Apple), phone auth, anonymous auth |
| Auth0 | Enterprise identity platform | SSO, MFA, RBAC, SAML/OIDC, machine-to-machine tokens, passwordless |
| Supabase | Postgres + REST API + auth | Open-source Firebase alternative, row-level security, real-time subscriptions |
| AWS Amplify | Full backend | GraphQL API (AppSync), auth (Cognito), storage (S3), hosting, CI/CD |

A typical BaaS pattern: a React or mobile app talks directly to Firebase for auth and real-time data. When custom logic is needed (e.g., payment processing, image resize), a Cloud Function handles it. No Express server, no database management, no infrastructure to maintain.

Execution Model

Understanding the serverless execution lifecycle is critical for performance tuning:

The Request Lifecycle

Event Trigger (API Gateway, S3, SQS, etc.)
        │
        ▼
┌─────────────────────────────────────────────────┐
│  Is a warm container available?                 │
│     YES → Skip to "Invoke Handler"              │
│     NO  → COLD START                            │
│           1. Provision execution environment     │  ~100-300ms
│           2. Download deployment package         │  ~50-200ms
│           3. Start runtime (JVM, Node, Python)   │  ~50-500ms
│           4. Run initialization code (imports,   │  ~varies
│              SDK clients, DB connections)         │
└─────────────────────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────────────────────┐
│  Invoke Handler                                 │
│  - Receive event + context                      │
│  - Execute business logic                       │
│  - Return response                              │
│  Duration: billed per 1ms (min 1ms)             │
└─────────────────────────────────────────────────┘
        │
        ▼
┌─────────────────────────────────────────────────┐
│  Container kept warm (~5-15 minutes)            │
│  - Reused for subsequent invocations            │
│  - Init code NOT re-run                         │
│  - Handler variables persist in memory          │
│  - /tmp (512 MB default, up to 10 GB) persists  │
│  No requests → container destroyed              │
└─────────────────────────────────────────────────┘


Key Execution Details

# GOOD — initialized once per container lifecycle
import boto3
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-table')

def handler(event, context):
    # reuses the existing DynamoDB connection
    return table.get_item(Key={'id': event['id']})

# BAD — creates a new client on every invocation
def handler_bad(event, context):
    dynamodb = boto3.resource('dynamodb')    # 50-100ms overhead per call!
    table = dynamodb.Table('my-table')
    return table.get_item(Key={'id': event['id']})

Cold Starts — The Serverless Tax

Cold starts are the most discussed limitation of serverless. They occur when a new execution environment must be created to serve a request. Here are real-world benchmarks:

Cold Start Benchmarks by Runtime

| Runtime | Cold Start (p50) | Cold Start (p99) | Warm Invocation | Notes |
| --- | --- | --- | --- | --- |
| Python 3.12 | ~180 ms | ~400 ms | ~2-5 ms | Great for most workloads |
| Node.js 20 | ~170 ms | ~350 ms | ~2-4 ms | V8 snapshots help |
| Go 1.x | ~80 ms | ~180 ms | ~1-2 ms | Single static binary — fastest cold starts |
| Rust (custom runtime) | ~12 ms | ~30 ms | <1 ms | Minimal runtime overhead |
| Java 21 (no SnapStart) | ~3,000 ms | ~6,000 ms | ~3-8 ms | JVM startup is brutal |
| Java 21 (SnapStart) | ~200 ms | ~500 ms | ~3-8 ms | CRaC-based snapshot — 10-15× improvement |
| .NET 8 (AOT) | ~250 ms | ~500 ms | ~2-5 ms | Native AOT avoids CLR startup |

What increases cold start time?
  • Deployment package size: a 50 MB zip adds roughly 150 ms. Use layers judiciously and strip unnecessary files.
  • VPC attachment: used to add 8–10 seconds (per-invocation ENI creation). Hyperplane ENIs now create the ENI when the function is configured, cutting the per-invocation penalty to about a second or less, but it is still worth measuring on latency-sensitive paths.
  • Init code complexity: heavy imports (pandas, numpy, boto3) or DB connection pooling can add 200-500 ms.
  • Memory allocation: more memory = more CPU = slightly faster init. The 1,769 MB sweet spot (1 vCPU) is common.
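One cheap way to see cold starts in your own telemetry: module scope runs once per execution environment, so a module-level flag distinguishes the first invocation in a container from warm reuses. A minimal sketch (the handler shape matches Lambda's Python signature; emitting the flag as a metric is left as a comment):

```python
# Module scope executes once per container (the init phase), so _cold
# is True only for the first invocation in a new execution environment.
_cold = True

def handler(event, context=None):
    global _cold
    was_cold = _cold
    _cold = False  # every subsequent invocation in this container is warm
    # In a real function, log or emit was_cold as a metric dimension here.
    return {"cold_start": was_cold}
```

Calling the handler twice in the same "container" (process) returns `cold_start: True` the first time and `False` afterward, which is exactly the reuse behavior shown in the lifecycle diagram above.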

Mitigating Cold Starts

1. Provisioned Concurrency

Pre-warms a specified number of execution environments. They're always ready — zero cold starts for those instances.

# AWS CLI — set provisioned concurrency on an alias or published
# version (it cannot target $LATEST)
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-handler \
  --qualifier prod \
  --provisioned-concurrent-executions 50

# Cost: ~$0.015/GB-hour for provisioned concurrency
# 50 instances × 512 MB × 24 h × 30 d = 50 × 0.5 × 720 = 18,000 GB-hours
# Monthly cost: 18,000 × $0.015 = $270/month
# Plus $0.0000097222 per GB-second of actual execution (lower than the on-demand rate)

2. Keep-Warm (Ping) Strategy

# CloudWatch scheduled event — invoke every 5 minutes
# serverless.yml
functions:
  apiHandler:
    handler: handler.main
    events:
      - httpApi: 'GET /api/{proxy+}'
      - schedule:
          rate: rate(5 minutes)
          input:
            source: 'serverless-warmup'

# In handler:
def main(event, context):
    if event.get('source') == 'serverless-warmup':
        return {'statusCode': 200, 'body': 'warm'}
    # ... actual logic

Limitation: A keep-warm ping only keeps one container warm. If you need 10 concurrent warm instances, you need to fire 10 concurrent pings — which is fragile. Provisioned concurrency is the robust solution.

3. SnapStart (Java)

# Enable SnapStart for Java Lambda functions
aws lambda update-function-configuration \
  --function-name my-java-function \
  --snap-start ApplyOn=PublishedVersions

# Takes a CRaC snapshot after init, restores from it on cold start
# Reduces Java cold start from ~3-6 seconds to ~200-500ms

4. Minimize Package Size

# Python — exclude unnecessary files
package:
  individually: true
  patterns:
    - '!node_modules/**'
    - '!tests/**'
    - '!.git/**'
    - '!**/*.pyc'
    - '!**/__pycache__/**'

# Use Lambda Layers for large dependencies
# Pillow layer: ~20MB instead of bundling in each function
# boto3 is pre-installed — don't include it in your package!

Event Sources

Serverless functions are event-driven. Understanding the invocation models is crucial:

| Source | Invocation Type | Retry Behavior | Use Case |
| --- | --- | --- | --- |
| API Gateway | Synchronous | No retries (caller retries) | REST/HTTP APIs, WebSockets |
| S3 Events | Asynchronous | 2 retries, then DLQ | File upload processing, ETL |
| SQS | Polling (event source mapping) | Visibility timeout, DLQ after N fails | Work queues, decoupled processing |
| DynamoDB Streams | Polling (event source mapping) | Retries until expiry (24h), blocks shard | Change data capture, materialized views |
| Kinesis | Polling (event source mapping) | Retries until data expires (7d default) | Real-time streaming, analytics |
| SNS | Asynchronous | 3 retries (immediate, 1s, 2s) | Fan-out, notifications |
| EventBridge | Asynchronous | Configurable retries + DLQ | Event bus, cross-service events |
| CloudWatch Events/Cron | Asynchronous | 2 retries | Scheduled tasks, cron jobs |

Invocation Model Details

# Synchronous — caller waits for response
response = lambda_client.invoke(
    FunctionName='my-function',
    InvocationType='RequestResponse',  # synchronous
    Payload=json.dumps({'key': 'value'})
)
result = json.loads(response['Payload'].read())

# Asynchronous — fire and forget, Lambda handles retries
lambda_client.invoke(
    FunctionName='my-function',
    InvocationType='Event',            # async — returns 202 immediately
    Payload=json.dumps({'key': 'value'})
)

# Event Source Mapping (polling) — Lambda polls SQS/Kinesis/DynamoDB
# and invokes your function with batches of records
aws lambda create-event-source-mapping \
  --function-name process-orders \
  --event-source-arn arn:aws:sqs:us-east-1:123:order-queue \
  --batch-size 10 \
  --maximum-batching-window-in-seconds 5 \
  --function-response-types ReportBatchItemFailures


Step Functions & Orchestration

Individual Lambda functions are great for simple tasks, but real workflows involve sequences, branches, retries, and parallel execution. AWS Step Functions provides a state machine abstraction for orchestrating serverless workflows.

State Machine Definition (ASL — Amazon States Language)

{
  "Comment": "Image processing pipeline",
  "StartAt": "ValidateImage",
  "States": {
    "ValidateImage": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:validate-image",
      "Next": "CheckFormat",
      "Retry": [
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 2,
          "MaxAttempts": 3,
          "BackoffRate": 2.0
        }
      ],
      "Catch": [
        {
          "ErrorEquals": ["ValidationError"],
          "Next": "RejectImage"
        }
      ]
    },
    "CheckFormat": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.format",
          "StringEquals": "RAW",
          "Next": "ConvertToJPEG"
        }
      ],
      "Default": "ProcessInParallel"
    },
    "ConvertToJPEG": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:convert-to-jpeg",
      "Next": "ProcessInParallel"
    },
    "ProcessInParallel": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "GenerateThumbnail",
          "States": {
            "GenerateThumbnail": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123:function:thumbnail",
              "End": true
            }
          }
        },
        {
          "StartAt": "ExtractMetadata",
          "States": {
            "ExtractMetadata": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123:function:metadata",
              "End": true
            }
          }
        },
        {
          "StartAt": "RunModeration",
          "States": {
            "RunModeration": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123:function:moderate",
              "End": true
            }
          }
        }
      ],
      "Next": "StoreResults"
    },
    "StoreResults": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:store-results",
      "Next": "NotifyUser"
    },
    "NotifyUser": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:us-east-1:123:image-notifications",
        "Message.$": "$.message"
      },
      "End": true
    },
    "RejectImage": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:reject-image",
      "End": true
    }
  }
}
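The Retry block above (IntervalSeconds 2, MaxAttempts 3, BackoffRate 2.0) produces waits of 2 s, 4 s, and 8 s between attempts. A quick sketch of that schedule:

```python
# The wait before retry attempt n is IntervalSeconds * BackoffRate**(n-1),
# as defined by the Amazon States Language Retry semantics.
def retry_schedule(interval_s: float, max_attempts: int, backoff: float) -> list:
    return [interval_s * backoff ** n for n in range(max_attempts)]

waits = retry_schedule(2, 3, 2.0)  # [2.0, 4.0, 8.0]
```

Exponential backoff like this is why a transient downstream failure usually resolves within the state machine itself, with no operator involvement.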

Step Functions: Standard vs Express

| Feature | Standard | Express |
| --- | --- | --- |
| Max duration | 1 year | 5 minutes |
| Execution guarantee | Exactly-once | At-least-once |
| Execution history | Full audit trail, visual debugger | CloudWatch Logs only |
| Pricing | $0.025 per 1,000 state transitions | Based on executions + duration |
| Best for | Long-running workflows, human approval steps | High-volume, short-lived event processing |

Cost Model

Serverless pricing is granular and can be surprisingly cheap at low-to-moderate scale — or shockingly expensive at high throughput.

AWS Lambda Pricing Breakdown

| Component | Price | Free Tier |
| --- | --- | --- |
| Requests | $0.20 per 1M requests | 1M requests/month |
| Duration (x86) | $0.0000166667 per GB-second | 400,000 GB-seconds/month |
| Duration (ARM/Graviton) | $0.0000133334 per GB-second (20% cheaper) | 400,000 GB-seconds/month |
| Provisioned Concurrency | $0.0000041667 per GB-second (provisioned) + $0.0000097222 per GB-second (execution) | None |

Cost Calculation Examples

Scenario 1: Light API (Startup)

Requests:  500,000/month
Memory:    256 MB
Avg time:  100ms
Arch:      ARM (Graviton)

Request cost:  0 (under 1M free tier)
Duration:      500,000 × 0.1s × 0.25 GB = 12,500 GB-seconds
               12,500 - 400,000 (free tier) = 0 (under free tier!)

Total: $0/month  ← Genuinely free for light workloads

Scenario 2: Moderate API (Growing Product)

Requests:  10,000,000/month  (10M)
Memory:    512 MB
Avg time:  200ms
Arch:      ARM

Request cost:  (10M - 1M) × $0.20/1M = $1.80
Duration:      10M × 0.2s × 0.5 GB = 1,000,000 GB-s
               (1,000,000 - 400,000) × $0.0000133334 = $8.00

Total: ~$9.80/month  ← Still incredibly cheap

Scenario 3: High Traffic (At Scale)

Requests:  100,000,000/month  (100M)
Memory:    1,024 MB
Avg time:  300ms
Arch:      x86

Request cost:  (100M - 1M) × $0.20/1M = $19.80
Duration:      100M × 0.3s × 1.0 GB = 30,000,000 GB-s
               (30M - 400K) × $0.0000166667 = $493.33
API Gateway:   100M × $1.00/1M = $100.00  ← DON'T FORGET THIS!

Total: ~$613/month  ← vs ~$150/month for 2× c6g.xlarge EC2
       (EC2 wins at steady high-throughput)

Scenario 4: Where Serverless Gets Expensive

Requests:  1,000,000,000/month  (1B)
Memory:    2,048 MB
Avg time:  500ms
Arch:      x86

Request cost:  (1B - 1M) × $0.20/1M = $199.80
Duration:      1B × 0.5s × 2.0 GB = 1,000,000,000 GB-s
               (1B - 400K) × $0.0000166667 = $16,660.03
API Gateway:   1B × $1.00/1M = $1,000.00

Total: ~$17,860/month  ← At this scale, use ECS/EKS/EC2
       Equivalent EC2: ~$2,000-3,000/month

The Serverless Cost Crossover Point: Serverless is cheapest when you have variable, spiky, or low traffic. Once you pass ~10-50M requests/month with consistent load, containerized solutions (ECS Fargate, EKS) become cheaper. At 100M+ steady requests, reserved EC2 instances win decisively. The decision isn't just about cost, though — factor in engineering time: no patching, no scaling configuration, no capacity planning.
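The scenarios above all follow one formula, so they are easy to check mechanically. A minimal cost calculator using the list prices quoted in this post (HTTP API pricing for the gateway; it deliberately ignores data transfer and free-tier proration):

```python
# Reproduce the Lambda scenario math: request charge + duration charge
# (GB-seconds beyond the free tier) + optional API Gateway charge.
RATES = {"x86": 0.0000166667, "arm": 0.0000133334}  # USD per GB-second
REQ_PRICE = 0.20 / 1_000_000        # per request beyond the free tier
FREE_REQ, FREE_GBS = 1_000_000, 400_000
APIGW_PRICE = 1.00 / 1_000_000      # HTTP API, per request

def lambda_cost(requests, memory_mb, avg_s, arch, api_gateway=False):
    req = max(0, requests - FREE_REQ) * REQ_PRICE
    gbs = requests * avg_s * (memory_mb / 1024)
    dur = max(0.0, gbs - FREE_GBS) * RATES[arch]
    gw = requests * APIGW_PRICE if api_gateway else 0.0
    return round(req + dur + gw, 2)

print(lambda_cost(500_000, 256, 0.1, "arm"))     # scenario 1: 0.0
print(lambda_cost(10_000_000, 512, 0.2, "arm"))  # scenario 2: 9.8
```

Plugging in your own traffic profile before and after an expected growth phase is a fast way to see which side of the crossover point you will land on.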

Limitations & Challenges

Hard Limits

| Limit | Value | Impact |
| --- | --- | --- |
| Max execution time | 15 minutes (Lambda) | No long-running processes, batch jobs need chunking |
| Max concurrent executions | 1,000 default (account-level) | Shared across ALL functions — can starve other functions |
| Payload size (sync) | 6 MB request/response | Large files must go through S3 |
| Payload size (async) | 256 KB | Pass S3 references, not data |
| /tmp storage | 512 MB default (configurable up to 10 GB) | Ephemeral, shared across warm invocations |
| Environment variables | 4 KB total | Use SSM Parameter Store or Secrets Manager for large configs |

Operational Challenges

# Defensive timeout handling
def handler(event, context):
    items = get_batch_items()
    results = []

    for item in items:
        # Check if we have enough time remaining (leave 5s buffer)
        remaining_ms = context.get_remaining_time_in_millis()
        if remaining_ms < 5000:
            # Save progress and re-enqueue remaining items
            save_checkpoint(results)
            requeue_remaining(items[len(results):])
            return {
                'statusCode': 202,
                'body': json.dumps({
                    'processed': len(results),
                    'remaining': len(items) - len(results),
                    'status': 'partial — re-queued'
                })
            }

        results.append(process_item(item))

    return {'statusCode': 200, 'body': json.dumps(results)}

When to Use Serverless

✓ Ideal Use Cases

| Use Case | Why Serverless Excels | Example |
| --- | --- | --- |
| Event processing | Natural fit for event-driven model, auto-scales with event volume | S3 upload → resize image → store metadata |
| Webhooks | Sporadic traffic, pay nothing when idle | GitHub/Stripe/Twilio webhook handlers |
| Scheduled tasks | Replaces cron servers — no instance running 24/7 for a 5-minute job | Nightly reports, data cleanup, health checks |
| APIs with variable traffic | Scales from 0 to thousands of concurrent requests, back to 0 | Startup MVP, internal tools, seasonal apps |
| Data transformation | Parallel processing of streaming data | Kinesis → Lambda → Elasticsearch ingestion |
| Chatbots & IoT | Bursty, unpredictable traffic patterns | Alexa skills, IoT rule actions |
| Prototyping & MVPs | Zero infrastructure cost until you have users, rapid iteration | API + DynamoDB + S3 — full stack in serverless.yml |

✗ When NOT to Use Serverless

| Anti-Pattern | Why It Fails | Better Alternative |
| --- | --- | --- |
| Long-running processes | 15-min max execution time. Video transcoding, ML training, and large batch jobs time out. | ECS Fargate tasks, AWS Batch, EC2 |
| Latency-sensitive (<10ms) | Cold starts add 100ms–6s of latency. Even provisioned concurrency adds overhead vs bare metal. | EC2, EKS with pod pre-scaling |
| High-throughput steady workloads | At 100M+ requests/month with consistent load, per-invocation billing is 5-10× more expensive than reserved capacity. | ECS/EKS with auto-scaling, reserved EC2 |
| WebSocket/persistent connections | Stateless execution model doesn't support long-lived connections natively. API Gateway WebSocket exists but is awkward. | ECS with Socket.io, dedicated WebSocket servers |
| Complex stateful workflows | Forcing state management through DynamoDB + Step Functions adds complexity that a simple server avoids. | Temporal/Cadence on ECS, traditional servers |
| Heavy local computation | Max 10 GB RAM, 6 vCPUs. Large-scale data processing, ML inference on large models, and GPU workloads are out. | EC2 with GPUs, SageMaker, EMR |

Decision Framework

Should I use Serverless?

1. Is execution time < 15 minutes?
   NO  → Use containers (ECS/EKS) or EC2
   YES ↓

2. Is traffic variable/spiky/unpredictable?
   YES → Strong serverless candidate ✓
   NO  ↓

3. Do you need sub-10ms latency consistently?
   YES → Use containers or bare metal
   NO  ↓

4. Is monthly request volume < 50M?
   YES → Serverless is almost certainly cheaper ✓
   NO  ↓

5. Is engineering time more valuable than compute cost?
   YES → Serverless (less ops overhead) ✓
   NO  → Containers with reserved pricing

6. Are you locked into AWS already?
   YES → Lambda is a natural extension ✓
   NO  → Consider portability (containers are more portable)
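The checklist above can be encoded directly. An illustrative sketch of steps 1-5 (step 6, existing AWS lock-in, is really a portability tie-breaker rather than a hard rule); real decisions weigh these factors together instead of short-circuiting down a chain, so treat the function and its parameter names as a toy:

```python
# Encode the decision framework: each question either decides the
# platform or falls through to the next question.
def choose_platform(max_runtime_min, spiky_traffic, needs_sub_10ms,
                    monthly_requests, ops_time_scarce):
    if max_runtime_min >= 15:
        return ("containers", "runtime exceeds Lambda's 15-minute cap")
    if spiky_traffic:
        return ("serverless", "variable traffic suits scale-to-zero")
    if needs_sub_10ms:
        return ("containers", "cold starts break tight latency budgets")
    if monthly_requests < 50_000_000:
        return ("serverless", "under the cost crossover point")
    if ops_time_scarce:
        return ("serverless", "ops savings outweigh compute premium")
    return ("containers", "reserved pricing wins at steady scale")
```

For example, a 30-minute batch job routes to containers immediately, while a spiky webhook handler routes to serverless at the second question.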

Real-World Serverless Patterns

Pattern 1: API Gateway + Lambda + DynamoDB (REST API)

Client → API Gateway → Lambda → DynamoDB
                  ↕                  ↕
             Auth (Cognito)     DAX (cache)

# Characteristics:
# - Scales to millions of requests
# - Costs $0 at zero traffic
# - Sub-second cold starts with Node.js/Python
# - DynamoDB provides single-digit ms latency

Pattern 2: Fan-Out Processing

S3 Upload → Lambda (dispatcher) → SNS Topic
                                      ├→ Lambda: generate thumbnail
                                      ├→ Lambda: extract EXIF metadata
                                      ├→ Lambda: run content moderation
                                      └→ Lambda: update search index

# Each downstream Lambda runs in parallel
# Total processing time = max(individual times)
# Not sum(individual times)
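The max-not-sum claim is easy to demonstrate locally. A minimal sketch using threads as stand-ins for the parallel downstream Lambdas, with `time.sleep` playing the role of each function's processing time:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fan_out(durations_s):
    """Run all 'functions' concurrently; return total wall-clock time."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=len(durations_s)) as pool:
        # each sleep stands in for one downstream Lambda's processing
        list(pool.map(time.sleep, durations_s))
    return time.monotonic() - start

elapsed = fan_out([0.05, 0.1, 0.2])
# elapsed is ~0.2 s (the slowest branch), not 0.35 s (the sum)
```

The same effect is why the fan-out and Parallel-state patterns in this post finish in the time of their slowest branch.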

Pattern 3: Event Sourcing with DynamoDB Streams

App → DynamoDB (writes) → DynamoDB Stream → Lambda
                                                 ├→ Update Elasticsearch
                                                 ├→ Invalidate cache
                                                 ├→ Send notification
                                                 └→ Replicate to analytics DB

# DynamoDB Streams guarantee ordering per partition key
# Lambda processes in batches (configurable 1-10,000 records)
# Exactly-once processing with idempotency keys

Pattern 4: CQRS with Serverless

# Write path (commands)
API Gateway → Lambda → DynamoDB (write model)
                            ↓ (Stream)
                       Lambda → Elasticsearch (read model)

# Read path (queries)
API Gateway → Lambda → Elasticsearch
                   or
API Gateway → Lambda → DynamoDB (if simple key-value lookups)

# Different scaling, different data models for reads vs writes

Serverless vs Containers vs VMs

| Dimension | Lambda (Serverless) | ECS Fargate (Containers) | EC2 (VMs) |
| --- | --- | --- | --- |
| Scaling speed | Milliseconds (per request) | 30-90 seconds | 2-5 minutes |
| Scale to zero | Yes — $0 at idle | Yes (with scale-to-zero config) | No — minimum 1 instance |
| Max execution | 15 min | Unlimited | Unlimited |
| Ops burden | Near zero | Low (still need task definitions, networking) | High (patching, AMIs, capacity) |
| Cost at low traffic | Cheapest (free tier covers most) | Moderate | Most expensive (always on) |
| Cost at high traffic | Most expensive | Moderate | Cheapest (reserved instances) |
| Portability | Lowest (vendor-specific) | High (Docker is portable) | Highest (any cloud or on-prem) |

Key Takeaways

  • Serverless spans FaaS (Lambda, Cloud Functions, Azure Functions) and BaaS (Firebase, Auth0, Supabase); most real applications combine both.
  • Memory is Lambda's main performance knob — it scales CPU too, so more memory can mean lower total cost for CPU-bound work.
  • Cold starts range from ~12 ms (Rust) to several seconds (JVM); mitigate with provisioned concurrency, SnapStart, keep-warm pings, and small packages.
  • Costs are near zero at low traffic but cross over around 10-50M steady requests/month, where containers and reserved EC2 win.
  • Avoid serverless for long-running jobs, sub-10ms latency budgets, persistent connections, and heavy local computation.

In the next post, we explore Data Pipelines — how to build reliable, scalable systems for moving and transforming data at scale, often using serverless components as building blocks.