Serverless Architecture
What Is Serverless?
"Serverless" doesn't mean no servers — it means you don't manage them. The cloud provider handles provisioning, scaling, patching, and capacity planning. You write code; they run it. Two broad categories fall under the serverless umbrella:
| Category | What It Means | Examples |
|---|---|---|
| FaaS (Function as a Service) | Deploy individual functions that execute in response to events. No long-running process. | AWS Lambda, Google Cloud Functions, Azure Functions, Cloudflare Workers |
| BaaS (Backend as a Service) | Fully managed backend components — auth, database, file storage, push notifications — exposed via APIs. | Firebase (Firestore, Auth, Storage), Auth0, Supabase, AWS Amplify |
Modern serverless applications usually combine both: FaaS for custom business logic and BaaS for commodity services like authentication and storage.
FaaS — Function as a Service
Major Providers at a Glance
| Feature | AWS Lambda | Google Cloud Functions | Azure Functions |
|---|---|---|---|
| Max timeout | 15 min | 60 min (2nd gen) | Unlimited (Premium plan) |
| Max memory | 10,240 MB | 32,768 MB | 14,336 MB |
| Max package size | 50 MB zipped / 250 MB unzipped (10 GB with container images) | 100 MB source / container images | No hard limit (Consumption: ~1.5 GB) |
| Languages | Python, Node.js, Java, Go, .NET, Ruby, Rust (custom runtime) | Node.js, Python, Go, Java, .NET, Ruby, PHP | C#, JavaScript, Python, Java, PowerShell, TypeScript |
| Concurrency | 1,000 default (can raise to 10K+) | Up to 1,000 per function (2nd gen) | 200 per instance (Premium) |
| Free tier | 1M requests + 400K GB-s/mo | 2M invocations + 400K GB-s/mo | 1M requests + 400K GB-s/mo |
AWS Lambda Configuration Deep Dive
A real-world Lambda function definition using the Serverless Framework (serverless.yml):
service: image-processor

provider:
  name: aws
  runtime: python3.12
  region: us-east-1
  memorySize: 1024       # MB — also determines CPU allocation
  timeout: 30            # seconds (max 900)
  architecture: arm64    # Graviton2 — 20% cheaper, often faster
  environment:
    BUCKET_NAME: ${self:custom.bucketName}
    TABLE_NAME: ${self:custom.tableName}
  iam:
    role:
      statements:
        - Effect: Allow
          Action:
            - s3:GetObject
            - s3:PutObject
          Resource: arn:aws:s3:::${self:custom.bucketName}/*
        - Effect: Allow
          Action:
            - dynamodb:PutItem
            - dynamodb:GetItem
          Resource: arn:aws:dynamodb:us-east-1:*:table/${self:custom.tableName}

functions:
  processImage:
    handler: handler.process_image
    memorySize: 2048            # override provider default
    timeout: 60
    reservedConcurrency: 100    # max concurrent executions
    provisionedConcurrency: 5   # keep 5 warm instances
    events:
      - s3:
          bucket: ${self:custom.bucketName}
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/
            - suffix: .jpg
    layers:
      - arn:aws:lambda:us-east-1:770693421928:layer:Klayers-p312-Pillow:1
  getImage:
    handler: handler.get_image
    memorySize: 256
    timeout: 10
    events:
      - httpApi:
          path: /images/{id}
          method: GET

custom:
  bucketName: my-image-bucket-${sls:stage}
  tableName: image-metadata-${sls:stage}
The corresponding Python handler:
import json
import os
from datetime import datetime, timezone
from io import BytesIO

import boto3
from PIL import Image

s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table(os.environ['TABLE_NAME'])

# This code runs once per container lifecycle (init phase)
print("Cold start: initializing clients and dependencies")

def process_image(event, context):
    """Triggered when a .jpg is uploaded to the uploads/ prefix."""
    record = event['Records'][0]
    bucket = record['s3']['bucket']['name']
    key = record['s3']['object']['key']
    size = record['s3']['object']['size']

    # Download the original image
    response = s3.get_object(Bucket=bucket, Key=key)
    img = Image.open(BytesIO(response['Body'].read()))

    # Generate thumbnail (320x320 max)
    img.thumbnail((320, 320), Image.LANCZOS)
    buffer = BytesIO()
    img.save(buffer, 'JPEG', quality=85)
    buffer.seek(0)

    # Upload thumbnail
    thumb_key = key.replace('uploads/', 'thumbnails/')
    s3.put_object(
        Bucket=bucket, Key=thumb_key,
        Body=buffer.getvalue(),
        ContentType='image/jpeg'
    )

    # Store metadata (datetime.utcnow() is deprecated in 3.12, so use
    # an explicit timezone-aware timestamp)
    table.put_item(Item={
        'image_id': key.split('/')[-1].split('.')[0],
        'original_key': key,
        'thumbnail_key': thumb_key,
        'original_size': size,
        'width': img.width,
        'height': img.height,
        'processed_at': datetime.now(timezone.utc).isoformat(),
        'remaining_ms': context.get_remaining_time_in_millis()
    })

    return {
        'statusCode': 200,
        'body': json.dumps({
            'message': 'Thumbnail created',
            'thumbnail': thumb_key
        })
    }

def get_image(event, context):
    """GET /images/{id} — return metadata from DynamoDB."""
    image_id = event['pathParameters']['id']
    result = table.get_item(Key={'image_id': image_id})
    if 'Item' not in result:
        return {'statusCode': 404, 'body': '{"error":"not found"}'}
    return {
        'statusCode': 200,
        'body': json.dumps(result['Item'], default=str)
    }
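For local experimentation it helps to know the event shape process_image consumes. Below is a minimal sketch of an S3 ObjectCreated payload containing only the fields the handler actually reads (real events carry many more, such as region and timestamps); the bucket and key names are illustrative:

```python
# Minimal S3 ObjectCreated event — only the fields process_image reads.
sample_event = {
    "Records": [{
        "s3": {
            "bucket": {"name": "my-image-bucket-dev"},
            "object": {"key": "uploads/cat.jpg", "size": 102400},
        }
    }]
}

record = sample_event["Records"][0]
bucket = record["s3"]["bucket"]["name"]
key = record["s3"]["object"]["key"]

print(bucket)                                  # my-image-bucket-dev
print(key.replace("uploads/", "thumbnails/"))  # thumbnails/cat.jpg
```

Feeding a payload like this to the handler in a unit test exercises the parsing and key-transformation logic without touching AWS.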
BaaS — Backend as a Service
BaaS eliminates entire backend components by providing them as managed APIs:
| Service | What It Replaces | Key Features |
|---|---|---|
| Firebase Firestore | Database + real-time sync | NoSQL document DB, real-time listeners, offline persistence, security rules |
| Firebase Auth | Auth server + session mgmt | Email/password, OAuth (Google, GitHub, Apple), phone auth, anonymous auth |
| Auth0 | Enterprise identity platform | SSO, MFA, RBAC, SAML/OIDC, machine-to-machine tokens, passwordless |
| Supabase | Postgres + REST API + auth | Open-source Firebase alternative, row-level security, real-time subscriptions |
| AWS Amplify | Full backend | GraphQL API (AppSync), auth (Cognito), storage (S3), hosting, CI/CD |
A typical BaaS pattern: a React or mobile app talks directly to Firebase for auth and real-time data. When custom logic is needed (e.g., payment processing, image resize), a Cloud Function handles it. No Express server, no database management, no infrastructure to maintain.
Execution Model
Understanding the serverless execution lifecycle is critical for performance tuning:
The Request Lifecycle
Event Trigger (API Gateway, S3, SQS, etc.)
│
▼
┌─────────────────────────────────────────────────┐
│ Is a warm container available? │
│ YES → Skip to "Invoke Handler" │
│ NO → COLD START │
│ 1. Provision execution environment │ ~100-300ms
│ 2. Download deployment package │ ~50-200ms
│ 3. Start runtime (JVM, Node, Python) │ ~50-500ms
│ 4. Run initialization code (imports, │ ~varies
│ SDK clients, DB connections) │
└─────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Invoke Handler │
│ - Receive event + context │
│ - Execute business logic │
│ - Return response │
│ Duration: billed per 1ms (min 1ms) │
└─────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────┐
│ Container kept warm (~5-15 minutes) │
│ - Reused for subsequent invocations │
│ - Init code NOT re-run │
│ - Handler variables persist in memory │
│ - /tmp directory (10 GB) persists │
│ No requests → container destroyed │
└─────────────────────────────────────────────────┘
Key Execution Details
- Stateless by design: Each invocation is independent. Store state in DynamoDB, S3, or ElastiCache — never rely on in-memory data persisting between invocations.
- /tmp is ephemeral: Up to 10 GB of scratch space per container, but it's destroyed when the container is recycled. Use it for temporary file processing, not for caching across hours.
- Concurrency model: Each concurrent request gets its own container. 100 simultaneous requests = 100 containers. This is fundamentally different from a Node.js server handling 100 requests on one event loop.
- Init code runs once per container: Place SDK client initialization, database connections, and module imports outside the handler to reuse them across warm invocations.
# GOOD — initialized once per container lifecycle
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('my-table')

def handler(event, context):
    # reuses the existing DynamoDB connection
    return table.get_item(Key={'id': event['id']})

# BAD — creates a new client on every invocation
def handler_bad(event, context):
    dynamodb = boto3.resource('dynamodb')  # 50-100ms overhead per call!
    table = dynamodb.Table('my-table')
    return table.get_item(Key={'id': event['id']})
Cold Starts — The Serverless Tax
Cold starts are the most discussed limitation of serverless. They occur when a new execution environment must be created to serve a request. Here are real-world benchmarks:
Cold Start Benchmarks by Runtime
| Runtime | Cold Start (p50) | Cold Start (p99) | Warm Invocation | Notes |
|---|---|---|---|---|
| Python 3.12 | ~180 ms | ~400 ms | ~2-5 ms | Great for most workloads |
| Node.js 20 | ~170 ms | ~350 ms | ~2-4 ms | V8 snapshots help |
| Go 1.x | ~80 ms | ~180 ms | ~1-2 ms | Single static binary — fastest cold starts |
| Rust (custom runtime) | ~12 ms | ~30 ms | <1 ms | Minimal runtime overhead |
| Java 21 (no SnapStart) | ~3,000 ms | ~6,000 ms | ~3-8 ms | JVM startup is brutal |
| Java 21 (SnapStart) | ~200 ms | ~500 ms | ~3-8 ms | CRaC-based snapshot — 10-15× improvement |
| .NET 8 (AOT) | ~250 ms | ~500 ms | ~2-5 ms | Native AOT avoids CLR startup |
Beyond runtime choice, several factors inflate cold start duration:
- Deployment package size: a 50 MB zip adds roughly 150ms. Use layers judiciously and strip unnecessary files.
- VPC attachment: Used to add 8–10 seconds (ENI creation). Now ~1 second with Hyperplane ENIs, but still significant.
- Init code complexity: Heavy imports (pandas, numpy, boto3) or DB connection pooling can add 200-500ms.
- Memory allocation: More memory = more CPU = slightly faster init. The 1,769 MB sweet spot (1 vCPU) is common.
Mitigating Cold Starts
1. Provisioned Concurrency
Pre-warms a specified number of execution environments. They're always ready — zero cold starts for those instances.
# AWS CLI — set provisioned concurrency on an alias or version (not $LATEST)
aws lambda put-provisioned-concurrency-config \
  --function-name my-api-handler \
  --qualifier prod \
  --provisioned-concurrent-executions 50
# Cost: ~$0.015/GB-hour for provisioned concurrency
# 50 instances × 512MB × 24h × 30d = 50 × 0.5 × 720 = 18,000 GB-hours
# Monthly cost: 18,000 × $0.015 = $270/month
# Plus ~$0.0000097222 per GB-second of actual execution (discounted from the on-demand rate)
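The arithmetic above generalizes to a small helper. A sketch (the function name is mine; the default rate matches the pricing table later in this post, so verify it against current AWS pricing):

```python
def provisioned_concurrency_cost(instances, memory_mb, hours,
                                 rate_per_gb_hour=0.015):
    """Monthly cost of keeping `instances` warm, excluding execution charges."""
    gb_hours = instances * (memory_mb / 1024) * hours
    return round(gb_hours * rate_per_gb_hour, 2)

# 50 instances x 512 MB, running 24/7 for a 30-day month:
print(provisioned_concurrency_cost(50, 512, 24 * 30))  # 270.0
```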
2. Keep-Warm (Ping) Strategy
# CloudWatch scheduled event — invoke every 5 minutes
# serverless.yml
functions:
  apiHandler:
    handler: handler.main
    events:
      - httpApi: 'GET /api/{proxy+}'
      - schedule:
          rate: rate(5 minutes)
          input:
            source: 'serverless-warmup'

# In the handler:
def main(event, context):
    if event.get('source') == 'serverless-warmup':
        return {'statusCode': 200, 'body': 'warm'}
    # ... actual logic
Limitation: A keep-warm ping only keeps one container warm. If you need 10 concurrent warm instances, you need to fire 10 concurrent pings — which is fragile. Provisioned concurrency is the robust solution.
3. SnapStart (Java)
# Enable SnapStart for Java Lambda functions
aws lambda update-function-configuration \
  --function-name my-java-function \
  --snap-start ApplyOn=PublishedVersions

# Takes a CRaC snapshot after init, restores from it on cold start
# Reduces Java cold start from ~3-6 seconds to ~200-500ms
4. Minimize Package Size
# Python — exclude unnecessary files
package:
  individually: true
  patterns:
    - '!node_modules/**'
    - '!tests/**'
    - '!.git/**'
    - '!**/*.pyc'
    - '!**/__pycache__/**'
# Use Lambda Layers for large dependencies
# Pillow layer: ~20MB instead of bundling in each function
# boto3 is pre-installed — don't include it in your package!
Event Sources
Serverless functions are event-driven. Understanding the invocation models is crucial:
| Source | Invocation Type | Retry Behavior | Use Case |
|---|---|---|---|
| API Gateway | Synchronous | No retries (caller retries) | REST/HTTP APIs, WebSockets |
| S3 Events | Asynchronous | 2 retries, then DLQ | File upload processing, ETL |
| SQS | Polling (event source mapping) | Visibility timeout, DLQ after N fails | Work queues, decoupled processing |
| DynamoDB Streams | Polling (event source mapping) | Retries until expiry (24h), blocks shard | Change data capture, materialized views |
| Kinesis | Polling (event source mapping) | Retries until data expires (7d default) | Real-time streaming, analytics |
| SNS | Asynchronous | 3 retries (immediate, 1s, 2s) | Fan-out, notifications |
| EventBridge | Asynchronous | Configurable retries + DLQ | Event bus, cross-service events |
| CloudWatch Events/Cron | Asynchronous | 2 retries | Scheduled tasks, cron jobs |
Invocation Model Details
# Synchronous — caller waits for the response
response = lambda_client.invoke(
    FunctionName='my-function',
    InvocationType='RequestResponse',  # synchronous
    Payload=json.dumps({'key': 'value'})
)
result = json.loads(response['Payload'].read())

# Asynchronous — fire and forget, Lambda handles retries
lambda_client.invoke(
    FunctionName='my-function',
    InvocationType='Event',  # async — returns 202 immediately
    Payload=json.dumps({'key': 'value'})
)

# Event Source Mapping (polling) — Lambda polls SQS/Kinesis/DynamoDB
# and invokes your function with batches of records
aws lambda create-event-source-mapping \
  --function-name process-orders \
  --event-source-arn arn:aws:sqs:us-east-1:123:order-queue \
  --batch-size 10 \
  --maximum-batching-window-in-seconds 5 \
  --function-response-types ReportBatchItemFailures
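With ReportBatchItemFailures enabled, the handler returns the IDs of the messages that failed, so only those go back to the queue instead of the whole batch. A sketch under assumed business logic (process_order and the order_id check are placeholders):

```python
import json

def process_order(body):
    """Placeholder business logic; raises on malformed input."""
    order = json.loads(body)
    if "order_id" not in order:
        raise ValueError("missing order_id")
    return order["order_id"]

def handler(event, context=None):
    """SQS batch handler reporting per-item failures."""
    failures = []
    for record in event["Records"]:
        try:
            process_order(record["body"])
        except Exception:
            # Only failed messages return to the queue; the rest are deleted.
            failures.append({"itemIdentifier": record["messageId"]})
    return {"batchItemFailures": failures}

event = {"Records": [
    {"messageId": "m1", "body": '{"order_id": "A-1"}'},
    {"messageId": "m2", "body": '{"oops": true}'},
]}
print(handler(event))  # {'batchItemFailures': [{'itemIdentifier': 'm2'}]}
```

Without this response shape, one bad message poisons the batch: all ten records are retried until the DLQ threshold, including the nine that succeeded.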
Step Functions & Orchestration
Individual Lambda functions are great for simple tasks, but real workflows involve sequences, branches, retries, and parallel execution. AWS Step Functions provides a state machine abstraction for orchestrating serverless workflows.
State Machine Definition (ASL — Amazon States Language)
{
  "Comment": "Image processing pipeline",
  "StartAt": "ValidateImage",
  "States": {
    "ValidateImage": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:validate-image",
      "Next": "CheckFormat",
      "Retry": [
        {
          "ErrorEquals": ["States.TaskFailed"],
          "IntervalSeconds": 2,
          "MaxAttempts": 3,
          "BackoffRate": 2.0
        }
      ],
      "Catch": [
        {
          "ErrorEquals": ["ValidationError"],
          "Next": "RejectImage"
        }
      ]
    },
    "CheckFormat": {
      "Type": "Choice",
      "Choices": [
        {
          "Variable": "$.format",
          "StringEquals": "RAW",
          "Next": "ConvertToJPEG"
        }
      ],
      "Default": "ProcessInParallel"
    },
    "ConvertToJPEG": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:convert-to-jpeg",
      "Next": "ProcessInParallel"
    },
    "ProcessInParallel": {
      "Type": "Parallel",
      "Branches": [
        {
          "StartAt": "GenerateThumbnail",
          "States": {
            "GenerateThumbnail": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123:function:thumbnail",
              "End": true
            }
          }
        },
        {
          "StartAt": "ExtractMetadata",
          "States": {
            "ExtractMetadata": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123:function:metadata",
              "End": true
            }
          }
        },
        {
          "StartAt": "RunModeration",
          "States": {
            "RunModeration": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:123:function:moderate",
              "End": true
            }
          }
        }
      ],
      "Next": "StoreResults"
    },
    "StoreResults": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:store-results",
      "Next": "NotifyUser"
    },
    "NotifyUser": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sns:publish",
      "Parameters": {
        "TopicArn": "arn:aws:sns:us-east-1:123:image-notifications",
        "Message.$": "$.message"
      },
      "End": true
    },
    "RejectImage": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123:function:reject-image",
      "End": true
    }
  }
}
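The Retry block above (IntervalSeconds 2, BackoffRate 2.0, MaxAttempts 3) spaces the attempts exponentially: each wait is the previous one multiplied by the backoff rate. A quick sketch of the resulting delays (the helper function name is mine):

```python
def retry_delays(interval_s, backoff_rate, max_attempts):
    """Wait before each retry attempt, per ASL semantics:
    delay_n = IntervalSeconds * BackoffRate ** (n - 1)."""
    return [interval_s * backoff_rate ** n for n in range(max_attempts)]

print(retry_delays(2, 2.0, 3))  # [2.0, 4.0, 8.0]
```

So a task that fails every time burns 14 seconds of waiting before the Catch block finally routes execution to RejectImage.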
Step Functions: Standard vs Express
| Feature | Standard | Express |
|---|---|---|
| Max duration | 1 year | 5 minutes |
| Execution guarantee | Exactly-once | At-least-once |
| Execution history | Full audit trail, visual debugger | CloudWatch Logs only |
| Pricing | $0.025 per 1,000 state transitions | Based on executions + duration |
| Best for | Long-running workflows, human approval steps | High-volume, short-lived event processing |
Cost Model
Serverless pricing is granular and can be surprisingly cheap at low-to-moderate scale — or shockingly expensive at high throughput.
AWS Lambda Pricing Breakdown
| Component | Price | Free Tier |
|---|---|---|
| Requests | $0.20 per 1M requests | 1M requests/month |
| Duration (x86) | $0.0000166667 per GB-second | 400,000 GB-seconds/month |
| Duration (ARM/Graviton) | $0.0000133334 per GB-second (20% cheaper) | 400,000 GB-seconds/month |
| Provisioned Concurrency | $0.0000041667 per GB-second (provisioned) + $0.0000097222 per GB-second (execution) | None |
Cost Calculation Examples
Scenario 1: Light API (Startup)
Requests: 500,000/month
Memory: 256 MB
Avg time: 100ms
Arch: ARM (Graviton)
Request cost: $0 (under the 1M-request free tier)
Duration: 500,000 × 0.1s × 0.25 GB = 12,500 GB-seconds
          12,500 < 400,000 free-tier GB-s → $0
Total: $0/month ← Genuinely free for light workloads
Scenario 2: Moderate API (Growing Product)
Requests: 10,000,000/month (10M)
Memory: 512 MB
Avg time: 200ms
Arch: ARM
Request cost: (10M - 1M) × $0.20/1M = $1.80
Duration: 10M × 0.2s × 0.5 GB = 1,000,000 GB-s
(1,000,000 - 400,000) × $0.0000133334 = $8.00
Total: ~$9.80/month ← Still incredibly cheap
Scenario 3: High Traffic (At Scale)
Requests: 100,000,000/month (100M)
Memory: 1,024 MB
Avg time: 300ms
Arch: x86
Request cost: (100M - 1M) × $0.20/1M = $19.80
Duration: 100M × 0.3s × 1.0 GB = 30,000,000 GB-s
(30M - 400K) × $0.0000166667 = $493.33
API Gateway: 100M × $1.00/1M = $100.00 ← DON'T FORGET THIS!
Total: ~$613/month ← vs ~$150/month for 2× c6g.xlarge EC2
(EC2 wins at steady high-throughput)
Scenario 4: Where Serverless Gets Expensive
Requests: 1,000,000,000/month (1B)
Memory: 2,048 MB
Avg time: 500ms
Arch: x86
Request cost: (1B - 1M) × $0.20/1M = $199.80
Duration: 1B × 0.5s × 2.0 GB = 1,000,000,000 GB-s
          (1B - 400K) × $0.0000166667 = $16,660.03
API Gateway: 1B × $1.00/1M = $1,000.00
Total: ~$17,860/month ← At this scale, use ECS/EKS/EC2
Equivalent EC2: ~$2,000-3,000/month
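The four scenarios can be reproduced with a small calculator. This is a sketch using the rates from the pricing table above (request and duration charges only; API Gateway is billed separately, and the function name is mine):

```python
def lambda_monthly_cost(requests, memory_mb, avg_seconds, arm=False,
                        free_requests=1_000_000, free_gb_s=400_000):
    """Approximate monthly Lambda bill in USD (requests + duration)."""
    per_gb_s = 0.0000133334 if arm else 0.0000166667
    request_cost = max(requests - free_requests, 0) / 1_000_000 * 0.20
    gb_seconds = requests * avg_seconds * (memory_mb / 1024)
    duration_cost = max(gb_seconds - free_gb_s, 0) * per_gb_s
    return round(request_cost + duration_cost, 2)

print(lambda_monthly_cost(500_000, 256, 0.1, arm=True))     # 0.0  (Scenario 1)
print(lambda_monthly_cost(10_000_000, 512, 0.2, arm=True))  # 9.8  (Scenario 2)
```

Plugging in your own traffic profile before committing to an architecture is cheap insurance against the Scenario 4 surprise.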
Limitations & Challenges
Hard Limits
| Limit | Value | Impact |
|---|---|---|
| Max execution time | 15 minutes (Lambda) | No long-running processes, batch jobs need chunking |
| Max concurrent executions | 1,000 default (account-level) | Shared across ALL functions — can starve other functions |
| Payload size (sync) | 6 MB request/response | Large files must go through S3 |
| Payload size (async) | 256 KB | Pass S3 references, not data |
| /tmp storage | 512 MB default (configurable up to 10 GB) | Ephemeral, shared across warm invocations |
| Environment variables | 4 KB total | Use SSM Parameter Store or Secrets Manager for large configs |
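The 256 KB async payload cap is easy to trip. A common pattern is to inline small payloads and fall back to passing an S3 reference for large ones; below is a sketch of just the decision logic (the bucket, key, and function names are illustrative, and the actual S3 upload is elided):

```python
import json

ASYNC_PAYLOAD_LIMIT = 256 * 1024  # bytes — async invocation cap

def prepare_payload(data, bucket="my-overflow-bucket", key="payloads/123.json"):
    """Inline small payloads; replace large ones with an S3 pointer."""
    raw = json.dumps(data)
    if len(raw.encode("utf-8")) <= ASYNC_PAYLOAD_LIMIT:
        return {"inline": data}
    # In a real system: s3.put_object(Bucket=bucket, Key=key, Body=raw)
    return {"s3_ref": {"bucket": bucket, "key": key}}

print(prepare_payload({"id": 1}))  # {'inline': {'id': 1}}
print(prepare_payload({"blob": "x" * 300_000}))  # falls back to the S3 reference
```

The receiving function then checks for the s3_ref key and fetches the body itself, keeping the event under the limit.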
Operational Challenges
- Debugging complexity: No SSH into a running function. Distributed tracing (X-Ray, Datadog) becomes essential. Reproducing issues locally requires tools like SAM Local or LocalStack.
- Vendor lock-in: Event source mappings, IAM policies, Lambda layers, and Step Functions are deeply AWS-specific. Migrating to GCP or Azure means rewriting significant infrastructure code. The Serverless Framework and Terraform mitigate this somewhat, but the abstractions leak.
- Testing difficulties: Unit testing the handler is easy, but integration testing with real event sources is hard. Local emulation (SAM, Serverless Offline) only approximates the real environment.
- Observability gaps: Traditional APM tools struggle with ephemeral containers. You need Lambda-aware tools: AWS X-Ray for tracing, CloudWatch Lambda Insights for metrics, structured JSON logging with correlation IDs.
- Statelessness: Every invocation starts fresh (ignoring warm container reuse). Workflows requiring state need external storage (DynamoDB, Redis, Step Functions).
- Timeout cliff: If a function approaches its timeout, there's no graceful shutdown. Use context.get_remaining_time_in_millis() to checkpoint before the timeout hits.
# Defensive timeout handling
def handler(event, context):
    items = get_batch_items()
    results = []
    for item in items:
        # Check if we have enough time remaining (leave a 5s buffer)
        remaining_ms = context.get_remaining_time_in_millis()
        if remaining_ms < 5000:
            # Save progress and re-enqueue remaining items
            save_checkpoint(results)
            requeue_remaining(items[len(results):])
            return {
                'statusCode': 202,
                'body': json.dumps({
                    'processed': len(results),
                    'remaining': len(items) - len(results),
                    'status': 'partial — re-queued'
                })
            }
        results.append(process_item(item))
    return {'statusCode': 200, 'body': json.dumps(results)}
When to Use Serverless
✓ Ideal Use Cases
| Use Case | Why Serverless Excels | Example |
|---|---|---|
| Event processing | Natural fit for event-driven model, auto-scales with event volume | S3 upload → resize image → store metadata |
| Webhooks | Sporadic traffic, pay nothing when idle | GitHub/Stripe/Twilio webhook handlers |
| Scheduled tasks | Replaces cron servers — no instance running 24/7 for a 5-minute job | Nightly reports, data cleanup, health checks |
| APIs with variable traffic | Scales from 0 to thousands of concurrent requests, back to 0 | Startup MVP, internal tools, seasonal apps |
| Data transformation | Parallel processing of streaming data | Kinesis → Lambda → Elasticsearch ingestion |
| Chatbots & IoT | Bursty, unpredictable traffic patterns | Alexa skills, IoT rule actions |
| Prototyping & MVPs | Zero infrastructure cost until you have users, rapid iteration | API + DynamoDB + S3 — full stack in serverless.yml |
✗ When NOT to Use Serverless
| Anti-Pattern | Why It Fails | Better Alternative |
|---|---|---|
| Long-running processes | 15-min max execution time. Video transcoding, ML training, and large batch jobs time out. | ECS Fargate tasks, AWS Batch, EC2 |
| Latency-sensitive (<10ms) | Cold starts add 100ms–6s of latency. Even provisioned concurrency adds overhead vs bare metal. | EC2, EKS with pod pre-scaling |
| High-throughput steady workloads | At 100M+ requests/month with consistent load, per-invocation billing is 5-10× more expensive than reserved capacity. | ECS/EKS with auto-scaling, reserved EC2 |
| WebSocket/persistent connections | Stateless execution model doesn't support long-lived connections natively. API Gateway WebSocket exists but is awkward. | ECS with Socket.io, dedicated WebSocket servers |
| Complex stateful workflows | Forcing state management through DynamoDB + Step Functions adds complexity that a simple server avoids. | Temporal/Cadence on ECS, traditional servers |
| Heavy local computation | Max 10 GB RAM, 6 vCPUs. Large-scale data processing, ML inference on large models, and GPU workloads are out. | EC2 with GPUs, SageMaker, EMR |
Decision Framework
Should I use Serverless?

1. Is execution time < 15 minutes?
   NO  → Use containers (ECS/EKS) or EC2
   YES ↓
2. Is traffic variable/spiky/unpredictable?
   YES → Strong serverless candidate ✓
   NO  ↓
3. Do you need sub-10ms latency consistently?
   YES → Use containers or bare metal
   NO  ↓
4. Is monthly request volume < 50M?
   YES → Serverless is almost certainly cheaper ✓
   NO  ↓
5. Is engineering time more valuable than compute cost?
   YES → Serverless (less ops overhead) ✓
   NO  → Containers with reserved pricing
6. Are you locked into AWS already?
   YES → Lambda is a natural extension ✓
   NO  → Consider portability (containers are more portable)
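The checklist encodes naturally as a function. This is a sketch of the heuristic only (the function and parameter names are mine), not a substitute for actually modeling your costs:

```python
def recommend(max_runtime_min, spiky_traffic, needs_sub_10ms,
              monthly_requests, ops_time_scarce, on_aws):
    """Rough serverless-vs-containers heuristic, in checklist order."""
    if max_runtime_min >= 15:
        return "containers"   # exceeds the Lambda execution limit
    if spiky_traffic:
        return "serverless"   # scale-to-zero pays off
    if needs_sub_10ms:
        return "containers"   # cold starts break the latency budget
    if monthly_requests < 50_000_000:
        return "serverless"   # under the typical cost crossover
    if ops_time_scarce or on_aws:
        return "serverless"
    return "containers"

print(recommend(5, True, False, 2_000_000, True, True))        # serverless
print(recommend(30, False, False, 500_000_000, False, False))  # containers
```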
Real-World Serverless Patterns
Pattern 1: API Gateway + Lambda + DynamoDB (REST API)
Client → API Gateway → Lambda → DynamoDB
              ↕                    ↕
        Auth (Cognito)        DAX (cache)
# Characteristics:
# - Scales to millions of requests
# - Costs $0 at zero traffic
# - Sub-second cold starts with Node.js/Python
# - DynamoDB provides single-digit ms latency
Pattern 2: Fan-Out Processing
S3 Upload → Lambda (dispatcher) → SNS Topic
                                    ├→ Lambda: generate thumbnail
                                    ├→ Lambda: extract EXIF metadata
                                    ├→ Lambda: run content moderation
                                    └→ Lambda: update search index
# Each downstream Lambda runs in parallel
# Total processing time = max(individual times)
# Not sum(individual times)
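The latency claim is easy to demonstrate: with a thread pool standing in for parallel Lambda invocations, wall-clock time tracks the slowest branch, not the total. The sleep durations below are purely illustrative:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def branch(seconds):
    """Stand-in for one downstream Lambda (thumbnail, EXIF, moderation...)."""
    time.sleep(seconds)
    return seconds

durations = [0.1, 0.3, 0.2]

start = time.monotonic()
with ThreadPoolExecutor() as pool:
    results = list(pool.map(branch, durations))
parallel_wall = time.monotonic() - start

# Wall clock is ~max(durations) = 0.3s, well under sum(durations) = 0.6s
print(f"parallel: {parallel_wall:.2f}s, sequential would be {sum(durations):.1f}s")
```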
Pattern 3: Event Sourcing with DynamoDB Streams
App → DynamoDB (writes) → DynamoDB Stream → Lambda
                                              ├→ Update Elasticsearch
                                              ├→ Invalidate cache
                                              ├→ Send notification
                                              └→ Replicate to analytics DB
# DynamoDB Streams guarantee ordering per partition key
# Lambda processes in batches (configurable 1-10,000 records)
# Exactly-once processing with idempotency keys
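Because stream delivery is at-least-once, the same record can arrive twice; an idempotency key makes the second application a harmless no-op. An in-memory sketch of the idea (production would use something durable, e.g. a DynamoDB conditional write, instead of a Python set):

```python
processed = set()  # production: DynamoDB table + conditional PutItem

def apply_once(record):
    """Apply a change event at most once per idempotency key."""
    key = record["idempotency_key"]
    if key in processed:
        return "skipped"  # duplicate delivery — safe no-op
    processed.add(key)
    # ... side effects (update search index, send notification) go here ...
    return "applied"

event = {"idempotency_key": "order-42:v3"}
print(apply_once(event))  # applied
print(apply_once(event))  # skipped
```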
Pattern 4: CQRS with Serverless
# Write path (commands)
API Gateway → Lambda → DynamoDB (write model)
                          ↓ (Stream)
                        Lambda → Elasticsearch (read model)

# Read path (queries)
API Gateway → Lambda → Elasticsearch
          or
API Gateway → Lambda → DynamoDB (if simple key-value lookups)

# Different scaling, different data models for reads vs writes
Serverless vs Containers vs VMs
| Dimension | Lambda (Serverless) | ECS Fargate (Containers) | EC2 (VMs) |
|---|---|---|---|
| Scaling speed | Milliseconds (per request) | 30-90 seconds | 2-5 minutes |
| Scale to zero | Yes — $0 at idle | Yes (with scale-to-zero config) | No — minimum 1 instance |
| Max execution | 15 min | Unlimited | Unlimited |
| Ops burden | Near zero | Low (still need task definitions, networking) | High (patching, AMIs, capacity) |
| Cost at low traffic | Cheapest (free tier covers most) | Moderate | Most expensive (always on) |
| Cost at high traffic | Most expensive | Moderate | Cheapest (reserved instances) |
| Portability | Lowest (vendor-specific) | High (Docker is portable) | Highest (any cloud or on-prem) |
Key Takeaways
- Serverless = FaaS + BaaS. Functions for custom logic, managed services for everything else. The goal is to write only business logic and let the cloud handle the rest.
- Cold starts are real but manageable. Choose lightweight runtimes (Node.js, Python, Go, Rust), minimize package size, use provisioned concurrency for latency-sensitive paths, and SnapStart for Java.
- Cost is non-linear. Serverless is nearly free at low scale and becomes expensive at high steady throughput. The crossover point is typically 10-50M requests/month — model your costs before committing.
- Event-driven is the natural model. Serverless shines when functions react to events (uploads, queue messages, database changes). If you're fighting to make it fit a synchronous, stateful, long-running workload, you're using the wrong tool.
- Vendor lock-in is the biggest hidden cost. Lambda, Step Functions, EventBridge, and DynamoDB are deeply intertwined. Use infrastructure-as-code (Terraform, CDK) and keep business logic portable in separate modules.
- Orchestrate, don't chain. Use Step Functions for multi-step workflows instead of Lambda-calling-Lambda. Step Functions handle retries, timeouts, branching, and parallel execution with built-in visibility.
In the next post, we explore Data Pipelines — how to build reliable, scalable systems for moving and transforming data at scale, often using serverless components as building blocks.