Proxies: Forward, Reverse, API Gateway
Nearly every request in a modern system travels through at least one proxy before reaching its destination. Proxies are the traffic controllers of distributed systems — they sit between clients and servers, intercepting requests to add security, improve performance, and simplify architecture. Whether you’re designing a CDN, building a microservices platform, or answering a system design interview, understanding proxies is non-negotiable.
This post covers every proxy type you’ll encounter: forward proxies, reverse proxies, API gateways, sidecar proxies, and full service meshes. We’ll walk through real Nginx configs, Kong plugin setups, Envoy sidecar patterns, and Istio architecture — with interactive animations to solidify each concept.
What is a Proxy?
A proxy (from Latin procurator, “one who acts on behalf of another”) is an intermediary server that sits between a client and a destination server. Instead of the client talking directly to the server, the client talks to the proxy, and the proxy forwards the request onward.
Why do proxies exist?
Proxies exist because direct client-to-server communication lacks control. In the real world, you need to:
- Security — Hide internal server topology, terminate TLS, enforce authentication
- Performance — Cache responses, compress payloads, load balance across backends
- Privacy — Mask client IP addresses, filter content, enforce policies
- Observability — Log requests, collect metrics, trace distributed calls
- Abstraction — Present a single endpoint for multiple backend services
Proxy vs Gateway vs Load Balancer
These terms overlap significantly. Here’s how they relate:
| Concept | Primary Role | OSI Layer | Example |
|---|---|---|---|
| Forward Proxy | Client-side intermediary | L7 (HTTP) | Squid, corporate proxy |
| Reverse Proxy | Server-side intermediary | L7 (HTTP) | Nginx, HAProxy |
| Load Balancer | Distribute traffic to backends | L4/L7 | AWS ALB/NLB, F5 |
| API Gateway | Single entry for microservices | L7 (HTTP) | Kong, AWS API GW |
| Sidecar Proxy | Per-service network agent | L4/L7 | Envoy, Linkerd-proxy |
A reverse proxy is a load balancer if it distributes traffic. An API gateway is a reverse proxy with extra features. A sidecar proxy is a reverse proxy running per-service. The distinctions are about scope and responsibility, not fundamental architecture.
Forward Proxy
A forward proxy sits on the client side. The client explicitly configures its network stack to send requests through the proxy, and the proxy forwards them to the internet on the client’s behalf. The destination server sees the proxy’s IP address, not the client’s.
Key characteristics
- Client awareness: The client knows it’s using a proxy (configured in browser/OS settings, or via the HTTP_PROXY environment variable)
- Server ignorance: The destination server has no idea a proxy is involved; it just sees a normal HTTP request from the proxy’s IP
- Outbound direction: Forward proxies control outbound traffic from a private network to the public internet
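That client-side configuration is easy to see in code. A minimal sketch using Python’s standard library (the proxy address is hypothetical), doing programmatically what the HTTP_PROXY environment variable does globally:

```python
import urllib.request

# Route HTTP(S) traffic through a forward proxy (hypothetical address);
# equivalent to exporting HTTP_PROXY / HTTPS_PROXY in the environment.
proxy_handler = urllib.request.ProxyHandler({
    "http": "http://proxy.corp.internal:3128",
    "https": "http://proxy.corp.internal:3128",
})
opener = urllib.request.build_opener(proxy_handler)

# Requests made through this opener go to the proxy, which forwards them
# on the client's behalf; the destination only ever sees the proxy's IP.
# opener.open("http://example.com")
```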
Use cases
🔒 Privacy / Anonymity
Mask your IP address. The server sees the proxy’s IP, not yours. VPN services are essentially encrypted forward proxies.
🚫 Content Filtering
Block access to social media, gambling, or malware sites. Schools and enterprises use this heavily.
🏢 Corporate Networks
All employee traffic goes through the proxy for monitoring, compliance, and bandwidth management.
📦 Caching
Cache frequently accessed static resources (JS, CSS, images) to reduce bandwidth and improve latency.
Forward proxy with Squid (config example)
# /etc/squid/squid.conf
http_port 3128
# Allow only internal network
acl internal_network src 10.0.0.0/8
http_access allow internal_network
http_access deny all
# Cache settings
cache_dir ufs /var/spool/squid 10000 16 256
maximum_object_size 100 MB
# Block social media
acl blocked_sites dstdomain .facebook.com .twitter.com .tiktok.com
http_access deny blocked_sites
# Logging
access_log /var/log/squid/access.log squid
Forward proxy vs VPN
| Feature | Forward Proxy | VPN |
|---|---|---|
| OSI Layer | Application (L7) | Network (L3) |
| Encryption | Optional (HTTPS CONNECT) | Always (IPsec/WireGuard) |
| Scope | HTTP/HTTPS traffic only | All traffic (TCP, UDP, etc.) |
| Client config | Browser/app proxy settings | OS-level tunnel |
| Performance | Faster (less overhead) | Slightly slower (encryption) |
| Use case | Web filtering, caching | Full network privacy |
Reverse Proxy
A reverse proxy sits on the server side. Clients on the internet send requests to the reverse proxy’s public IP, and the proxy forwards them to the appropriate backend server. The client has no idea which backend actually handled the request.
Key characteristics
- Client ignorance: The client has no idea a proxy is involved. It just sends a request to api.example.com.
- Server awareness: Backend servers know they’re behind a proxy (they receive X-Forwarded-For headers with the real client IP).
- Inbound direction: Reverse proxies control inbound traffic from the public internet to private backend servers.
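Recovering the real client IP from X-Forwarded-For is a common backend task, and a subtle one: each proxy appends the peer address it received the request from, so the header reads "client, proxy1, proxy2", and anything left of the entries added by proxies you control can be spoofed. A sketch (the trusted_hops parameter is my own illustration, not a standard name):

```python
def client_ip(xff_header: str, trusted_hops: int = 1) -> str:
    """Recover the client IP from an X-Forwarded-For header.

    Only the last `trusted_hops` entries were appended by proxies we
    control; the entry immediately before them is the best estimate of
    the real client. Entries further left are client-supplied and
    cannot be trusted.
    """
    hops = [h.strip() for h in xff_header.split(",")]
    return hops[-(trusted_hops + 1)] if len(hops) > trusted_hops else hops[0]

print(client_ip("203.0.113.7, 10.0.1.1", trusted_hops=1))  # → 203.0.113.7
```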
Use cases in depth
⚖️ Load Balancing
Distribute requests across multiple backends using round-robin, least-connections, IP hash, or weighted algorithms.
🔒 SSL/TLS Termination
Handle expensive TLS handshakes at the proxy, so backends only deal with plain HTTP internally.
📦 Response Caching
Cache static assets and API responses at the proxy. For read-heavy workloads, a good cache can absorb the large majority of requests before they ever reach a backend.
🗜️ Compression
Gzip or Brotli compress responses before sending to clients, often shrinking text-based payloads (JSON, HTML, CSS) by more than half.
🛡️ Security Shield
Hide internal topology, block malicious requests, rate limit by IP, add security headers (CORS, CSP, HSTS).
🌐 A/B Testing & Canary
Route a percentage of traffic to a new version while the rest goes to the stable version.
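Two of the load-balancing algorithms mentioned above are easy to sketch. A minimal, illustrative Python version (not how Nginx implements them internally, just the selection logic):

```python
import itertools

class RoundRobin:
    """Cycle through backends in a fixed order."""
    def __init__(self, backends):
        self._cycle = itertools.cycle(backends)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Pick the backend with the fewest in-flight requests."""
    def __init__(self, backends):
        self.active = {b: 0 for b in backends}

    def pick(self):
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1   # a request starts on this backend
        return backend

    def release(self, backend):
        self.active[backend] -= 1   # the request finished

rr = RoundRobin(["s1", "s2", "s3"])
print([rr.pick() for _ in range(4)])   # → ['s1', 's2', 's3', 's1']

lc = LeastConnections(["s1", "s2"])
a = lc.pick()       # s1 (tie, first wins)
b = lc.pick()       # s2 (s1 now has one in flight)
lc.release(a)       # s1's request completes
print(lc.pick())    # → s1: it now has the fewest connections
```

Least-connections tends to beat round-robin when request durations vary widely, which is why the Nginx config below uses `least_conn` for the API pool.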
Nginx as a reverse proxy — complete config
# /etc/nginx/nginx.conf
worker_processes auto;
events {
worker_connections 4096;
multi_accept on;
}
http {
# ── Upstream backend pool ──
upstream api_backend {
least_conn; # Least-connections algorithm
server 10.0.1.1:8080 weight=3; # 3x traffic
server 10.0.1.2:8080 weight=2; # 2x traffic
server 10.0.1.3:8080 weight=1; # 1x traffic
server 10.0.1.4:8080 backup; # Only used when others are down
keepalive 64; # Persistent connections to backends
}
# ── Response caching ──
proxy_cache_path /var/cache/nginx levels=1:2
keys_zone=api_cache:10m max_size=1g
inactive=60m use_temp_path=off;
# ── Gzip compression ──
gzip on;
gzip_types application/json text/plain text/css application/javascript;
gzip_min_length 256;
gzip_comp_level 5;
# ── Rate limiting ──
limit_req_zone $binary_remote_addr zone=api_limit:10m rate=100r/s;
server {
listen 443 ssl http2;
server_name api.example.com;
# ── TLS termination ──
ssl_certificate /etc/ssl/certs/api.example.com.pem;
ssl_certificate_key /etc/ssl/private/api.example.com.key;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256;
ssl_prefer_server_ciphers on;
# ── Security headers ──
add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
add_header X-Content-Type-Options "nosniff" always;
add_header X-Frame-Options "DENY" always;
# ── API routes ──
location /api/ {
limit_req zone=api_limit burst=50 nodelay;
proxy_pass http://api_backend;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# Connection reuse
proxy_http_version 1.1;
proxy_set_header Connection "";
# Timeouts
proxy_connect_timeout 5s;
proxy_send_timeout 30s;
proxy_read_timeout 30s;
# Caching GET responses
proxy_cache api_cache;
proxy_cache_valid 200 10m;
proxy_cache_valid 404 1m;
proxy_cache_use_stale error timeout updating;
add_header X-Cache-Status $upstream_cache_status;
}
# ── Static assets ──
location /static/ {
root /var/www;
expires 30d;
add_header Cache-Control "public, immutable";
}
# ── Health check endpoint ──
location /health {
access_log off;
return 200 "OK";
}
}
# ── HTTP → HTTPS redirect ──
server {
listen 80;
server_name api.example.com;
return 301 https://$host$request_uri;
}
}
HAProxy as a reverse proxy
# /etc/haproxy/haproxy.cfg
global
maxconn 50000
log stdout format raw local0
defaults
mode http
timeout connect 5s
timeout client 30s
timeout server 30s
option httplog
option forwardfor # Adds X-Forwarded-For header
frontend http_front
bind *:443 ssl crt /etc/ssl/api.pem
default_backend api_servers
# Route based on path
acl is_api path_beg /api/
acl is_ws path_beg /ws/
use_backend api_servers if is_api
use_backend ws_servers if is_ws
backend api_servers
balance leastconn
option httpchk GET /health
server s1 10.0.1.1:8080 check weight 3
server s2 10.0.1.2:8080 check weight 2
server s3 10.0.1.3:8080 check weight 1
backend ws_servers
balance source # Sticky sessions for WebSocket
server ws1 10.0.2.1:8080 check
server ws2 10.0.2.2:8080 check
Forward vs Reverse proxy: summary
| Dimension | Forward | Reverse |
|---|---|---|
| Sits on | Client side | Server side |
| Client knows? | Yes (configured) | No (transparent) |
| Server knows? | No | Yes (via X-Forwarded-For) |
| Hides | Client identity | Server identity & topology |
| Direction | Outbound (private → internet) | Inbound (internet → private) |
| Common use | Privacy, filtering, caching | Load balancing, TLS, security |
| Examples | Squid, corporate proxy | Nginx, HAProxy, Caddy |
API Gateway
An API Gateway is a reverse proxy on steroids. It serves as the single entry point for all client requests in a microservices architecture, handling cross-cutting concerns that would otherwise be duplicated across every service.
Core features
🔐 Authentication
Validate JWT tokens, OAuth2 flows, or API keys at the gateway. Services never handle auth themselves.
⚡ Rate Limiting
Enforce per-user, per-IP, or per-API-key rate limits. Protect backends from abuse and DDoS.
🔌 Request Routing
Route /api/users/* to User Service, /api/orders/* to Order Service. Path-based, header-based, or weighted.
🔄 Transformation
Transform requests (add headers, rewrite paths) and responses (filter fields, rename keys) without touching service code.
📋 Response Aggregation
Combine responses from multiple services into a single response. Mobile app gets one API call instead of five.
💥 Circuit Breaking
Stop sending traffic to a failing service. Return cached or fallback responses instead of cascading failures.
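The circuit-breaking behavior described above has three states: closed (traffic flows), open (fail fast), and half-open (let a probe through after a cooldown). A minimal sketch, assuming a simple consecutive-failure threshold rather than the error-rate windows real gateways use:

```python
import time

class CircuitBreaker:
    """Open after `threshold` consecutive failures; half-open after
    `cooldown` seconds; close again on the next success."""
    def __init__(self, threshold=5, cooldown=30.0, clock=time.monotonic):
        self.threshold, self.cooldown, self.clock = threshold, cooldown, clock
        self.failures, self.opened_at = 0, None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True                                   # closed
        if self.clock() - self.opened_at >= self.cooldown:
            return True                                   # half-open: one probe
        return False                                      # open: serve fallback

    def record_success(self):
        self.failures, self.opened_at = 0, None

    def record_failure(self):
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()

cb = CircuitBreaker(threshold=3, cooldown=30)
for _ in range(3):
    cb.record_failure()
print(cb.allow())   # → False: circuit is open, return a fallback instead
```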
API Gateway vs simple reverse proxy
| Feature | Reverse Proxy (Nginx) | API Gateway (Kong/Envoy) |
|---|---|---|
| Request routing | ✔ Path-based | ✔ Path, header, method, query param |
| Load balancing | ✔ Round-robin, least-conn | ✔ + Consistent hashing, canary |
| TLS termination | ✔ | ✔ |
| Authentication | ❌ Basic only | ✔ JWT, OAuth2, OIDC, API keys |
| Rate limiting | ✔ Basic (req/s per IP) | ✔ Per-user, per-route, sliding window |
| Request transformation | ❌ Limited header rewrite | ✔ Full body/header/query transformation |
| Response aggregation | ❌ | ✔ Combine multiple service responses |
| Circuit breaking | ❌ | ✔ With fallback responses |
| Plugin ecosystem | Limited modules | Rich plugin marketplace |
| Admin API | Config file reload | REST API for dynamic config |
| Observability | Access logs | Distributed tracing, Prometheus metrics |
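The per-user rate limiting in that table is often implemented as a token bucket (Kong’s built-in plugin actually uses fixed or sliding windows; the bucket below is a generic sketch of the idea, with one bucket per user or API key):

```python
import time

class TokenBucket:
    """Refill `rate` tokens per second, allow bursts up to `capacity`."""
    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate, self.capacity, self.clock = rate, capacity, clock
        self.tokens, self.last = capacity, clock()

    def allow(self, cost: float = 1.0) -> bool:
        now = self.clock()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=100, capacity=50)   # 100 req/s, burst of 50
allowed = sum(bucket.allow() for _ in range(60))
print(allowed)   # roughly 50: the burst drains, then refill paces the rest
```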
Kong API Gateway — config example
# kong.yml — declarative configuration
_format_version: "3.0"
services:
- name: user-service
url: http://user-svc.internal:8080
routes:
- name: user-routes
paths: ["/api/v1/users"]
methods: ["GET", "POST", "PUT", "DELETE"]
strip_path: false
- name: order-service
url: http://order-svc.internal:8080
routes:
- name: order-routes
paths: ["/api/v1/orders"]
strip_path: false
plugins:
# ── Global JWT Authentication ──
- name: jwt
config:
claims_to_verify: ["exp"]
header_names: ["Authorization"]
# ── Rate limiting per consumer ──
- name: rate-limiting
config:
minute: 100
hour: 5000
policy: redis
redis_host: redis.internal
# ── Request size limiting ──
- name: request-size-limiting
config:
allowed_payload_size: 10 # MB
# ── Response transformation ──
- name: response-transformer
service: user-service
config:
remove:
headers: ["X-Internal-Id", "X-Debug-Info"]
add:
headers: ["X-Api-Version:v1"]
# ── Prometheus metrics ──
- name: prometheus
config:
per_consumer: true
status_code_metrics: true
# ── Circuit breaker (via custom plugin or upstream config) ──
consumers:
- username: mobile-app
jwt_secrets:
- key: mobile-app-key
algorithm: HS256
secret: "{vault://env/MOBILE_JWT_SECRET}"
- username: partner-api
jwt_secrets:
- key: partner-key
algorithm: RS256
rsa_public_key: "{vault://env/PARTNER_PUBLIC_KEY}"
Popular API gateways compared
| Gateway | Language | Plugin Model | Best For |
|---|---|---|---|
| Kong | Lua (OpenResty) | Lua/Go plugins | General purpose, enterprise |
| AWS API Gateway | Managed service | Lambda authorizers | AWS-native, serverless |
| Envoy | C++ | Wasm/Lua filters | High performance, service mesh |
| Traefik | Go | Middleware chain | Kubernetes-native, auto-discovery |
| APISIX | Lua (OpenResty) | Lua/Wasm plugins | High throughput, etcd config |
| Zuul 2 | Java | Filters | Netflix/Spring ecosystem |
Sidecar Proxy
A sidecar proxy is a small proxy process deployed alongside every service instance. Instead of a single centralized gateway, every service gets its own personal proxy that handles networking concerns. The application code doesn’t even know the sidecar exists — it just sends requests to localhost.
What the sidecar handles
- Mutual TLS (mTLS) — Automatic certificate rotation, encryption between all services. Your app talks plain HTTP; the sidecar encrypts it.
- Retries with backoff — If a downstream call fails, the sidecar retries with exponential backoff + jitter. No retry logic in your code.
- Circuit breaking — The sidecar tracks failure rates and opens the circuit when a downstream is unhealthy.
- Load balancing — Client-side load balancing across multiple instances of a downstream service.
- Observability — Automatically emits metrics (latency, request count, error rate), distributed traces (Zipkin/Jaeger headers), and access logs.
- Traffic shaping — Canary deployments, A/B testing, fault injection for chaos engineering.
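The retry behavior the sidecar handles for you is usually full-jitter exponential backoff: wait a random amount in [0, min(cap, base·2ⁿ)) before attempt n+1. A sketch (function name and defaults are illustrative, not any mesh’s API):

```python
import random
import time

def retry_with_backoff(call, max_attempts=3, base=0.1, cap=2.0,
                       sleep=time.sleep, rng=random.random):
    """Retry `call`, sleeping a random full-jitter backoff between tries."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise                      # budget exhausted: surface the error
            sleep(rng() * min(cap, base * 2 ** attempt))

attempts = []
def flaky():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("upstream 503")
    return "ok"

result = retry_with_backoff(flaky, sleep=lambda s: None)  # no real sleeping
print(result)   # → ok (succeeded on the third attempt)
```

The jitter matters: without it, many clients that failed together retry together, hammering the recovering service in synchronized waves.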
Envoy as a sidecar — config snippet
# envoy.yaml — sidecar configuration
static_resources:
listeners:
- name: inbound
address:
socket_address: { address: 0.0.0.0, port_value: 15006 }
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
stat_prefix: inbound
route_config:
virtual_hosts:
- name: local_service
domains: ["*"]
routes:
- match: { prefix: "/" }
route: { cluster: local_app }
http_filters:
- name: envoy.filters.http.router
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
- name: outbound
address:
socket_address: { address: 0.0.0.0, port_value: 15001 }
filter_chains:
- filters:
- name: envoy.filters.network.http_connection_manager
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.network.http_connection_manager.v3.HttpConnectionManager
stat_prefix: outbound
route_config:
virtual_hosts:
- name: order_service
domains: ["order-svc.internal"]
routes:
- match: { prefix: "/" }
route:
cluster: order_service
retry_policy:
retry_on: "5xx,connect-failure"
num_retries: 3
per_try_timeout: 2s
http_filters:
- name: envoy.filters.http.router
typed_config:
"@type": type.googleapis.com/envoy.extensions.filters.http.router.v3.Router
clusters:
- name: local_app
connect_timeout: 1s
type: STATIC
load_assignment:
cluster_name: local_app
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address: { address: 127.0.0.1, port_value: 8080 }
- name: order_service
connect_timeout: 5s
type: STRICT_DNS
lb_policy: ROUND_ROBIN
circuit_breakers:
thresholds:
- max_connections: 1024
max_pending_requests: 1024
max_retries: 3
load_assignment:
cluster_name: order_service
endpoints:
- lb_endpoints:
- endpoint:
address:
socket_address: { address: order-svc.internal, port_value: 8080 }
Service Mesh
A service mesh is what you get when you deploy sidecar proxies to every service in your system and connect them with a centralized control plane. It’s a dedicated infrastructure layer for handling service-to-service communication.
Data plane vs control plane
| Aspect | Data Plane | Control Plane |
|---|---|---|
| What is it? | All the sidecar proxies (Envoy instances) | Management software (Istiod) |
| Handles | Actual network traffic between services | Configuration, certificates, policies |
| On the request path? | Yes — every request passes through | No — only configures the data plane |
| Performance impact | ~1ms per hop latency | None on request latency |
| Failure mode | Sidecar dies → service can’t communicate | Control plane dies → sidecars keep working with last known config |
Istio components (unified as Istiod)
- Pilot — Converts high-level routing rules into Envoy-specific xDS configuration. Pushes config updates to all sidecars.
- Citadel — Certificate authority. Issues and rotates mTLS certificates for every service. Enables zero-trust networking.
- Galley — Configuration validation. Validates Istio config (VirtualService, DestinationRule) before pushing to Pilot.
Istio traffic management example
# VirtualService — canary deployment (90/10 split)
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
name: order-service
spec:
hosts:
- order-svc.internal
http:
- route:
- destination:
host: order-svc.internal
subset: stable
weight: 90
- destination:
host: order-svc.internal
subset: canary
weight: 10
retries:
attempts: 3
perTryTimeout: 2s
retryOn: "5xx,connect-failure"
timeout: 10s
---
# DestinationRule — define subsets + circuit breaker
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
name: order-service
spec:
host: order-svc.internal
trafficPolicy:
connectionPool:
tcp:
maxConnections: 1024
http:
h2UpgradePolicy: DEFAULT
maxRequestsPerConnection: 100
outlierDetection:
consecutive5xxErrors: 5
interval: 30s
baseEjectionTime: 30s
maxEjectionPercent: 50
subsets:
- name: stable
labels:
version: v1
- name: canary
labels:
version: v2
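Under the hood, the 90/10 split in that VirtualService reduces to a weighted random pick per request. A minimal sketch of the selection logic (not Envoy’s actual implementation):

```python
import random

def pick_subset(weights: dict, rng=random.random) -> str:
    """Weighted random routing: the per-request equivalent of a
    weight-90 / weight-10 traffic split."""
    r = rng() * sum(weights.values())
    for subset, weight in weights.items():
        r -= weight
        if r < 0:
            return subset
    return next(iter(weights))  # guard against float rounding at the edge

random.seed(7)  # seeded only to make the demo reproducible
sample = [pick_subset({"stable": 90, "canary": 10}) for _ in range(10_000)]
print(sample.count("canary") / len(sample))  # close to 0.10
```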
When to use a service mesh
A mesh earns its keep when:
- You have 10+ microservices with complex inter-service communication
- You need zero-trust security (mTLS everywhere)
- You want uniform observability without instrumenting every service
- You need advanced traffic management (canary, fault injection, mirroring)
- Multiple teams deploy independently and need consistent networking policies
Skip it when:
- You have fewer than 5 services; the operational complexity isn’t worth it
- Your team doesn’t have Kubernetes expertise
- Latency requirements are sub-millisecond (each sidecar hop adds ~1ms)
- A simpler API gateway already solves your problems
Service mesh landscape
| Mesh | Data Plane | Control Plane | Notes |
|---|---|---|---|
| Istio | Envoy | Istiod | Most popular, feature-rich, complex |
| Linkerd | linkerd2-proxy (Rust) | Custom | Simpler, lighter weight, k8s-only |
| Consul Connect | Envoy or built-in | Consul | Works outside k8s, HashiCorp ecosystem |
| Cilium | eBPF (kernel-level) | Custom | No sidecar needed, highest performance |
TLS/SSL Termination
TLS termination is the process of decrypting TLS-encrypted traffic at a specific point in your architecture. Where you choose to terminate TLS has significant implications for security, performance, and operational complexity.
Option 1: Terminate at the Load Balancer / Reverse Proxy
Client → [TLS] → Load Balancer → [plain HTTP] → Backend
Option 2: TLS Passthrough (terminate at the service)
Client → [TLS] → Load Balancer → [same TLS stream, untouched] → Backend (end-to-end encrypted)
Option 3: TLS Re-encryption (terminate + re-encrypt)
Client → [TLS, public cert] → LB → [TLS, internal cert] → Backend
Option 4: mTLS everywhere (service mesh)
Client → [TLS] → Gateway → [mTLS] → Service ↔ Service (mutual certificate verification)
TLS termination strategy comparison
| Strategy | Security | Performance | Simplicity | Best For |
|---|---|---|---|---|
| At LB (HTTP backend) | ★★ | ★★★★★ | ★★★★★ | Internal/trusted networks |
| TLS Passthrough | ★★★★★ | ★★★★ | ★★★ | Compliance (e.g., PCI DSS) |
| Re-encryption | ★★★★ | ★★★ | ★★ | Defense in depth |
| mTLS (service mesh) | ★★★★★ | ★★★ | ★ | Zero-trust microservices |
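The server half of mTLS, as Python’s ssl module expresses it, is essentially two lines on top of a normal TLS context: require a client certificate and trust only your internal CA. A sketch (the certificate paths are hypothetical, so loading is shown commented out):

```python
import ssl

# Server-side context for mTLS: present our own cert AND demand a valid
# client certificate signed by our internal CA.
ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
ctx.minimum_version = ssl.TLSVersion.TLSv1_2
ctx.verify_mode = ssl.CERT_REQUIRED   # reject clients without a valid cert

# Hypothetical paths; in a mesh, the sidecar's CA rotates these for you:
# ctx.load_cert_chain("service.pem", "service.key")    # our identity
# ctx.load_verify_locations("internal-ca.pem")         # CA that signs peers

print(ctx.verify_mode == ssl.CERT_REQUIRED)  # True
```

This is exactly what a sidecar does on every connection, which is why mTLS-everywhere is painful to retrofit by hand but nearly free with a mesh.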
Nginx TLS configuration (best practices)
server {
listen 443 ssl http2;
server_name api.example.com;
# Modern TLS config (Mozilla Intermediate)
ssl_certificate /etc/ssl/certs/fullchain.pem;
ssl_certificate_key /etc/ssl/private/privkey.pem;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_ciphers ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384;
ssl_prefer_server_ciphers off;
# OCSP stapling for faster TLS handshakes
ssl_stapling on;
ssl_stapling_verify on;
ssl_trusted_certificate /etc/ssl/certs/chain.pem;
resolver 8.8.8.8 8.8.4.4 valid=300s;
# Session resumption (reduces handshake overhead)
ssl_session_cache shared:SSL:10m;
ssl_session_timeout 1d;
ssl_session_tickets off; # Disable for perfect forward secrecy
# HSTS
add_header Strict-Transport-Security "max-age=63072000; includeSubDomains; preload" always;
location / {
proxy_pass http://backend;
}
}
Proxy in System Design Interviews
Proxies come up in almost every system design interview. Here’s exactly when and how to bring them up.
The proxy progression in interviews
| Architecture Scale | Proxy Layer | What to Say |
|---|---|---|
| Single server | None needed | “For now, one server is enough. As we scale, we’ll add a reverse proxy.” |
| Multiple servers | Reverse proxy / LB | “I’ll put Nginx/ALB in front for load balancing, TLS termination, and health checks.” |
| Microservices (3-10) | API Gateway | “An API gateway handles auth, rate limiting, and routing so each service doesn’t duplicate that logic.” |
| Many microservices (10+) | API Gateway + Service Mesh | “For internal communication, a service mesh with mTLS gives us zero-trust security and observability without code changes.” |
Common interview questions involving proxies
🔍 Design a URL Shortener
Reverse proxy for load balancing + caching the redirect lookups at the proxy layer.
🔍 Design Twitter
API gateway for auth + rate limiting. CDN (reverse proxy) for media. Service mesh for internal comms.
🔍 Design a Chat System
Reverse proxy with WebSocket support. Sticky sessions or connection-level load balancing.
🔍 Design an E-Commerce Platform
API gateway aggregates product + pricing + inventory into one response for the product page.
Key phrases for interviews
- “I’ll add a reverse proxy in front of the backend servers to handle TLS termination, load balancing, and health checks.”
- “The API gateway handles cross-cutting concerns: authentication, rate limiting, and request routing, so services stay focused on business logic.”
- “For service-to-service communication, we can use a service mesh with mTLS to get zero-trust security without modifying service code.”
- “We terminate TLS at the load balancer for simplicity, but if compliance requires end-to-end encryption, we’ll use TLS passthrough or mTLS.”
- “The gateway acts as a circuit breaker: if the payment service goes down, we return a cached response instead of cascading the failure.”
Key takeaways
- Forward proxy = client-side (privacy, filtering, outbound control)
- Reverse proxy = server-side (load balancing, TLS, caching, security)
- API gateway = reverse proxy + auth + rate limiting + transformation + aggregation
- Sidecar proxy = per-service proxy handling mTLS, retries, circuit breaking
- Service mesh = sidecar proxies everywhere + centralized control plane
- TLS termination = trade-off between security (end-to-end encryption) and simplicity (terminate at LB)
- In interviews, always mention the proxy layer when drawing architecture diagrams — it shows you think about real-world operations