Service Discovery & Registration
The Problem
In a monolithic application, one component calls another through a simple in-process function call or a well-known, hard-coded address. But in a microservices architecture, everything is dynamic. Containers spin up and down, auto-scaling groups resize, blue-green deployments swap addresses, and Kubernetes pods get new IPs every time they restart. Hard-coding service addresses simply doesn’t work.
Consider a shopping application with these microservices:
# Static configuration — breaks constantly
ORDER_SERVICE=http://10.0.2.15:8080
PAYMENT_SERVICE=http://10.0.2.16:8081
INVENTORY_SERVICE=http://10.0.2.17:8082
NOTIFICATION_SERVICE=http://10.0.2.18:8083
Any time a service crashes and restarts on a different node, any time auto-scaling adds a new instance, or any time a deployment replaces pods — those addresses are stale. Requests fail. Revenue is lost. On-call engineers get paged at 3 AM.
Service discovery solves this by providing a dynamic, real-time mechanism for services to find and communicate with each other. It answers the fundamental question: “What are the current network locations of healthy instances of Service X?”
A service discovery system has three core responsibilities:
- Registration: Services announce their presence (address, port, metadata) when they start.
- Discovery: Clients query for available instances of a target service.
- Health monitoring: Unhealthy instances are detected and removed from the registry.
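A deliberately minimal in-memory sketch makes these three responsibilities concrete. This is an illustration of the interface only (the class, key layout, and TTL value are hypothetical), not how any production registry is built:
# Toy in-memory service registry (illustrative sketch only)
import time

class ServiceRegistry:
    def __init__(self, ttl_seconds=30):
        self.ttl = ttl_seconds
        self.instances = {}   # {service: {instance_id: (address, last_heartbeat)}}

    def register(self, service, instance_id, address):
        """Registration: a starting instance announces itself."""
        self.instances.setdefault(service, {})[instance_id] = (address, time.time())

    def heartbeat(self, service, instance_id):
        """Health monitoring: instances periodically renew their lease."""
        address, _ = self.instances[service][instance_id]
        self.instances[service][instance_id] = (address, time.time())

    def discover(self, service):
        """Discovery: return only instances whose lease has not expired."""
        now = time.time()
        return [addr for addr, seen in self.instances.get(service, {}).values()
                if now - seen < self.ttl]
Production registries layer replication, consensus, and change notifications on top of this basic shape.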
There are two fundamental patterns for how discovery happens: client-side and server-side.
Client-Side Discovery
In the client-side discovery pattern, the client is responsible for querying the service registry and selecting an appropriate instance. The client has built-in load-balancing logic and connects directly to the chosen instance.
This is the pattern made famous by Netflix OSS: Eureka (service registry) + Ribbon (client-side load balancer).
How It Works
- When Service B starts, it registers itself with the registry (IP, port, health URL).
- Service B sends periodic heartbeats to the registry (every 30 seconds by default in Eureka).
- When Service A needs to call Service B, it queries the registry for all healthy instances of Service B.
- Service A’s built-in load balancer (Ribbon) picks one instance using a round-robin, weighted, or random strategy.
- Service A calls Service B directly — no intermediary.
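Before the Spring Cloud version, here is a minimal, language-agnostic sketch of steps 3 to 5 in Python. The registry URL and its /services/... endpoint are hypothetical placeholders, not a real registry API:
# Client-side discovery sketch: query the registry, pick an instance, call it
import itertools
import requests

def fetch_instances(registry_url, service_name):
    # Step 3: ask the registry for healthy instances of the target service
    resp = requests.get(f"{registry_url}/services/{service_name}/instances")
    resp.raise_for_status()
    return resp.json()   # e.g. [{"host": "10.0.1.15", "port": 8080}, ...]

# Step 4: simple round-robin over the (cached) instance list
instances = fetch_instances("http://registry:8500", "payment-service")
rotation = itertools.cycle(instances)

def call_payment_service(path):
    inst = next(rotation)
    # Step 5: call the chosen instance directly, with no intermediary
    return requests.get(f"http://{inst['host']}:{inst['port']}{path}")
The Spring Cloud Netflix stack packages the same logic into the Eureka client and Ribbon: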
// Spring Cloud Netflix — Client-Side Discovery
@SpringBootApplication
@EnableDiscoveryClient
public class OrderServiceApplication {
    public static void main(String[] args) {
        SpringApplication.run(OrderServiceApplication.class, args);
    }
}

# application.yml — registration config
eureka:
  client:
    serviceUrl:
      defaultZone: http://eureka-server:8761/eureka/
    registryFetchIntervalSeconds: 5
  instance:
    leaseRenewalIntervalInSeconds: 10
    leaseExpirationDurationInSeconds: 30
    instanceId: ${spring.application.name}:${random.value}
    metadataMap:
      version: v2.1.0
      region: us-east-1

// Using Ribbon for client-side load balancing
@Bean
@LoadBalanced
public RestTemplate restTemplate() {
    return new RestTemplate();
}

// Call by service name — Ribbon resolves to actual IP:port
String result = restTemplate
    .getForObject("http://payment-service/api/pay", String.class);
Pros and Cons
| Pros | Cons |
|---|---|
| No proxy hop — lower latency | Every client needs discovery library |
| Client can make smart routing decisions (zone-aware, version-aware) | Couples clients to the registry technology |
| No single point of failure from a load balancer | Discovery logic duplicated across every language/framework |
| Client can cache registry data, surviving brief registry outages | Harder to manage when you have polyglot services |
Server-Side Discovery
In server-side discovery, the client sends its request to a load balancer or router. The router is responsible for querying the service registry and forwarding the request to an appropriate instance. The client has no knowledge of the registry.
This is the pattern used by AWS Elastic Load Balancer (ELB), Kubernetes Services, and NGINX with service discovery plugins.
How It Works
- Service B instances register with the registry at startup.
- Service A sends a request to the load balancer (a well-known, stable address).
- The load balancer queries the registry for healthy Service B instances.
- The load balancer forwards the request to one of the instances.
- The response flows back through the load balancer to Service A.
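From the calling service's point of view, none of this machinery is visible: it talks to one stable address. A minimal sketch (the load-balancer hostname is a hypothetical example):
# Server-side discovery from the client's perspective
import requests

# The load balancer decides which payment-service instance handles the call;
# the client never talks to a registry.
response = requests.get("http://payment-lb.internal/api/pay")
On the infrastructure side, the Terraform below wires an AWS target group to an ECS service so instances register and deregister automatically: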
# AWS Application Load Balancer — target group with auto-discovery
resource "aws_lb_target_group" "payment" {
name = "payment-service-tg"
port = 8080
protocol = "HTTP"
vpc_id = aws_vpc.main.id
target_type = "ip"
health_check {
path = "/actuator/health"
interval = 15
timeout = 5
healthy_threshold = 2
unhealthy_threshold = 3
matcher = "200"
}
}
# ECS service auto-registers tasks with the target group
resource "aws_ecs_service" "payment" {
name = "payment-service"
cluster = aws_ecs_cluster.main.id
task_definition = aws_ecs_task_definition.payment.arn
desired_count = 3
load_balancer {
target_group_arn = aws_lb_target_group.payment.arn
container_name = "payment"
container_port = 8080
}
}
Pros and Cons
| Pros | Cons |
|---|---|
| Clients are simple — no discovery logic needed | Extra network hop through load balancer (adds latency) |
| Language-agnostic — any HTTP client works | Load balancer can become a bottleneck or SPOF |
| Centralized traffic management, TLS termination | More infrastructure to manage and scale |
| Easy to add cross-cutting concerns (rate limiting, auth) | Load balancer must be highly available itself |
Service Registry
The service registry is the central database of service instance locations. It is the heart of any discovery system. Every instance that starts must register, and every client must query it. The registry must be highly available, consistent (or at least eventually consistent), and fast.
Requirements
- High availability: If the registry is down, no service can discover any other. The registry must replicate across multiple nodes.
- Consistency: Stale entries (pointing to dead instances) cause failed requests. The registry must remove dead instances quickly.
- Low latency: Discovery queries happen on nearly every request (or at cache refresh). Sub-millisecond reads are essential.
- Scalability: Must handle thousands of services, each with hundreds of instances, sending heartbeats every few seconds.
etcd
etcd is a distributed key-value store that uses the Raft consensus algorithm for strong consistency. It is the backbone of Kubernetes — every cluster state (including service endpoints) is stored in etcd.
# Register a service instance in etcd
etcdctl put /services/payment-service/instances/i-001 \
  '{"host":"10.0.1.15","port":8080,"health":"/health","version":"v2.1"}'

# Set a TTL (lease) — instance must renew or be evicted
etcdctl lease grant 30                 # 30-second lease
# lease 694d7f0d43c3a01e granted with TTL(30s)

etcdctl put --lease=694d7f0d43c3a01e \
  /services/payment-service/instances/i-001 \
  '{"host":"10.0.1.15","port":8080}'

# Keep alive — service sends this periodically
etcdctl lease keep-alive 694d7f0d43c3a01e

# Discover all instances of payment-service
etcdctl get /services/payment-service/instances/ --prefix
#   /services/payment-service/instances/i-001
#   {"host":"10.0.1.15","port":8080}
#   /services/payment-service/instances/i-002
#   {"host":"10.0.1.16","port":8080}

# Watch for changes (real-time push)
etcdctl watch /services/payment-service/instances/ --prefix
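The same lease-based register, renew, and discover flow can be driven from application code. The sketch below assumes the python-etcd3 client library and reuses the same illustrative key names:
# Lease-backed registration and discovery (assumes the python-etcd3 package)
import etcd3

client = etcd3.client(host="etcd-server", port=2379)

# Register under a 30-second lease; the key vanishes if the lease expires
lease = client.lease(30)
client.put(
    "/services/payment-service/instances/i-001",
    '{"host":"10.0.1.15","port":8080}',
    lease=lease,
)

# Heartbeat: refresh the lease periodically from a background task
lease.refresh()

# Discover all current instances of payment-service
for value, meta in client.get_prefix("/services/payment-service/instances/"):
    print(meta.key.decode(), value.decode())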
Consul
HashiCorp Consul combines a service registry, health checking, and KV store. It uses gossip protocol (Serf) for membership and Raft for leader election and state replication. Consul is designed specifically for service discovery, unlike etcd which is a general-purpose KV store.
# consul-service.json — service registration config
{
  "service": {
    "name": "payment-service",
    "id": "payment-001",
    "port": 8080,
    "tags": ["v2.1", "production", "us-east-1"],
    "meta": {
      "version": "2.1.0",
      "protocol": "grpc"
    },
    "check": {
      "http": "http://localhost:8080/health",
      "interval": "10s",
      "timeout": "3s",
      "deregister_critical_service_after": "90s"
    }
  }
}
# Register via API
curl -X PUT http://consul-server:8500/v1/agent/service/register \
-d @consul-service.json
# Discover healthy instances via DNS
dig @consul-server -p 8600 payment-service.service.consul SRV
# ;; ANSWER SECTION:
# payment-service.service.consul. 0 IN SRV 1 1 8080 i-001.node.dc1.consul.
# payment-service.service.consul. 0 IN SRV 1 1 8080 i-002.node.dc1.consul.
# Discover via HTTP API
curl http://consul-server:8500/v1/health/service/payment-service?passing=true
# Returns JSON with all healthy instances, ports, metadata
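A client can consume that same health API directly. The sketch below assumes the documented response shape, where each entry carries a Service object with Address and Port (falling back to the node address when the service address is empty):
# Discover passing instances through Consul's HTTP health API (sketch)
import requests

def discover(service_name, consul="http://consul-server:8500"):
    resp = requests.get(
        f"{consul}/v1/health/service/{service_name}",
        params={"passing": "true"},   # only instances whose checks pass
    )
    resp.raise_for_status()
    return [
        (entry["Service"]["Address"] or entry["Node"]["Address"],
         entry["Service"]["Port"])
        for entry in resp.json()
    ]

print(discover("payment-service"))
# e.g. [('10.0.1.15', 8080), ('10.0.1.16', 8080)]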
ZooKeeper
Apache ZooKeeper uses the ZAB (ZooKeeper Atomic Broadcast) protocol for consensus. It was one of the earliest coordination services (originally built for Hadoop). Services register as ephemeral znodes that automatically disappear when the session ends (heartbeats stop).
# ZooKeeper service registration using ephemeral nodes
from kazoo.client import KazooClient

zk = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181')
zk.start()

# Create ephemeral sequential node — auto-deleted on disconnect
zk.create(
    "/services/payment-service/instance-",
    b'{"host":"10.0.1.15","port":8080,"version":"v2.1"}',
    ephemeral=True,   # disappears when session dies
    sequence=True,    # appends unique suffix: instance-0000000001
    makepath=True     # create the persistent parent path if missing
)

# Discover all instances
instances = zk.get_children("/services/payment-service")
for inst in instances:
    data, stat = zk.get(f"/services/payment-service/{inst}")
    print(f"{inst}: {data.decode()}")

# Watch for changes (real-time notification)
@zk.ChildrenWatch("/services/payment-service")
def watch_instances(children):
    print(f"Current instances: {children}")
    # Re-fetch data for each instance, update local cache
Netflix Eureka
Eureka is an AP system (in CAP terms) — it favors availability and partition tolerance over consistency. If a Eureka server loses connectivity, it enters self-preservation mode and stops evicting instances, preferring stale data to no data. This makes Eureka extremely resilient but means clients may occasionally get stale endpoints.
# Eureka server — application.yml
server:
  port: 8761

eureka:
  instance:
    hostname: eureka-server
  client:
    registerWithEureka: false
    fetchRegistry: false
  server:
    enableSelfPreservation: true
    renewalPercentThreshold: 0.85
    evictionIntervalTimerInMs: 60000
# Eureka REST API — query instances
# GET /eureka/apps/PAYMENT-SERVICE
# Returns XML/JSON with all registered instances:
# {
#   "application": {
#     "name": "PAYMENT-SERVICE",
#     "instance": [
#       {
#         "hostName": "10.0.1.15",
#         "port": {"$": 8080, "@enabled": "true"},
#         "status": "UP",
#         "metadata": {"version": "v2.1.0"}
#       }
#     ]
#   }
# }
Comparison
| Feature | etcd | Consul | ZooKeeper | Eureka |
|---|---|---|---|---|
| Consensus | Raft | Raft + Gossip | ZAB | Peer replication (AP) |
| CAP | CP | CP (default) | CP | AP |
| Data model | Flat KV | KV + Service catalog | Hierarchical znodes | Service-instance hierarchy |
| Health checking | TTL leases | HTTP, TCP, gRPC, script | Ephemeral nodes | Heartbeat (renew lease) |
| DNS support | No (use CoreDNS plugin) | Built-in DNS interface | No | No |
| Language | Go | Go | Java | Java |
| Primary use | Kubernetes, general KV | Service mesh, discovery | Hadoop, Kafka coordination | Spring Cloud microservices |
| Watch support | Yes (streaming) | Yes (blocking queries) | Yes (watchers) | Yes (polling) |
Health Checks
A registry full of dead instances is worse than no registry at all. Health checks are the mechanism that keeps the registry accurate by continuously verifying that registered instances are actually alive and capable of serving traffic.
Health Check Patterns
1. Heartbeat (Push-Based)
The service instance periodically sends a heartbeat (lease renewal) to the registry. If the registry misses N consecutive heartbeats, it deregisters the instance. Used by Eureka and etcd (TTL leases).
# Eureka heartbeat: instance sends PUT every 30 seconds
PUT /eureka/apps/PAYMENT-SERVICE/i-001
# 200 OK — lease renewed
# If missed for 90 seconds (3 intervals), instance is evicted
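In application code this is usually just a background loop that renews the lease. A minimal sketch mirroring the Eureka-style renewal call above (the URL layout is illustrative):
# Background heartbeat loop (illustrative sketch)
import threading
import time
import requests

def send_heartbeats(registry_url, app, instance_id, interval=30):
    while True:
        try:
            # Renew the lease; if renewals stop, the registry evicts the instance
            requests.put(f"{registry_url}/apps/{app}/{instance_id}", timeout=5)
        except requests.RequestException:
            pass   # registry temporarily unreachable; keep trying
        time.sleep(interval)

threading.Thread(
    target=send_heartbeats,
    args=("http://eureka-server:8761/eureka", "PAYMENT-SERVICE", "i-001"),
    daemon=True,
).start()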
2. HTTP Health Endpoint (Pull-Based)
The registry (or a health checker) periodically calls an HTTP endpoint on the service. This is more reliable than heartbeats because it verifies the service can actually handle requests, not just that the process is running.
# Spring Boot Actuator health endpoint
# GET http://payment-service:8080/actuator/health
{
  "status": "UP",
  "components": {
    "db": {
      "status": "UP",
      "details": {
        "database": "PostgreSQL",
        "validationQuery": "isValid()"
      }
    },
    "diskSpace": {
      "status": "UP",
      "details": {
        "total": 107374182400,
        "free": 85899345920,
        "threshold": 10485760
      }
    },
    "redis": {
      "status": "UP"
    }
  }
}
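The endpoint itself is typically a thin aggregation over individual component checks. A framework-agnostic sketch (the component check functions are hypothetical stand-ins for real probes):
# Aggregate component checks into a single health response (sketch)
def check_database():
    # e.g. run "SELECT 1" against the connection pool
    return {"status": "UP", "details": {"database": "PostgreSQL"}}

def check_redis():
    # e.g. send a PING and verify the reply
    return {"status": "UP"}

def health():
    components = {"db": check_database(), "redis": check_redis()}
    overall = "UP" if all(c["status"] == "UP" for c in components.values()) else "DOWN"
    return {"status": overall, "components": components}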
3. TCP Check
Simply attempts a TCP connection to the service port. If the connection succeeds, the service is considered healthy. Less thorough than HTTP checks but works for non-HTTP services (databases, message brokers).
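A TCP check is essentially connect and close; a minimal sketch:
# TCP health check: healthy if the port accepts a connection
import socket

def tcp_check(host, port, timeout=3.0):
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True    # port is accepting connections
    except OSError:
        return False       # refused, timed out, or unreachable

print(tcp_check("10.0.1.15", 5432))   # e.g. probe a PostgreSQL port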
4. gRPC Health Check
// gRPC Health Checking Protocol (standard)
syntax = "proto3";
package grpc.health.v1;
service Health {
rpc Check(HealthCheckRequest) returns (HealthCheckResponse);
rpc Watch(HealthCheckRequest) returns (stream HealthCheckResponse);
}
message HealthCheckRequest {
string service = 1; // empty string = overall health
}
message HealthCheckResponse {
enum ServingStatus {
UNKNOWN = 0;
SERVING = 1;
NOT_SERVING = 2;
SERVICE_UNKNOWN = 3;
}
ServingStatus status = 1;
}
Consul Health Checks in Detail
# consul-checks.json — multiple health checks per service
{
  "service": {
    "name": "payment-service",
    "port": 8080,
    "checks": [
      {
        "name": "HTTP API Health",
        "http": "http://localhost:8080/health",
        "method": "GET",
        "interval": "10s",
        "timeout": "3s"
      },
      {
        "name": "Database Connectivity",
        "args": ["/usr/local/bin/check-db.sh"],
        "interval": "30s",
        "timeout": "10s"
      },
      {
        "name": "Memory Usage",
        "args": ["/usr/local/bin/check-memory.sh", "80"],
        "interval": "15s"
      }
    ]
  }
}
# Consul check statuses:
# "passing" — healthy, included in DNS/API results
# "warning" — degraded, still included by default
# "critical" — unhealthy, excluded from results
# After deregister_critical_service_after, fully removed
Kubernetes Probes
Kubernetes has three types of probes, each serving a distinct purpose:
apiVersion: v1
kind: Pod
metadata:
  name: payment-service
spec:
  containers:
    - name: payment
      image: payment-service:v2.1
      ports:
        - containerPort: 8080

      # LIVENESS PROBE — "Is the process alive?"
      # Failure: kubelet RESTARTS the container
      livenessProbe:
        httpGet:
          path: /health/live
          port: 8080
        initialDelaySeconds: 15   # wait for app startup
        periodSeconds: 10         # check every 10s
        timeoutSeconds: 3
        failureThreshold: 3       # 3 consecutive failures = restart

      # READINESS PROBE — "Can it handle traffic?"
      # Failure: pod is removed from Service endpoints (no traffic)
      # Pod is NOT restarted — just stops receiving requests
      readinessProbe:
        httpGet:
          path: /health/ready
          port: 8080
        initialDelaySeconds: 5
        periodSeconds: 5
        failureThreshold: 3

      # STARTUP PROBE (K8s 1.16+) — "Has it finished starting?"
      # Disables liveness/readiness until startup succeeds
      # Prevents slow-starting apps from being killed prematurely
      startupProbe:
        httpGet:
          path: /health/started
          port: 8080
        failureThreshold: 30   # 30 * 10s = 5 minutes to start
        periodSeconds: 10
DNS-Based Discovery
DNS is the oldest and most universal form of service discovery. Every programming language, every framework, and every operating system speaks DNS. Using DNS for service discovery means zero library dependencies for clients.
DNS SRV Records
Standard A/AAAA records only resolve hostnames to IP addresses. SRV records add port information and priority/weight for load balancing:
# SRV record format:
# _service._proto.name TTL class SRV priority weight port target
_http._tcp.payment.prod.example.com. 30 IN SRV 10 60 8080 i-001.example.com.
_http._tcp.payment.prod.example.com. 30 IN SRV 10 30 8080 i-002.example.com.
_http._tcp.payment.prod.example.com. 30 IN SRV 20 10 8080 i-003.example.com.
# priority 10 instances are preferred over priority 20
# within same priority, weight determines distribution:
# i-001 gets ~67% traffic (60/90), i-002 gets ~33% (30/90)
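Resolving and weighting SRV records is straightforward with a DNS library. The sketch below uses dnspython (also used later in this chapter) against the example records above:
# Resolve SRV records, prefer the lowest priority, then pick by weight (sketch)
import random
import dns.resolver

answers = dns.resolver.resolve("_http._tcp.payment.prod.example.com", "SRV")
records = [(r.priority, r.weight, r.port, str(r.target)) for r in answers]

# Keep only the most preferred (lowest) priority group
best = min(priority for priority, _, _, _ in records)
candidates = [r for r in records if r[0] == best]

# Weighted random choice within that priority group
_, _, port, target = random.choices(
    candidates, weights=[w for _, w, _, _ in candidates], k=1
)[0]
print(f"{target}:{port}")   # e.g. i-001.example.com.:8080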
CoreDNS in Kubernetes
CoreDNS is the default DNS server in Kubernetes clusters. It watches the Kubernetes API for Service and Endpoint changes and automatically serves DNS records:
# Every Kubernetes Service gets a DNS entry:
#   <service-name>.<namespace>.svc.cluster.local
# ClusterIP Service — returns the virtual IP
$ nslookup payment-service.production.svc.cluster.local
Server: 10.96.0.10 # CoreDNS
Address: 10.96.0.10#53
Name: payment-service.production.svc.cluster.local
Address: 10.100.45.123 # ClusterIP (virtual IP)
# Headless Service (clusterIP: None) — returns pod IPs directly
$ nslookup payment-headless.production.svc.cluster.local
Name: payment-headless.production.svc.cluster.local
Address: 10.244.1.15 # Pod 1
Address: 10.244.2.23 # Pod 2
Address: 10.244.3.8 # Pod 3
# SRV records for named ports
$ dig _http._tcp.payment-headless.production.svc.cluster.local SRV
# Returns port and target for each pod
Pros and Cons of DNS-Based Discovery
| Pros | Cons |
|---|---|
| Universal — every language/framework supports DNS | TTL caching causes stale results; clients may hit dead instances |
| No special client libraries needed | DNS doesn’t support sophisticated load balancing |
| Works across organizational boundaries | No built-in health checking (must be layered on top) |
| Low operational overhead | SRV records are not universally supported by HTTP client libraries |
Mitigate stale results by keeping DNS TTLs short and tuning client-side resolver caches so entries expire quickly (for example, networkaddress.cache.ttl=10 in Java).
Service Mesh
A service mesh is a dedicated infrastructure layer for handling service-to-service communication. It moves discovery, load balancing, encryption, observability, and retries out of the application code and into a network proxy (sidecar) that runs alongside every service instance.
Architecture: Data Plane vs Control Plane
Every service mesh has two components:
- Data plane: The sidecar proxies (e.g., Envoy) deployed alongside each service. They intercept all inbound and outbound network traffic. They handle service discovery, load balancing, retries, circuit breaking, mTLS encryption, and metrics collection.
- Control plane: The centralized management layer (e.g., Istiod in Istio, Linkerd control plane). It distributes configuration, certificates, and service discovery data to the data plane proxies.
# Istio VirtualService — traffic management with service discovery
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: payment-service
spec:
  hosts:
    - payment-service        # service discovery name
  http:
    - match:
        - headers:
            x-canary:
              exact: "true"
      route:
        - destination:
            host: payment-service
            subset: v2       # canary version
          weight: 100
    - route:
        - destination:
            host: payment-service
            subset: v1       # stable version
          weight: 90
        - destination:
            host: payment-service
            subset: v2       # canary gets 10% traffic
          weight: 10
---
apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
  name: payment-service
spec:
  host: payment-service
  trafficPolicy:
    connectionPool:
      tcp:
        maxConnections: 100
      http:
        h2UpgradePolicy: DEFAULT
        http1MaxPendingRequests: 100
    outlierDetection:
      consecutive5xxErrors: 3
      interval: 30s
      baseEjectionTime: 30s
      maxEjectionPercent: 50
    tls:
      mode: ISTIO_MUTUAL     # automatic mTLS
  subsets:
    - name: v1
      labels:
        version: v1
    - name: v2
      labels:
        version: v2
Major Service Meshes
| Feature | Istio | Linkerd | Consul Connect |
|---|---|---|---|
| Sidecar proxy | Envoy | linkerd2-proxy (Rust) | Envoy (or built-in) |
| Complexity | High | Low | Medium |
| Performance overhead | ~3-5ms p99 latency added | ~1-2ms p99 latency added | ~2-4ms p99 latency added |
| mTLS | Yes (automatic) | Yes (automatic) | Yes (intentions) |
| Traffic splitting | Yes (VirtualService) | Yes (TrafficSplit) | Yes (service-splitter) |
| Multi-cluster | Yes | Yes | Yes (WAN federation) |
When Is the Overhead Worth It?
- Yes: You have 50+ microservices, need mTLS everywhere, need traffic shifting for canary deployments, need distributed tracing without code changes, need to enforce network policies.
- No: You have fewer than 10 services, simple deployment needs, team is small, latency budget is extremely tight, or you’re just starting your microservices journey.
Kubernetes Service Discovery
Kubernetes has the most comprehensive built-in service discovery of any orchestration platform. When you create a Service object, Kubernetes automatically creates DNS entries, manages endpoints, and load-balances traffic.
Service Types
ClusterIP (Default)
Creates a virtual IP (ClusterIP) accessible only within the cluster. kube-proxy programs iptables/IPVS rules to distribute traffic to backend pods.
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  namespace: production
spec:
  type: ClusterIP             # default
  selector:
    app: payment              # matches pod labels
    version: v2
  ports:
    - name: http
      port: 80                # service port (virtual)
      targetPort: 8080        # container port (actual)
      protocol: TCP
    - name: grpc
      port: 9090
      targetPort: 9090

# Access from any pod in the cluster:
#   http://payment-service.production.svc.cluster.local:80
#   or simply: http://payment-service:80 (within same namespace)
Headless Service (clusterIP: None)
No virtual IP is assigned. DNS returns individual pod IPs directly. Used for stateful workloads (databases, Kafka brokers) where clients need to connect to specific pods.
apiVersion: v1
kind: Service
metadata:
  name: cassandra
  namespace: data
spec:
  clusterIP: None             # headless!
  selector:
    app: cassandra
  ports:
    - port: 9042

# DNS returns all pod IPs:
#   cassandra.data.svc.cluster.local → 10.244.1.5, 10.244.2.8, 10.244.3.12
# Individual pods are addressable:
#   cassandra-0.cassandra.data.svc.cluster.local → 10.244.1.5
#   cassandra-1.cassandra.data.svc.cluster.local → 10.244.2.8
NodePort
Exposes the service on a static port on every node’s IP. External traffic can reach the service via <NodeIP>:<NodePort>.
apiVersion: v1
kind: Service
metadata:
  name: payment-service
spec:
  type: NodePort
  selector:
    app: payment
  ports:
    - port: 80
      targetPort: 8080
      nodePort: 30080   # accessible at <any-node-ip>:30080
                        # range: 30000-32767
LoadBalancer
Provisions a cloud provider’s load balancer (AWS NLB/ALB, GCP LB, Azure LB) that routes external traffic to the service.
apiVersion: v1
kind: Service
metadata:
  name: payment-service
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "nlb"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
spec:
  type: LoadBalancer
  selector:
    app: payment
  ports:
    - port: 443
      targetPort: 8080
Ingress
An Ingress is not a Service type but a separate resource that provides HTTP/HTTPS routing, TLS termination, and name-based virtual hosting. An Ingress Controller (NGINX, Traefik, HAProxy) watches Ingress resources and configures the actual routing.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: api-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts:
        - api.example.com
      secretName: api-tls
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /payments
            pathType: Prefix
            backend:
              service:
                name: payment-service
                port:
                  number: 80
          - path: /orders
            pathType: Prefix
            backend:
              service:
                name: order-service
                port:
                  number: 80
          - path: /inventory
            pathType: Prefix
            backend:
              service:
                name: inventory-service
                port:
                  number: 80
How Pods Find Each Other
Kubernetes provides two mechanisms for pods to discover services:
1. Environment Variables
When a pod starts, kubelet injects environment variables for every active Service in the same namespace:
# Automatically injected into every pod:
PAYMENT_SERVICE_SERVICE_HOST=10.100.45.123
PAYMENT_SERVICE_SERVICE_PORT=80
PAYMENT_SERVICE_PORT=tcp://10.100.45.123:80
PAYMENT_SERVICE_PORT_80_TCP_ADDR=10.100.45.123
PAYMENT_SERVICE_PORT_80_TCP_PORT=80
# Limitation: only services that exist BEFORE the pod starts
# get injected. Services created after pod start are invisible.
2. DNS (Preferred)
CoreDNS dynamically updates as Services and Endpoints change. No restart needed. This is the recommended approach:
# In application code — just use the DNS name
import requests

# Same namespace — short name works
response = requests.get("http://payment-service:80/api/charge")

# Different namespace — use FQDN
response = requests.get(
    "http://payment-service.production.svc.cluster.local:80/api/charge"
)

# Headless service — get all pod IPs
import dns.resolver

answers = dns.resolver.resolve(
    'cassandra.data.svc.cluster.local', 'A'
)
pod_ips = [str(rdata) for rdata in answers]
# ['10.244.1.5', '10.244.2.8', '10.244.3.12']
The Endpoints and EndpointSlice Objects
Behind every Service is an Endpoints object (or EndpointSlice in modern K8s) that tracks which pod IPs are ready to receive traffic. When a pod fails its readiness probe, it is removed from the EndpointSlice, and kube-proxy/CoreDNS stop routing traffic to it.
# View endpoints for a service
$ kubectl get endpoints payment-service -n production
NAME ENDPOINTS AGE
payment-service 10.244.1.15:8080,10.244.2.23:8080,10.244.3.8:8080 5d
# EndpointSlice (K8s 1.21+, more scalable)
$ kubectl get endpointslice -l kubernetes.io/service-name=payment-service
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
payment-service-abc12 IPv4 8080 10.244.1.15 + 2... 5d
Summary
- Service discovery solves the problem of locating dynamic service instances in microservices architectures.
- Client-side discovery (Eureka + Ribbon) gives clients direct control, reducing latency but increasing coupling.
- Server-side discovery (AWS ELB, K8s Services) keeps clients simple at the cost of an extra network hop.
- Service registries (etcd, Consul, ZooKeeper, Eureka) store and serve instance locations. Choose CP for strong consistency or AP for maximum availability.
- Health checks (heartbeats, HTTP, TCP, gRPC) keep the registry accurate. Kubernetes uses liveness, readiness, and startup probes.
- DNS-based discovery is universal but limited by TTL caching and basic load balancing.
- Service meshes (Istio, Linkerd) handle discovery + load balancing + mTLS + observability via sidecar proxies.
- Kubernetes provides the most integrated discovery with ClusterIP Services, CoreDNS, headless Services, and EndpointSlices.
With service discovery mastered, we can explore the infrastructure that sits between clients and services: Proxies — Forward, Reverse, and Beyond.