Load Balancing

Load balancers sit in front of your services and distribute traffic across multiple backend instances — improving availability and scalability.


How It Works

      Client
         │
         ▼
┌─────────────────┐
│  Load Balancer  │ ← health checks backends
│ (LB Algorithm)  │ ← routes to healthy instances
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌──────┐  ┌──────┐
│server│  │server│
│  1   │  │  2   │
└──────┘  └──────┘

Algorithms

Round Robin

Cycles through backends in order. Best for backends with similar capacity.

Request 1 → server-1
Request 2 → server-2
Request 3 → server-3
Request 4 → server-1 (repeat)
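
The rotation above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular load balancer implements it; the class name RoundRobinBalancer and the server names are assumptions made for the example.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backends in a fixed, repeating order (hypothetical helper)."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the list forever: 1, 2, 3, 1, 2, 3, ...
        self._rotation = cycle(list(backends))

    def next_backend(self):
        # Each call advances the rotation by one position.
        return next(self._rotation)


balancer = RoundRobinBalancer(["server-1", "server-2", "server-3"])
picks = [balancer.next_backend() for _ in range(4)]
print(picks)  # ['server-1', 'server-2', 'server-3', 'server-1']
```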

Weighted Round Robin

Assign weights to backends based on capacity.

server-1 (weight=3): gets 3 of every 6 requests (50%)
server-2 (weight=2): gets 2 of every 6 requests (~33%)
server-3 (weight=1): gets 1 of every 6 requests (~17%)
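
One way to turn weights into a schedule is "smooth" weighted round robin, which interleaves backends instead of sending bursts to the heaviest one. The sketch below is one possible technique, chosen for illustration; the class name is hypothetical. With weights 3, 2 and 1, each cycle of 6 requests splits 3/2/1.

```python
from collections import Counter


class SmoothWeightedRoundRobin:
    """Weighted rotation that spreads picks evenly (illustrative sketch)."""

    def __init__(self, weights):
        if not weights or any(w <= 0 for w in weights.values()):
            raise ValueError("weights must be a non-empty mapping of positive numbers")
        self._weights = dict(weights)
        self._current = {name: 0 for name in self._weights}
        self._total = sum(self._weights.values())

    def next_backend(self):
        # Every backend earns credit equal to its weight on each pick...
        for name, weight in self._weights.items():
            self._current[name] += weight
        # ...the backend with the most credit wins and pays back the total.
        chosen = max(self._current, key=self._current.get)
        self._current[chosen] -= self._total
        return chosen


wrr = SmoothWeightedRoundRobin({"server-1": 3, "server-2": 2, "server-3": 1})
schedule = [wrr.next_backend() for _ in range(6)]
print(schedule)
print(Counter(schedule))  # server-1: 3, server-2: 2, server-3: 1
```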

Least Connections

Routes to the backend with the fewest active connections.

backend-1: 47 active connections
backend-2: 12 active connections ← routes here
backend-3: 33 active connections

Best when requests have variable duration (long-running vs short).
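
A minimal sketch of the selection rule, assuming the balancer keeps its own per-backend counter and is told when a request finishes. The class and method names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Picks the backend with the fewest in-flight requests (illustrative)."""

    def __init__(self, backends):
        self.active = {name: 0 for name in backends}

    def acquire(self):
        # Choose the backend with the smallest counter, then count the new request.
        backend = min(self.active, key=self.active.get)
        self.active[backend] += 1
        return backend

    def release(self, backend):
        # Call this when the request completes or the connection closes.
        self.active[backend] -= 1


lc = LeastConnectionsBalancer(["backend-1", "backend-2", "backend-3"])
lc.active.update({"backend-1": 47, "backend-2": 12, "backend-3": 33})
chosen = lc.acquire()
print(chosen)  # backend-2
```

Long-running requests keep their counter raised until release() is called, which is why this approach adapts to variable request durations.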

IP Hash

Hash the client IP to always route the same client to the same backend.

hash(client_ip) % num_backends = target_backend

Used for session affinity, but adding or removing a backend changes the modulus and remaps most clients, so generally avoid it unless you have a specific need.
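
The formula above can be sketched as follows. The example uses SHA-256 from hashlib rather than Python's built-in hash(), because hash() on strings is randomized per process and would break stickiness across restarts. The function name and the sample addresses are assumptions for illustration.

```python
import hashlib


def pick_backend(client_ip, backends):
    """Maps a client IP to one backend: hash(client_ip) % num_backends."""
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]


backends = ["server-1", "server-2", "server-3"]
first = pick_backend("203.0.113.7", backends)
second = pick_backend("203.0.113.7", backends)
print(first == second)  # True: the same client always lands on the same backend
```

Because the result depends on len(backends), resizing the pool moves many clients to a different backend. Consistent hashing is the usual way to limit that churn.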

Least Response Time

Routes to the backend with the lowest average response time.
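
A minimal sketch, assuming the balancer records each request's latency itself. It uses an exponentially weighted moving average as the "average", which is one possible choice and not necessarily what a given product does; alpha and all names here are hypothetical.

```python
class LeastResponseTimeBalancer:
    """Picks the backend with the lowest moving-average latency (illustrative)."""

    def __init__(self, backends, alpha=0.2):
        self._alpha = alpha  # weight given to the newest sample
        self._avg_ms = {name: None for name in backends}

    def record(self, backend, elapsed_ms):
        previous = self._avg_ms[backend]
        if previous is None:
            self._avg_ms[backend] = float(elapsed_ms)
        else:
            self._avg_ms[backend] = (
                self._alpha * elapsed_ms + (1 - self._alpha) * previous
            )

    def next_backend(self):
        # Try unmeasured backends first so every backend gets a sample.
        for name, avg in self._avg_ms.items():
            if avg is None:
                return name
        return min(self._avg_ms, key=self._avg_ms.get)


lrt = LeastResponseTimeBalancer(["server-1", "server-2", "server-3"])
for name, ms in [("server-1", 100), ("server-2", 20), ("server-3", 50)]:
    lrt.record(name, ms)
print(lrt.next_backend())  # server-2
```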


Health Checks

Load balancers must detect and remove unhealthy backends.

# AWS ALB health check example
Target: HTTP:8080/health
Interval: 30 seconds
Timeout: 5 seconds
Healthy threshold: 2 consecutive successes
Unhealthy threshold: 2 consecutive failures
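
The two thresholds describe a small state machine: a backend only changes state after enough consecutive results that contradict its current state. The sketch below illustrates that logic in general terms; it is not AWS code, and the class name is hypothetical.

```python
class HealthTracker:
    """Tracks one backend's health from a stream of probe results."""

    def __init__(self, healthy_threshold=2, unhealthy_threshold=2, healthy=True):
        self.healthy_threshold = healthy_threshold
        self.unhealthy_threshold = unhealthy_threshold
        self.healthy = healthy
        self._streak = 0  # consecutive results that disagree with the current state

    def record(self, success):
        success = bool(success)
        if success == self.healthy:
            # The result agrees with the current state, so the streak is broken.
            self._streak = 0
            return self.healthy
        self._streak += 1
        needed = self.healthy_threshold if success else self.unhealthy_threshold
        if self._streak >= needed:
            self.healthy = success
            self._streak = 0
        return self.healthy


tracker = HealthTracker()
states = [tracker.record(ok) for ok in [False, False, True, True]]
print(states)  # [True, False, False, True]
```

With a 30-second interval and an unhealthy threshold of 2, a failing backend keeps receiving traffic for up to about a minute before it is removed.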

Types

Type               What It Checks          Example
TCP connect        Port open               nc -z backend:8080
HTTP/HTTPS         /health returns 200     curl -f http://backend:8080/health
Deep health check  Actual DB connectivity  Query SELECT 1

Deep health checks are more reliable but add load — use them sparingly.
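
The first two check types can be sketched with the Python standard library. This is an illustrative stand-in for the nc and curl commands in the table, with hypothetical function names; a deep check would additionally run something like SELECT 1 through your own database client.

```python
import http.client
import socket
import urllib.error
import urllib.request


def tcp_check(host, port, timeout=5.0):
    """Returns True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def http_check(url, timeout=5.0):
    """Returns True only if a GET on the URL ends in HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # urlopen raises on 4xx/5xx, network errors and malformed URLs.
        return False
```

For example, tcp_check("backend", 8080) and http_check("http://backend:8080/health") mirror the table rows, where "backend" is the placeholder hostname used above.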


L4 vs L7 Load Balancing

Layer      What It Routes            Use When
L4 (TCP)   By IP + port              High throughput, simple routing
L7 (HTTP)  By URL, headers, cookies  Path routing, auth, canaries

L4: Client → LB → Backend (raw TCP stream)
L7: Client → LB → Backend (HTTP, can inspect headers)

L7 is more flexible for modern microservice architectures. L4 is higher performance for raw throughput.
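
To make the L7 side concrete, here is a minimal sketch of a routing decision that an L4 balancer cannot make, because it depends on the request path and a header. The route table, pool names and the X-Canary header are all hypothetical, and header matching is simplified to an exact, case-sensitive lookup.

```python
# Hypothetical route table: path prefix -> backend pool.
ROUTES = [
    ("/api/orders", "orders-service"),
    ("/api", "api-gateway"),
    ("/", "web-frontend"),
]


def route(path, headers=None):
    """Returns the backend pool for a request (longest matching prefix wins)."""
    headers = headers or {}
    # Header-based rule: send opted-in requests to the canary pool.
    if headers.get("X-Canary") == "true":
        return "canary-pool"
    for prefix, pool in sorted(ROUTES, key=lambda r: len(r[0]), reverse=True):
        # Match whole path segments so "/apiary" does not match "/api".
        if path == prefix or path.startswith(prefix.rstrip("/") + "/"):
            return pool
    raise LookupError(f"no route for {path!r}")


print(route("/api/orders/42"))                      # orders-service
print(route("/api/users"))                          # api-gateway
print(route("/apiary"))                             # web-frontend
print(route("/api/users", {"X-Canary": "true"}))    # canary-pool
```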


Common Patterns

1. Client-Side Discovery

Service A ──▶ Service Registry (e.g., Consul)
                    │
                    │ (reads list of healthy instances)
                    ▼
              Service B (one of N instances)
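
In this pattern the caller does the balancing itself. The sketch below uses an in-memory stand-in for the registry to show the flow; it is not Consul's API, and every name and address in it is hypothetical.

```python
import random


class InMemoryRegistry:
    """Toy stand-in for a service registry (not a real Consul client)."""

    def __init__(self):
        self._instances = {}  # service name -> {address: healthy flag}

    def register(self, service, address, healthy=True):
        self._instances.setdefault(service, {})[address] = healthy

    def healthy_instances(self, service):
        return [addr for addr, ok in self._instances.get(service, {}).items() if ok]


def resolve(registry, service, rng=random):
    """Client-side step: read the healthy instances, then pick one."""
    candidates = registry.healthy_instances(service)
    if not candidates:
        raise LookupError(f"no healthy instances of {service!r}")
    return rng.choice(candidates)


registry = InMemoryRegistry()
registry.register("service-b", "10.0.0.11:8080")
registry.register("service-b", "10.0.0.12:8080")
registry.register("service-b", "10.0.0.13:8080", healthy=False)
print(resolve(registry, "service-b"))  # one of the two healthy addresses
```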

2. Server-Side Discovery

Service A ──▶ Load Balancer ──▶ Service B
              (LB handles routing)

3. Canary Deployment

90% traffic ──▶ Production (v1)
10% traffic ──▶ Canary (v2) ← monitored before full rollout

Load balancer weight-based routing enables canary without duplicate infrastructure.
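
A minimal sketch of a percentage-based split, assuming each request is assigned independently at random. Real balancers may instead use weighted target groups or hash a request attribute; the function and pool names are hypothetical.

```python
import random


def choose_pool(rng, canary_percent=10.0):
    """Sends roughly canary_percent% of requests to the canary pool."""
    # rng.random() is uniform on [0, 1), so this holds for about
    # canary_percent out of every 100 calls.
    if rng.random() * 100.0 < canary_percent:
        return "canary-v2"
    return "production-v1"


rng = random.Random(42)  # fixed seed so the demo is repeatable
picks = [choose_pool(rng) for _ in range(10_000)]
canary_share = picks.count("canary-v2") / len(picks)
print(f"canary share: {canary_share:.1%}")
```

Random assignment means a single user can bounce between versions. If that matters, hashing a stable key such as a user ID into a bucket keeps each user on one side.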

4. Circuit Breaker Integration

# Hystrix-style circuit breaker behind a load balancer
# The LB can also eject failing backends automatically
# (e.g., Envoy outlier detection, HAProxy passive health checks)
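
For reference, the service-level breaker can be sketched as a small state machine with closed, open and half-open states. This is a generic illustration of the pattern, not Hystrix's or Envoy's implementation; the class name, thresholds and injectable clock are assumptions.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self._failure_threshold = failure_threshold
        self._reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    @property
    def state(self):
        if self._opened_at is None:
            return "closed"
        if self._clock() - self._opened_at >= self._reset_timeout:
            return "half-open"  # allow a trial call through
        return "open"

    def call(self, func, *args, **kwargs):
        if self.state == "open":
            raise CircuitOpenError("backend is failing, call rejected")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # A failed trial call, or too many failures, (re)opens the circuit.
            if self.state == "half-open" or self._failures >= self._failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping an outbound request as breaker.call(send_request, ...) makes the caller fail fast while the circuit is open, where send_request is whatever function performs the call.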

AWS/GCP/Azure LB Options

Provider  L4        L7                   Managed
AWS       NLB       ALB                  ✅
GCP       TCP LB    HTTP(S) LB           ✅
Azure     L4 Basic  Application Gateway  ✅
HAProxy   ✅        ✅                   ❌ (self-managed)
Envoy     ✅        ✅                   ❌ (self-managed)
NGINX     ✅        ✅                   ❌ (self-managed)

Quick Checklist

□ Health check at /health (or /ready)
□ Graceful shutdown: drain connections before removing backend
□ Idle timeout configured (backend keep-alive timeout longer than the LB idle timeout, so the LB never reuses a connection the backend has already closed)
□ SSL termination at LB (not at every backend)
□ Connection multiplexing (HTTP/2 or keep-alive)
□ Canary routing via weight-based algorithm
□ Circuit breaker at service level (don't rely solely on LB)
