EndpointSlices
“https://kubernetes.io/docs/concepts/services-networking/endpoint-slices/”
An EndpointSlice is the modern, scalable replacement for the Endpoints API. It tracks the set of network endpoints (typically a Pod IP + port) that back a Service. As of k8s 1.21, EndpointSlices are the default; the legacy Endpoints API is auto-managed for backward compatibility.
The problem with Endpoints
Before k8s 1.21, every Service had a single Endpoints object that listed all its backend Pods. With a Service backed by 5,000 Pods, the Endpoints object was 5,000 entries. With 1,000 Services, each backed by 100 Pods, you had 100,000 entries spread across 1,000 Endpoints objects.
The problems:
- Large objects. A 5,000-endpoint Endpoints object is ~500 KB. Updating it requires sending the whole thing. API server traffic scales with the number of Services × endpoints per Service.
- Watch amplification. Every controller watching Endpoints (and the kube-proxy on every node does) gets a notification on every change. A rolling update of a 1,000-replica Deployment generates 1,000+ Endpoints updates.
- No sharding. The whole Endpoints object is one etcd key. A single point of contention.
The EndpointSlice solution
EndpointSlices shard the endpoint set for a Service into multiple smaller objects:
Service "frontend" (selector: app=frontend, 1000 Pods)
├── EndpointSlice 1 (200 endpoints)
├── EndpointSlice 2 (200 endpoints)
├── EndpointSlice 3 (200 endpoints)
├── EndpointSlice 4 (200 endpoints)
└── EndpointSlice 5 (200 endpoints)
Each EndpointSlice is small (default 100 endpoints), and the total set is sharded across multiple objects. Updates to one Pod only change the slice that contains it.
Anatomy
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
name: frontend-abcde # auto-generated
generateName: frontend- # the controller sets this
labels:
kubernetes.io/service-name: frontend
endpointslice.kubernetes.io/managed-by: endpointslice-controller.k8s.io
namespace: production
addressType: IPv4
ports:
- name: http
port: 80
protocol: TCP
endpoints:
- addresses:
- 10.244.1.5
conditions:
ready: true
serving: true
terminating: false
nodeName: node-1
zone: us-east-1a
targetRef:
kind: Pod
name: frontend-aaa
namespace: production
- addresses:
- 10.244.2.7
conditions:
ready: true
serving: true
terminating: false
nodeName: node-2
zone: us-east-1b
targetRef:
kind: Pod
name: frontend-bbb
namespace: production
# ... up to 100 endpoints per sliceKey fields
| Field | What it means |
|---|---|
addressType | IPv4, IPv6, or FQDN |
ports | List of Service ports the slice handles |
endpoints[].addresses | The IP addresses (Pod IPs, for IPv4/IPv6) |
endpoints[].conditions.ready | Is this endpoint ready? Mirrors the Pod’s readiness |
endpoints[].conditions.serving | Is the endpoint configured to serve traffic (different from ready during shutdown)? |
endpoints[].conditions.terminating | Is the endpoint in the process of terminating? |
endpoints[].nodeName | The node the Pod is on — used for topology-aware routing |
endpoints[].zone | The zone the Pod is in — used for zonal awareness |
endpoints[].targetRef | The Pod (or other object) this endpoint represents |
The kubernetes.io/service-name label links the slice to the Service. All slices for a Service have the same label.
The terminating / serving / ready dance
Endpoint conditions matter most during Pod shutdown:
| Pod state | ready | serving | terminating |
|---|---|---|---|
| Running, ready | true | true | false |
| Running, not ready (readiness probe failing) | false | true | false |
| Pod being deleted, still has endpoints | true | true | true |
| Pod being deleted, no longer has endpoints | false | false | true |
The controller transitions an endpoint from (ready=true, terminating=false) to (ready=false, terminating=true) to (serving=false, terminating=true) based on the Pod’s state.
This lets kube-proxy gracefully remove the endpoint from iptables while the Pod’s preStop hook runs, draining in-flight connections.
How it interacts with Services
A Service has a selector that defines which Pods are its endpoints. The endpoints-controller (in kube-controller-manager) translates this selector into EndpointSlices:
- Watch all Pods in the namespace
- For each Pod matching the Service’s selector, add an entry to an EndpointSlice
- Shard into slices of
maxEndpointsPerSlice(default 100) - Update slices when Pods are added, removed, or change readiness
You can also have manually-managed EndpointSlices (no Service selector), which is useful for:
- Services that point to external IPs (databases outside the cluster)
- Headless services with custom endpoint logic
How kube-proxy uses them
kube-proxy on every node watches EndpointSlices (not Endpoints anymore, as of k8s 1.22+). When a slice changes, kube-proxy updates its iptables / IPVS rules:
- New endpoint → add DNAT rule from Service ClusterIP to the new Pod IP
- Endpoint removed (readiness=false) → remove the DNAT rule
- Endpoint unchanged → no update (this is the win — most changes are small)
Watch efficiency: kube-proxy only needs to process the slice that changed, not the whole Service’s endpoint set. This is the main scalability win.
How to view
# all EndpointSlices
kubectl get endpointslices -A
# for a specific Service
kubectl get endpointslices -l kubernetes.io/service-name=frontend
# detailed view
kubectl describe endpointslice <name> -n <ns>
# the legacy view (still works)
kubectl get endpoints frontendThe legacy Endpoints API still exists for backward compatibility. It’s auto-generated from the EndpointSlices. Use EndpointSlices in tooling.
Manually creating EndpointSlices
For Services that point to external resources (a database outside the cluster), you create EndpointSlices by hand. The Service has no selector; you create the slices yourself.
apiVersion: v1
kind: Service
metadata:
name: external-db
spec:
ports:
- port: 5432
targetPort: 5432
---
apiVersion: discovery.k8s.io/v1
kind: EndpointSlice
metadata:
name: external-db-1
labels:
kubernetes.io/service-name: external-db
addressType: IPv4
ports:
- name: http
port: 5432
protocol: TCP
endpoints:
- addresses:
- 10.0.5.10 # the actual database IP, outside the cluster
conditions:
ready: trueThis is the right way to expose an external service (RDS, ElastiCache, a VM-hosted DB) to in-cluster clients.
For the headless Service with manual endpoints pattern (StatefulSets use this), you can also let the EndpointSlice controller generate slices from Pod selectors — same as a regular Service, just with clusterIP: None.
The k8s 1.21+ migration
When you upgrade to k8s 1.21+, EndpointSlices are automatically created for every Service. You don’t need to do anything. The legacy Endpoints objects are still maintained for backward compatibility.
For new code, use EndpointSlices. The Endpoints API is not deprecated yet, but all new development targets EndpointSlices.
Slice size and topology
The endpoints-controller can be configured (via kube-controller-manager flags) to:
--max-endpoints-per-slice(default 100) — how many endpoints per slice- The controller automatically balances across slices, splitting when a slice grows past the limit
You can also create slices by topology:
# Zone A
endpoints:
- addresses: [10.244.1.5]
zone: us-east-1a
# Zone B
- addresses: [10.244.2.7]
zone: us-east-1bThis is used by topology-aware routing (k8s 1.27+, beta in 1.21 as service.kubernetes.io/topology-mode: Auto on the Service). The Service’s traffic prefers endpoints in the same zone as the client, reducing cross-zone data transfer costs.
Gotchas
- EndpointSlices are auto-managed for Services with selectors. You almost never create them by hand unless you’re pointing to external resources.
- The
kubernetes.io/service-namelabel is set by the controller. If you create a slice by hand, you must set this label or no one will find it. - Watch permissions matter.
kube-proxyneedswatchpermission onendpointslicesin the cluster. In restricted RBAC setups, this can break if you don’t grant it explicitly. - A Service can have EndpointSlices from multiple sources — the controller-generated ones and manual ones. They all merge into the Service’s effective endpoint set.
- Endpoint conditions don’t match Pod conditions 1:1.
serving=true, terminating=trueis a valid state — the Pod is shutting down, but still has endpoints in the API. The transition logic is documented in the EndpointSlice conditions KEP. - Removing a label from a Pod that the Service selects removes the endpoint, even if the Pod is still running. The selector match is re-evaluated on every Pod change.
- Removing the Service’s selector doesn’t delete existing EndpointSlices. They become orphaned (the
kubernetes.io/service-namelabel still points to a Service that doesn’t select them, but no one cleans up). You can delete the slices manually. - IPv4 and IPv6 endpoints in the same Service must be in different slices with different
addressTypevalues. A single slice can only have one address family. - The
targetRefis informational. It doesn’t enforce anything. If a Pod is deleted but the EndpointSlice still references it, the endpoint is just unreachable.
When to care
- If you’re writing a controller that watches Services, watch EndpointSlices (not Endpoints).
- If you’re writing a service mesh / proxy that needs the endpoint set, watch EndpointSlices.
- If you’re pointing a Service at external resources, create EndpointSlices by hand.
- If you’re debugging “Service has no endpoints” — look at the EndpointSlices, not the Endpoints.
- If you’re scaling to 1000s of endpoints per Service, EndpointSlices are what makes it possible.