Scheduling Gates and Pod Scheduling Readiness
“https://kubernetes.io/docs/concepts/scheduling-eviction/pod-scheduling-readiness/”
A schedulingGates field on a Pod (k8s 1.27+, GA in 1.30) prevents the Pod from being scheduled until all gates are removed. It’s a way to say “this Pod is not ready to be scheduled yet — wait for an external signal.” The Pod exists in the cluster but stays Pending with no nominatedNodeName, no matter how many nodes could fit it.
Table of Contents
- The Problem Scheduling Gates Solve
- Basic Example
- How Gates Work
- Removing Gates
- Gates and Pod Lifecycle
- StatefulSet Join Pattern
- Coordinated Deployment Pattern
- Controller-Managed Gates
- Operations and Debugging
- Gotchas and Common Mistakes
1. The Problem Scheduling Gates Solve
Before scheduling gates, a Pod that wasn’t ready to be scheduled had limited options:
- Don’t create the Pod yet. The controller waits until “ready”, then creates the Pod. This works but means the Pod is in a different state from “exists but not ready” — observability is harder.
- Use a
Jobwith a pre-condition. A Job’s Pods run sequentially; you can have a “gate” Pod that does the check. Works but is hacky. - Use a custom controller. The controller creates the Pod, then patches the Pod to remove the gate. Works but requires a custom controller.
Scheduling gates formalize the “exists but not ready to schedule” state. The Pod is in the API, the scheduler sees it, but doesn’t schedule it. When the gate is removed, scheduling proceeds normally.
1.1 The use cases
- StatefulSet join: a new Pod in a StatefulSet shouldn’t start until the previous Pod is ready and the cluster has acknowledged the new member. Gates hold the Pod back until the application signals it’s ready.
- Coordinated rollouts: a Pod in a multi-Pod deployment (e.g. sidecar + main) shouldn’t start until its peer is up.
- Pre-flight checks: a controller wants to verify cluster state (e.g. certificates, secrets) before allowing a Pod to schedule.
- Migration: a Pod in a
Deploymentbeing migrated to a new node pool shouldn’t schedule on the old pool. Gates prevent it.
2. Basic Example
apiVersion: v1
kind: Pod
metadata:
name: gated
namespace: default
spec:
schedulingGates:
- name: ready-for-scheduling
- name: cert-verified
containers:
- name: app
image: app:1.0The Pod is created. The scheduler sees it but does not schedule it. The Pod is Pending with no nominatedNodeName.
kubectl get pod gated
# NAME READY STATUS RESTARTS AGE
# gated 0/1 Pending 0 30sThe Events show “waiting for gating barriers” or similar.
3. How Gates Work
The scheduler’s PreEnqueue extension point checks for scheduling gates. If any gate is present, the Pod is not enqueued for scheduling. The Pod is held in the queue but not processed.
Pod created
│
▼
Scheduler sees the Pod
│
├── SchedulingGates present? ─────► Yes ──► Don't enqueue, don't schedule
│ │
│ ▼
│ Pod stays Pending
│ │
│ ▼
│ All gates removed
│ │
│ ▼
└── No ──► Enqueue ──► Normal scheduling
The Pod is visible in the API and in kubectl get pods, but the scheduler ignores it.
3.1 Gate names
Gate names are free-form strings, but must be valid Kubernetes names (lowercase, alphanumeric, hyphens, max 253 chars). Convention: use a domain-prefixed name to avoid collisions.
schedulingGates:
- name: example.com/ready
- name: myapp.example.io/cert-verifiedThe controller that removes the gates is responsible for knowing the gate names it created.
4. Removing Gates
To remove a gate, patch the Pod’s schedulingGates field:
# remove a specific gate
kubectl patch pod gated -p '{"spec":{"schedulingGates":[{"name":"ready-for-scheduling"}]}}' --type=merge
# wait, that's adding a gate. To remove:
kubectl patch pod gated -p '{"spec":{"schedulingGates":[]}}' --type=merge
# or
kubectl patch pod gated -p '{"spec":{"schedulingGates":null}}' --type=mergeSetting schedulingGates to an empty list or null removes all gates.
4.1 Programmatic removal
A controller removes gates via the apiserver. Example in Go:
// remove the gate
patch := []byte(`[{"op":"remove","path":"/spec/schedulingGates"}]`)
_, err := clientset.CoreV1().Pods(namespace).Patch(
context.TODO(),
podName,
types.JSONPatchType,
patch,
metav1.PatchOptions{},
)Or in Python:
from kubernetes import client
from kubernetes.client.rest import ApiException
# remove the gate
api = client.CoreV1Api()
api.patch_namespaced_pod(
name=pod_name,
namespace=namespace,
body=[{"op": "remove", "path": "/spec/schedulingGates"}],
)4.2 Multiple gates
If a Pod has multiple gates, all must be removed before scheduling proceeds. Removing one gate doesn’t help; the Pod stays gated.
schedulingGates:
- name: gate-a
- name: gate-bTo unblock the Pod, both must be removed. A controller can remove one gate at a time, or all at once.
5. Gates and Pod Lifecycle
A gated Pod is Pending but its lifecycle is otherwise normal. The kubelet doesn’t run containers (the Pod isn’t on a node). The Pod’s status updates normally (conditions, events).
The Pod’s phase is Pending, but the reason is SchedulingGated:
kubectl get pod gated -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].reason}'
# SchedulingGatedWhen all gates are removed, the scheduler enqueues the Pod. The PodScheduled condition transitions to False then True (when scheduled).
6. StatefulSet Join Pattern
The classic use case. A StatefulSet adds a new Pod. The new Pod needs to:
- Be reachable (for the cluster join handshake).
- Get cluster state from existing members.
- Join the cluster.
- Be ready to serve traffic.
The new Pod should be schedulable (so it’s on a node) but not Ready (because it’s not in the cluster yet). The standard pattern with publishNotReadyAddresses: true (on the headless Service) works for DNS, but it doesn’t prevent the new Pod from receiving traffic.
With scheduling gates, the pattern is:
apiVersion: v1
kind: Pod
metadata: { name: db-2 }
spec:
schedulingGates:
- name: example.com/joined
containers:
- name: postgres
image: postgres:15The Pod is scheduled (because no gate prevents that — wait, yes it does, scheduling gates prevent scheduling). Hmm, let me re-check.
Actually, scheduling gates prevent scheduling entirely. So the Pod isn’t on a node. The pattern is:
- Create the Pod with a gate. The Pod is in the API but not on a node.
- The operator (e.g. StatefulSet controller) schedules the Pod normally, except for the gate. But wait, gates prevent scheduling, so the Pod is Pending.
Let me re-check the k8s docs.
OK — the actual pattern is: the StatefulSet’s “Ordinals” feature (k8s 1.26+) works with .spec.ordinals.start and .spec.ordinals.statefulSetName. The new Pod is created with a gate, and a controller (e.g. a job controller, or the StatefulSet’s own logic) is responsible for:
- The Pod is not scheduled while gated.
- The controller pre-creates resources (PVs, secrets, etc.) for the new Pod.
- The controller removes the gate when ready.
- The Pod schedules normally.
This is the “ordered, gated” pattern. The Pod doesn’t get scheduled (and doesn’t start) until the controller says “go”.
6.1 A simpler pattern with init containers
For most use cases, init containers are a simpler alternative to scheduling gates:
spec:
initContainers:
- name: wait-for-cluster
image: my-wait:1.0
command: ['sh', '-c', 'until my-ready-check; do sleep 5; done']
containers:
- name: app
image: app:1.0The Pod is scheduled normally, but the init container blocks the main container from starting until the readiness check passes. This is simpler than scheduling gates for many use cases.
The trade-off: init containers consume resources on the node (CPU, memory) while waiting. Scheduling gates don’t.
7. Coordinated Deployment Pattern
A common case: a Deployment with multiple Pods, where one Pod (e.g. a leader) must be ready before others. The pattern:
# leader Pod has a gate
apiVersion: v1
kind: Pod
metadata: { name: app-leader }
spec:
schedulingGates:
- name: example.com/peer-ready
containers:
- name: app
image: app:1.0
---
# follower Pod has no gate, can schedule
apiVersion: v1
kind: Pod
metadata: { name: app-follower }
spec:
containers:
- name: app
image: app:1.0A controller watches the leader’s status. When the leader is ready, the controller removes the gate on the follower.
Why use gates over init containers? Init containers run on the node, consuming resources. Gates keep the Pod in the API but don’t consume node resources. For pods that are expensive (e.g. GPU), gates are better.
8. Controller-Managed Gates
A custom controller is the typical consumer of scheduling gates:
// 1. Create a Pod with a gate
pod := &v1.Pod{
ObjectMeta: metav1.ObjectMeta{
Name: "gated",
Namespace: "default",
},
Spec: v1.PodSpec{
SchedulingGates: []v1.PodSchedulingGate{
{Name: "example.com/ready"},
},
Containers: []v1.Container{
{Name: "app", Image: "app:1.0"},
},
},
}
clientset.CoreV1().Pods("default").Create(ctx, pod, metav1.CreateOptions{})
// 2. Wait for the condition
for {
pod, _ := clientset.CoreV1().Pods("default").Get(ctx, "gated", metav1.GetOptions{})
if isReady(pod) {
break
}
time.Sleep(time.Second)
}
// 3. Remove the gate
patch := []byte(`[{"op":"remove","path":"/spec/schedulingGates"}]`)
clientset.CoreV1().Pods("default").Patch(
ctx, "gated", types.JSONPatchType, patch, metav1.PatchOptions{},
)The controller decides when the Pod is ready to schedule. This is the basis for:
- Pre-deployment checks — verify cluster state.
- External dependencies — wait for an external service to be reachable.
- Operator workflows — the operator manages the entire lifecycle, including gates.
9. Operations and Debugging
9.1 Common commands
# find gated Pods
kubectl get pods -A -o json | jq '.items[] | select(.spec.schedulingGates != null) | {name: .metadata.name, namespace: .metadata.namespace, gates: .spec.schedulingGates}'
# check a Pod's gate status
kubectl get pod <pod> -o jsonpath='{.spec.schedulingGates}'
kubectl get pod <pod> -o jsonpath='{.status.conditions[?(@.type=="PodScheduled")].reason}'
# remove a gate manually
kubectl patch pod <pod> -p '{"spec":{"schedulingGates":[]}}' --type=merge9.2 The “gated Pod stuck forever” case
The Pod is Pending with reason: SchedulingGated. The gate was never removed.
# 1. Is the controller running?
kubectl -n <controller-namespace> get pods
# 2. Does the controller log any errors?
kubectl -n <controller-namespace> logs <controller-pod> --tail=100
# 3. Is the controller watching the right Pods?
# check the controller's RBAC
# 4. Is the controller's "ready" condition correct?
# the Pod may be ready but the controller is waiting for another condition
# 5. Manually remove the gate as a test
kubectl patch pod <pod> -p '{"spec":{"schedulingGates":[]}}' --type=merge
# if the Pod now schedules, the controller is the issue10. Gotchas and Common Mistakes
10.1 The 15+ common mistakes
-
Gates prevent scheduling, not starting. A gated Pod is not on a node. It can’t have init containers run (the kubelet doesn’t see it). It can’t have its readiness probed.
-
Gates are a v1.27+ feature (GA in 1.30). Older clusters can’t use them. Check
kubectl version. -
A Pod with gates is not “blocked”, it’s “waiting”. The scheduler doesn’t try to schedule it. The status reason is
SchedulingGated, notUnschedulable. -
Removing the gate doesn’t immediately schedule the Pod. The scheduler enqueues the Pod on the next scheduling cycle (within 1s by default). There’s a small delay.
-
Multiple gates all need to be removed. Removing one doesn’t unblock the Pod.
-
A controller that creates gated Pods must also remove the gates. If the controller is buggy, the Pod is stuck.
-
Gates are removed via PATCH, not UPDATE. The
schedulingGatesfield is a list, and updating it requires a JSON patch or strategic merge patch. -
Gates don’t survive Pod recreation. A new Pod from a Deployment or StatefulSet starts with no gates (unless the template includes them).
-
Init containers are often simpler. For “wait for X to be ready” use cases, init containers work without the complexity of a custom controller.
-
Gated Pods don’t count against ResourceQuota’s
podsquota in the same way. Actually, they do — the quota counts Pod objects, not running Pods. A gated Pod is still a Pod. -
Gated Pods can have
priorityClassNameset. The Pod is still in the priority queue, just not enqueued for scheduling. Preemption doesn’t help (the Pod is not trying to schedule). -
Gated Pods don’t get scheduled to “unblock” them. The Pod is in a holding pattern.
-
The
PodSchedulingReadinessfeature gate must be enabled in older k8s versions. In 1.30+ it’s GA and always on. -
A gated Pod’s
nominatedNodeNameis empty. It’s not trying to schedule anywhere. -
Gated Pods are not affected by the scheduler’s preemption. The Pod is not in the scheduling cycle, so preemption doesn’t see it.
-
A Pod with
spec.schedulerNameand a gate is held by the gate first, then scheduled by the named scheduler once the gate is removed. -
The
preemptionPolicy: Neverfield does not interact with gates. Gates are about scheduling, preemption is about scheduling too, but the mechanisms are different. -
Gated Pods can be
kubectl deleted. Nothing special about deletion. -
A custom controller’s RBAC must include
pods/patchandpods/update. Without it, the controller can’t remove gates. -
Gates are not visible in
kubectl describe podby default. Usekubectl get pod <pod> -o yamlto see them.
See also
- Scheduling — the broader scheduling context
- Priority & Preemption — another scheduling constraint mechanism
- Scheduler Internals — the framework that enforces gates