Restart Policy
“https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#restart-policy”
A Pod’s restartPolicy determines how the kubelet behaves when a container terminates. The three values are Always (default), OnFailure, and Never; each suits a different kind of workload. The restart policy is set at the Pod level and applies to all app containers in the Pod. Init containers follow it too, with one difference: they must run to completion, so for them Always is treated as OnFailure.
Table of Contents
- The Three Restart Policies
- Restart Policy and Workload Type
- The Restart Backoff Algorithm
- Exit Codes and What They Mean
- Init Containers and Restart Policy
- The kubelet’s Restart Loop
- Container Restart vs Pod Restart
- terminationGracePeriodSeconds and Restart
- livenessProbe and Restart
- Job’s restartPolicy and backoffLimit
- StatefulSet and DaemonSet Restart Policy
- Common Pitfalls
- Operations and Debugging
- Gotchas and Common Mistakes
1. The Three Restart Policies
apiVersion: v1
kind: Pod
metadata: { name: app }
spec:
  restartPolicy: Always # default
  containers:
  - name: app
    image: app:1.0
1.1 Always (default)
The kubelet always restarts the container, regardless of exit code. This is the default for any Pod without an explicit restartPolicy.
spec:
  restartPolicy: Always
Use for: long-running services (web servers, API servers, daemons).
The container is restarted on:
- Normal exit (0).
- Error exit (non-zero).
- Crash (signal, segfault).
- OOM-kill.
- Liveness probe failure.
The kubelet restarts indefinitely. There is no limit on the number of restarts; only the backoff algorithm spaces them out. For a bounded number of retries, run the workload as a Job and set backoffLimit (Jobs do not accept Always).
1.2 OnFailure
The kubelet restarts the container only if it exits with a non-zero status. A successful exit (code 0) leaves the container stopped.
spec:
  restartPolicy: OnFailure
Use for: batch jobs and one-shot tasks that should retry on failure. (Jobs accept only OnFailure or Never, and the value must be set explicitly.)
The container is restarted on:
- Non-zero exit code.
- Crash.
- OOM-kill.
- Liveness probe failure.
The container is NOT restarted on:
- Exit code 0.
- The Pod being deleted.
1.3 Never
The kubelet never restarts the container after it terminates.
spec:
  restartPolicy: Never
Use for: one-shot tasks that should not retry. Less common; usually OnFailure is preferred so the task retries on transient failures.
The container is not restarted for any reason. Once it exits, the kubelet records the exit and moves on. The Pod’s status reflects the final state.
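The three policies reduce to a small decision on the exit code. The sketch below is only an illustration of the rules described above; the function name should_restart is made up for this example and is not part of any Kubernetes API.

```python
def should_restart(policy: str, exit_code: int) -> bool:
    """Return True if a container that exited with exit_code is restarted."""
    if policy == "Always":
        return True            # restarted regardless of exit code
    if policy == "OnFailure":
        return exit_code != 0  # restarted only after a non-zero exit
    if policy == "Never":
        return False           # never restarted
    raise ValueError(f"unknown restartPolicy: {policy!r}")


for policy in ("Always", "OnFailure", "Never"):
    # exit codes: clean exit, generic error, SIGKILL (e.g. OOM-kill)
    print(policy, [should_restart(policy, code) for code in (0, 1, 137)])
# Always [True, True, True]
# OnFailure [False, True, True]
# Never [False, False, False]
```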
2. Restart Policy and Workload Type
The restartPolicy should match the workload:
| Workload | Typical restartPolicy | Why |
|---|---|---|
| Deployment (web server) | Always (default) | Long-running, must stay up |
| StatefulSet (DB) | Always (default) | Long-running, must stay up |
| DaemonSet (node agent) | Always (default) | Long-running, must stay up |
| Job (batch task) | OnFailure | Should retry on failure, but not on success |
| CronJob (scheduled task) | OnFailure (via Job) | Same as Job |
| One-shot Pod | Never | Should not retry |
| Init container | (inherits the Pod’s) | Must run to completion; Always is treated as OnFailure |
The restartPolicy comes from the Pod template of the controller (Deployment, Job, etc.) that creates the Pod. Deployments, StatefulSets, and DaemonSets accept only Always; Jobs and CronJobs accept only OnFailure or Never.
2.1 What each controller sets
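| Controller | restartPolicy in the Pod template | Other values allowed? |
|---|---|---|
| Deployment | Always (default) | No, only Always is valid |
| StatefulSet | Always (default) | No, only Always is valid |
| DaemonSet | Always (default) | No, only Always is valid |
| Job | No usable default; must be set | OnFailure or Never |
| CronJob | Same as Job (via jobTemplate) | OnFailure or Never |
| Bare Pod | Always (default) | Yes, any of the three |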
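The API server validates the Pod template's restartPolicy per workload kind: Deployments, StatefulSets, and DaemonSets accept only Always, while Jobs and CronJobs accept only OnFailure or Never. A minimal sketch of that check follows; the ALLOWED mapping and the effective_policy helper are illustrative names, not Kubernetes code.

```python
from typing import Optional

# Which restartPolicy values each kind accepts in its Pod template.
ALLOWED = {
    "Deployment": {"Always"},
    "StatefulSet": {"Always"},
    "DaemonSet": {"Always"},
    "Job": {"OnFailure", "Never"},
    "CronJob": {"OnFailure", "Never"},
    "Pod": {"Always", "OnFailure", "Never"},
}


def effective_policy(kind: str, policy: Optional[str] = None) -> str:
    """Return the policy that would apply, or raise if the kind rejects it."""
    chosen = policy or "Always"  # an omitted restartPolicy defaults to Always
    if chosen not in ALLOWED[kind]:
        raise ValueError(f"{kind} does not accept restartPolicy {chosen}")
    return chosen


print(effective_policy("Deployment"))   # Always
print(effective_policy("Job", "Never"))  # Never
```

Note that effective_policy("Job") raises: a Job manifest that omits restartPolicy falls back to Always, which Jobs reject.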
3. The Restart Backoff Algorithm
When a container is restarted, the kubelet waits an increasing amount of time between restarts. This is the exponential backoff algorithm.
Restart 1: wait 10s
Restart 2: wait 20s
Restart 3: wait 40s
Restart 4: wait 80s
Restart 5: wait 160s
Restart 6: wait 300s (5 min) ← cap
Restart 7+: wait 300s
The backoff starts at 10s, doubles each time, and caps at 5 minutes. Once the cap is hit, all subsequent restarts are 5 min apart.
The backoff resets after 10 minutes of successful running. A container that runs for 10 min without restarting has its backoff reset.
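The schedule above is a capped doubling. The sketch below reproduces the table under the numbers given in this section (10s base, 300s cap); backoff_delay is an illustrative helper, not kubelet code.

```python
def backoff_delay(restart_number: int, base: int = 10, cap: int = 300) -> int:
    """Seconds to wait before the given restart (1-based), doubling up to the cap."""
    return min(base * 2 ** (restart_number - 1), cap)


print([backoff_delay(n) for n in range(1, 8)])
# [10, 20, 40, 80, 160, 300, 300]
```

After the container has run for 10 minutes without problems, counting starts again from restart 1.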
3.1 The CrashLoopBackOff
A container that crashes repeatedly enters CrashLoopBackOff. The kubelet:
- Restarts the container.
- Waits the backoff time.
- Container crashes again.
- Backoff doubles (up to 5 min).
- Repeat.
The Pod’s phase is still Running (the kubelet is actively managing it), but the container keeps crashing. kubectl get pod shows CrashLoopBackOff in the STATUS column.
3.2 The kubelet’s flags
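The backoff schedule (10s → 300s, doubling) is built into the kubelet and has historically not been configurable. Flags such as the following are sometimes mistaken for backoff settings, but they control node heartbeats, not container restarts:
# not related to container restart backoff
--node-status-update-frequency=10s   # kubelet: how often node status is reported
--node-monitor-grace-period=40s      # kube-controller-manager: how long a silent node is tolerated
The restartPolicy does not change the wait either; it only decides whether a restart happens. Recent Kubernetes releases add a feature-gated kubelet configuration option for lowering the 5-minute cap; check the documentation for your version before relying on it.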
4. Exit Codes and What They Mean
The container’s exit code determines whether the kubelet restarts it under OnFailure. Under Always the code is irrelevant (the container is restarted anyway), and under Never nothing is restarted.
| Exit code | Meaning | When |
|---|---|---|
| 0 | Success | App explicitly exited 0 |
| 1 | General error | App’s error path |
| 2 | Misuse of shell builtins | Shell script bug |
| 126 | Command cannot execute | Permissions |
| 127 | Command not found | Typo |
| 128 + N | Killed by signal N | Signal (e.g. 137 = SIGKILL, 143 = SIGTERM) |
| 137 | SIGKILL (9) | OOM-kill, kubectl delete pod --force |
| 139 | SIGSEGV (11) | Segfault |
| 143 | SIGTERM (15) | kubectl delete pod, normal termination |
4.1 Common exit codes you’ll see
- 0: clean shutdown. OnFailure doesn’t restart. Always does restart.
- 137: OOM-killed or force-killed. OnFailure restarts. A container that was OOM-killed will probably be OOM-killed again unless its memory limit or usage changes.
- 139: segfault. OnFailure restarts. The app has a bug.
- 143: terminated by SIGTERM. An app that traps SIGTERM can still exit 0, but if the signal terminates the process the status is 143 (128 + 15), which is non-zero, so OnFailure does restart.
The rule is: exit code 0 = success, anything else = failure. Even 143 (terminated by signal) is non-zero and triggers an OnFailure restart.
4.2 The signal exit code formula
exit_code = 128 + signal_number
So:
- 137 = 128 + 9 (SIGKILL)
- 143 = 128 + 15 (SIGTERM)
- 139 = 128 + 11 (SIGSEGV)
If the container was killed by a signal, the exit code is 128 + signal_number.
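The formula can be inverted to turn an exit code back into a signal name. This is a small sketch using Python's standard signal module on Linux; describe_exit_code is a made-up helper, and it assumes any code above 128 came from a signal, which is a convention rather than a guarantee (an app may exit with such a code on its own).

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain an exit code using the 128 + signal_number convention."""
    if code == 0:
        return "success"
    if code > 128:
        number = code - 128
        try:
            name = signal.Signals(number).name
        except ValueError:  # no signal with that number on this platform
            return f"killed by unknown signal {number}"
        return f"killed by {name} (signal {number})"
    return f"error exit ({code})"


for code in (0, 1, 137, 139, 143):
    print(code, describe_exit_code(code))
```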
4.3 The special case: successThreshold and readiness
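successThreshold is the number of consecutive successes a probe needs before it counts as passing again after a failure. It is configurable for readinessProbe; for livenessProbe and startupProbe it must be 1.
A liveness probe failure leads to a kill. After failureThreshold consecutive failures, the kubelet terminates the container (SIGTERM first, then SIGKILL once the grace period runs out; a SIGKILLed container reports exit code 137) and restarts it per the restartPolicy.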
A readiness probe failure is not a kill. The kubelet removes the Pod from the Service’s endpoints. The container keeps running.
5. Init Containers and Restart Policy
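Init containers must run to completion before the app containers start. If an init container fails, the app containers don’t start, and the kubelet restarts that init container (with backoff) when the Pod’s restartPolicy is Always or OnFailure. If the restartPolicy is Never, the Pod is marked Failed instead.
Pod lifecycle (restartPolicy Always or OnFailure):
1. Init container 1 starts
2. Init container 1 fails
3. The kubelet restarts init container 1 (with backoff)
4. ... (repeat until init succeeds or Pod is deleted)
5. The next init container, or the app containers, start
For init containers, Always is treated as OnFailure, since an init container that exits 0 must not run again. Newer Kubernetes versions (sidecar containers, introduced in 1.28) also let an individual init container set restartPolicy: Always, which keeps it running alongside the app containers.
A Job with init containers: with restartPolicy: Never, a failed init container fails the Pod, and the Job controller creates a replacement Pod (counted against backoffLimit). With OnFailure, the kubelet retries the init container in place.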
6. The kubelet’s Restart Loop
The kubelet runs a loop per container:
1. Start the container
2. Wait for it to exit
3. If restartPolicy says restart:
a. Apply backoff
b. Go to step 1
4. If restartPolicy says don't restart:
a. Mark the container as terminated
b. Exit the loop
The loop runs forever for Always. For OnFailure, it runs until the container exits with 0. For Never, it runs only once.
The kubelet’s restart is at the container level, not the Pod level. The Pod stays Running; the container is restarted. The Pod’s IP doesn’t change on container restart. Data in the Pod’s volumes (including emptyDir) survives the restart, but the container’s own writable filesystem layer does not: the new container starts from the image.
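To make the loop concrete, the sketch below replays a list of exit codes through it and records each decision. It is a simulation under the rules stated in this document (policy check, then 10s-doubling backoff capped at 300s); simulate is a hypothetical helper, not how the kubelet is implemented.

```python
from typing import Iterable, List


def simulate(policy: str, exit_codes: Iterable[int]) -> List[str]:
    """Replay container exits and log what the restart loop would do."""
    log: List[str] = []
    restarts = 0
    for code in exit_codes:
        log.append(f"container exited with {code}")
        restart = policy == "Always" or (policy == "OnFailure" and code != 0)
        if not restart:
            log.append("terminated, loop ends")
            return log
        restarts += 1
        delay = min(10 * 2 ** (restarts - 1), 300)  # capped exponential backoff
        log.append(f"restart {restarts} after {delay}s backoff")
    log.append("container running again")
    return log


for line in simulate("OnFailure", [1, 1, 0]):
    print(line)
```

With OnFailure and the exits 1, 1, 0, the loop restarts twice (after 10s and 20s) and ends on the clean exit; with Never it ends after the first exit.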
7. Container Restart vs Pod Restart
A container restart is not the same as a Pod restart:
- Container restart: the kubelet restarts the container in place. The Pod’s IP is the same. Volumes are preserved; files written to the container’s own filesystem are lost. Service routing is unchanged, although the container is down for the brief moment of the restart (and a readiness probe may take the Pod out of the endpoints until it is ready again).
- Pod restart: the Pod is deleted and a new one is created. The new Pod has a new IP. The container starts fresh. The Service routing updates.
A container restart is done by the kubelet on the node. A Pod “restart” is really a replacement: you or a controller delete the Pod and a new one is created.
kubectl rollout restart triggers a Pod restart (rolls the Deployment). kubectl delete pod triggers one only if a controller owns the Pod; a bare Pod is simply gone. Container restarts happen automatically.
8. terminationGracePeriodSeconds and Restart
terminationGracePeriodSeconds is the time the kubelet gives a container to shut down gracefully after SIGTERM.
spec:
  terminationGracePeriodSeconds: 30
  containers:
  - name: app
    image: app:1.0
    lifecycle:
      preStop:
        exec:
          command: ["/bin/sh", "-c", "sleep 5"]
When the kubelet wants to stop a container (for restart, eviction, etc.):
- Sends SIGTERM.
- Waits up to terminationGracePeriodSeconds (default 30s).
- Sends SIGKILL if the container hasn’t exited.
The container’s preStop hook (if any) runs before SIGTERM, and the time it takes counts against the grace period. The app should handle SIGTERM by closing connections, finishing in-flight work, and exiting.
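On the application side, handling SIGTERM usually means recording the request and letting the main loop wind down. The following is a minimal sketch using Python's standard signal module; the names stop_requested, handle_sigterm, and run are illustrative, and the "work" is a stand-in for real request handling.

```python
import signal
import threading

stop_requested = threading.Event()


def handle_sigterm(signum, frame):
    """Record the shutdown request; the main loop notices it and stops."""
    stop_requested.set()


# Register the handler (must be done from the main thread).
signal.signal(signal.SIGTERM, handle_sigterm)


def run(work_items):
    """Process items until done or until SIGTERM arrives; return the count."""
    done = 0
    for item in work_items:
        if stop_requested.is_set():
            break   # stop taking new work; earlier items are already finished
        done += 1   # stand-in for real work on `item`
    return done
```

A process that shuts down this way should finish well inside terminationGracePeriodSeconds. Note that exiting 0 after SIGTERM counts as success, so under OnFailure the container is not restarted.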
8.1 The interaction with restart
For container restart:
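- terminationGracePeriodSeconds is the time the kubelet waits for graceful shutdown when it stops a running container (for example after a liveness failure). A container that crashed on its own has nothing left to stop.
- The container gets SIGTERM, has 30s to exit, then SIGKILL.
- If the container exits within 30s, the kubelet starts the new container (after any restart backoff).
- If not, the kubelet SIGKILLs the old container and starts the new one.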
For Pod deletion (e.g. kubectl delete pod):
- Same as above. The Pod’s terminationGracePeriodSeconds applies.
9. livenessProbe and Restart
The livenessProbe is the kubelet’s check for “is this container still working”. If the probe fails repeatedly, the kubelet kills and restarts the container.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
If the probe fails 3 times in a row (roughly 30s), the kubelet terminates the container (SIGTERM first, then SIGKILL and exit code 137 if it does not stop within the grace period) and restarts it.
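How long a broken container keeps running before the kubelet decides to kill it follows from periodSeconds and failureThreshold. The sketch below is a back-of-the-envelope estimate only: it ignores timeoutSeconds and probe jitter, and liveness_kill_window is a made-up helper name.

```python
from typing import Tuple


def liveness_kill_window(period_seconds: int, failure_threshold: int) -> Tuple[int, int]:
    """Return (shortest, longest) seconds between the app breaking and the kill decision."""
    # Shortest: the app breaks right before a probe, so the first failure is immediate.
    shortest = (failure_threshold - 1) * period_seconds
    # Longest: the app breaks right after a successful probe.
    longest = failure_threshold * period_seconds
    return shortest, longest


print(liveness_kill_window(period_seconds=10, failure_threshold=3))
# (20, 30)
```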
9.1 livenessProbe vs readinessProbe
- livenessProbe — “is the container alive?” If no, restart. Restart-on-failure.
- readinessProbe — “is the container ready to serve traffic?” If no, remove from Service. Don’t restart.
Common pattern: a slow app takes a while to start. Use initialDelaySeconds, or a startupProbe, to give it time. The liveness probe shouldn’t fire during startup.
9.2 The liveness probe trap
A liveness probe that’s too strict will restart the container unnecessarily. A liveness probe that checks downstream dependencies (e.g. “is the database reachable?”) will restart the container when the DB has a hiccup — which doesn’t fix the DB issue, just thrashes the Pod.
Liveness probes should check “am I still functional”, not “are my dependencies up”. For dependency checks, use readiness probes.
10. Job’s restartPolicy and backoffLimit
A Job’s Pod template must set restartPolicy to OnFailure or Never; there is no usable default, because the Pod default (Always) is rejected for Jobs. The Job also has a backoffLimit:
apiVersion: batch/v1
kind: Job
metadata: { name: my-job }
spec:
  backoffLimit: 6 # allow up to 6 retries (the default)
  template:
    spec:
      restartPolicy: OnFailure
      containers:
      - name: worker
        image: worker:1.0
The Job controller:
- Creates a Pod.
- The Pod runs. If its container fails, the kubelet restarts the container in place (per OnFailure), with the usual backoff. With Never, the Pod is marked Failed instead and the Job controller creates a replacement Pod.
- The kubelet never gives up on its own. The Job controller counts the failures: failed Pods, plus container restarts when the policy is OnFailure.
- If the count reaches backoffLimit, the Job is marked as Failed. With OnFailure the still-running Pod is terminated at that point, which can take its logs with it; with Never the failed Pods are left behind for inspection.
10.1 The Pod failure policy (k8s 1.26+)
For more granular control:
apiVersion: batch/v1
kind: Job
metadata: { name: my-job }
spec:
  backoffLimit: 6
  podFailurePolicy:
    rules:
    - action: FailJob
      onExitCodes:
        containerName: worker
        operator: In
        values: [42] # exit 42 = fail the Job
    - action: Ignore
      onExitCodes:
        containerName: worker
        operator: In
        values: [137] # exit 137 = ignore (OOM is normal in our app)
    - action: Count
      onExitCodes:
        containerName: worker
        operator: In
        values: [1] # exit 1 = count toward backoffLimit
  template:
    spec:
      restartPolicy: Never # required when podFailurePolicy is used
      containers:
      - name: worker
        image: worker:1.0
This lets you distinguish “intentional failure” (exit 42 = fail) from “OOM” (exit 137 = ignore) from “transient error” (exit 1 = retry).
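Pod failure policy rules are evaluated in order; the first rule that matches decides the action, and a failure that matches no rule gets the default handling (it counts toward backoffLimit). The sketch below mimics that matching for the three rules in the example; it is a simplified model that looks only at one container's non-zero exit code, and RULES and action_for are illustrative names.

```python
# The three onExitCodes rules from the example manifest, in order.
RULES = [
    {"action": "FailJob", "operator": "In", "values": [42]},
    {"action": "Ignore", "operator": "In", "values": [137]},
    {"action": "Count", "operator": "In", "values": [1]},
]


def action_for(exit_code: int, rules=RULES) -> str:
    """Return the action of the first matching rule, or the default 'Count'."""
    for rule in rules:
        if rule["operator"] == "In":
            matched = exit_code in rule["values"]
        elif rule["operator"] == "NotIn":
            matched = exit_code not in rule["values"]
        else:
            raise ValueError(f"unknown operator: {rule['operator']}")
        if matched:
            return rule["action"]
    return "Count"  # no rule matched: default handling


for code in (42, 137, 1, 2):
    print(code, action_for(code))
```

Here exit code 2 matches no rule, so it is counted like an ordinary failure.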
11. StatefulSet and DaemonSet Restart Policy
11.1 StatefulSet
A StatefulSet’s Pods use restartPolicy: Always (the default). The StatefulSet controller creates Pods with stable identities. When a container in the Pod is restarted, the Pod and its identity are untouched.
If the Pod is deleted and recreated (Pod restart, not container restart), the StatefulSet controller creates a new Pod with the same name and ordinal. The new Pod reuses the same PVC.
11.2 DaemonSet
A DaemonSet’s Pods use restartPolicy: Always. The DS controller ensures one Pod per node. When a node dies, the DS Pod is gone; when the node returns, the DS Pod is recreated.
The DS controller also handles rolling updates — when the DS template changes, the controller rolls the Pods one at a time, respecting maxUnavailable and maxSurge.
12. Common Pitfalls
12.1 The “container keeps restarting” loop
A container that crashes immediately on every restart is in CrashLoopBackOff. The Pod is Running but the container isn’t.
Common causes:
- Bad config (missing env vars, wrong image, etc.)
- App startup error (database not reachable, port in use, etc.)
- Liveness probe failure (probe checks something that fails on startup)
Fix: kubectl logs <pod> --previous to see the previous container’s logs. The crash usually leaves a stack trace.
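Alongside the logs, the Pod status records why the previous container instance ended. The sketch below reads that from the JSON that kubectl get pod <pod> -o json produces, using the containerStatuses fields restartCount and lastState.terminated. The SAMPLE document is hand-written for illustration, and last_terminations is a made-up helper.

```python
import json

# Hand-written sample shaped like the status section of a Pod object.
SAMPLE = """
{
  "status": {
    "containerStatuses": [
      {
        "name": "app",
        "restartCount": 7,
        "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}}
      },
      {"name": "sidecar", "restartCount": 0, "lastState": {}}
    ]
  }
}
"""


def last_terminations(pod: dict) -> list:
    """Summarize restart count and last termination for each container."""
    results = []
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated")
        results.append({
            "container": status["name"],
            "restarts": status.get("restartCount", 0),
            "reason": terminated.get("reason") if terminated else None,
            "exitCode": terminated.get("exitCode") if terminated else None,
        })
    return results


for row in last_terminations(json.loads(SAMPLE)):
    print(row)
```

A container with a high restart count and reason OOMKilled points at memory limits rather than at an application bug.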
12.2 The “livenessProbe too strict” trap
A liveness probe that returns 503 during normal operation restarts the container unnecessarily. Liveness probes should be a stable check, not a “is everything perfect” check.
# too strict
livenessProbe:
  httpGet:
    path: /health/full # checks all dependencies
    port: 8080
# better
livenessProbe:
  httpGet:
    path: /health/live # checks only "am I running"
    port: 8080
readinessProbe:
  httpGet:
    path: /health/ready # checks "am I ready for traffic"
    port: 8080
12.3 The “OnFailure” + exit 0 trap
A container that exits 0 under OnFailure is not restarted. Once all containers have exited 0, the Pod’s phase is Succeeded. The container is stopped.
If the app accidentally exits 0 when it shouldn’t, the Pod is “succeeded” but not actually working. A liveness probe cannot help once the process has exited, so make sure the entrypoint exits non-zero on every failure path.
12.4 The “Always” + Job trap
A Job’s Pod template must use OnFailure or Never. With restartPolicy: Always, a container that succeeds (exit 0) would be restarted, so the Job could never complete; the API server rejects such a Job for that reason.
Don’t use Always for Jobs.
13. Operations and Debugging
13.1 Common commands
# check a Pod's restart count
kubectl get pod <pod>
# RESTARTS column shows the count
# check the previous container's logs
kubectl logs <pod> --previous
# check the events
kubectl describe pod <pod>
# look at the restart history in "Events"
# check the container's last state
kubectl get pod <pod> -o jsonpath='{.status.containerStatuses[*].lastState}'
# shows terminated: { reason: OOMKilled | Error | Completed, exitCode: 137 }
13.2 The “container restart loop” checklist
# 1. Why did the container exit?
kubectl logs <pod> --previous
# look for error messages, stack traces
# 2. Is the liveness probe killing it?
kubectl describe pod <pod>
# look at recent events: "Liveness probe failed"
# 3. Is the container OOM-killing?
kubectl describe pod <pod>
# look at "Last State": reason: OOMKilled
# 4. Is the restart count growing?
kubectl get pod <pod>
# RESTARTS column
# if it's growing, the container is in a restart loop
13.3 The “container won’t restart” case
A container exited and isn’t restarting. This happens with OnFailure and exit code 0, or with Never.
# check the Pod's phase
kubectl get pod <pod> -o jsonpath='{.status.phase}'
# Succeeded = all containers exited 0 and were not restarted (OnFailure or Never)
# Failed = a container exited non-zero and was not restarted (Never)
# Running = still running
14. Gotchas and Common Mistakes
14.1 The 25+ common mistakes
- Always is the default for any Pod without explicit restartPolicy. Even bare Pods use it. A Job’s Pod template must set OnFailure or Never explicitly.
- Init containers must run to completion. The Pod’s restartPolicy controls what happens when one fails: it is retried under Always or OnFailure, and the Pod fails under Never.
- OnFailure restarts on any non-zero exit code. Including 137 (OOM) and 143 (SIGTERM). If you don’t want to restart on OOM, use Never and handle restarts elsewhere.
- OnFailure does NOT restart on exit 0. The container is “succeeded” and the Pod’s phase is Succeeded. The Pod stays in the cluster (until the controller deletes it).
- Never does not retry. A single failure = Pod Failed. No automatic retry.
- The restart backoff caps at 5 min. A container in CrashLoopBackOff will restart every 5 min after the cap. Use this to your advantage: if you have a flaky app, the backoff prevents a tight restart loop.
- The backoff resets after 10 min of successful running. A container that runs for 10 min has its backoff cleared. The next crash starts a new backoff cycle.
- Container restart is not a Pod restart. The Pod’s IP is the same. Volumes are preserved, but the container’s own writable filesystem is not. External disruption is minimal.
- The kubelet is the only thing that restarts containers. The apiserver doesn’t. Other controllers don’t. The kubelet watches the container and restarts it locally.
- The kubelet’s restart is at the container level, not the Pod. For Pod-level restart, the controller (Deployment, etc.) creates a new Pod.
- A liveness probe failure leads to a kill. The kubelet terminates the container through the normal path (preStop hook, SIGTERM, then SIGKILL and exit 137 if the grace period runs out) and restarts it.
- The readiness probe failure is not a kill. The kubelet removes the Pod from the Service’s endpoints. The container keeps running.
- A liveness probe that’s too strict causes restarts. Use a stable check, not a “is everything perfect” check.
- The preStop hook shares the terminationGracePeriodSeconds budget. The preStop runs, then the kubelet sends SIGTERM, then waits for graceful exit; all of it must fit within the grace period.
- The container’s terminationMessagePath is a file (default /dev/termination-log) that the container can write a final message to. The kubelet reads it and shows it in kubectl describe pod. With terminationMessagePolicy: FallbackToLogsOnError, the tail of the container log is used instead when the file is empty and the container exited with an error.
- A Job with restartPolicy: Always is rejected. It would restart on success (exit 0) and never complete. Always use OnFailure or Never for Jobs.
- A Pod with restartPolicy: Never and an init container that fails ends up Failed. The init container is not retried. Delete and recreate the Pod to retry.
- A DaemonSet’s Pods use Always. When a node dies, the DS Pod dies. When the node returns, the DS Pod is recreated. The DS controller handles the recreation, not the kubelet’s restart.
- The kubelet’s restart is per-container. A multi-container Pod’s containers restart independently. If one container is in CrashLoopBackOff, the others can be running fine.
- The kubelet does not distinguish who killed a container. If a user kills the main process (e.g. kubectl exec ... kill 1) and the container exits, the restartPolicy applies as for any other exit.
- A liveness probe with initialDelaySeconds: 0 and failureThreshold: 1 is very aggressive. A slow-starting app will fail the probe immediately on startup, get killed, restart, fail again. The backoff helps, but the right fix is to give the app time to start.
- A Job’s backoffLimit counts failed Pods and, under OnFailure, container restarts as well. With restartPolicy: Never, each failed Pod is one failure; with OnFailure, a Pod that keeps restarting its container uses up the limit too.
- A CronJob’s successfulJobsHistoryLimit and failedJobsHistoryLimit cap the number of old Jobs kept. Default 3 and 1. Set higher if you want history.
- The container’s image is one of the few Pod fields that can be changed in place; the kubelet then restarts the container with the new image. Most other fields require recreating the Pod, and the restartPolicy doesn’t help there.
- The restartPolicy is set at Pod creation. You can’t change it on a running Pod. You have to delete and recreate the Pod.
See also
- Resource Requests & Limits — OOM-kill is a major cause of restarts
- Pods — what restart policy applies to
- ReplicaSets — the controllers that create Pods
- Services — readiness probes affect Service routing