What CrashLoopBackOff Means
CrashLoopBackOff is not an error on its own. It is a status that tells you the container in a Pod started, exited, and Kubernetes is now waiting before it tries again. The kubelet restarts a failing container with an exponential backoff delay (10s, 20s, 40s, and so on, capped at 5 minutes). The "BackOff" part is the wait; the "CrashLoop" part is the repeated exit.
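To make the backoff schedule concrete, here is a minimal shell sketch that prints the wait before each restart. It assumes only the numbers quoted above (10s base, doubling, 5 minute cap); `backoff_delay` is a hypothetical helper name, not a kubectl feature.

```shell
# backoff_delay N: print the wait in seconds before restart number N (1-based),
# assuming a 10s base delay that doubles each time and is capped at 300s.
backoff_delay() {
  n=$1
  delay=10
  i=1
  while [ "$i" -lt "$n" ] && [ "$delay" -lt 300 ]; do
    delay=$((delay * 2))
    i=$((i + 1))
  done
  # Clamp the last doubling (320s) down to the 5 minute cap.
  if [ "$delay" -gt 300 ]; then
    delay=300
  fi
  echo "$delay"
}

for n in 1 2 3 4 5 6 7; do
  echo "restart $n: wait $(backoff_delay "$n")s"
done
```

Under these assumptions the cap is reached at the sixth restart, which is why a Pod that has been looping for a while appears to retry only once every five minutes.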
The key point: the container is doing exactly what you told it to do, then terminating. Your job is to find out why the process exits. The status itself never tells you the cause, so do not waste time staring at it. Go straight to the logs and events.
You can confirm the restart pattern with:
kubectl get pods -A | grep CrashLoop
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].restartCount}'
A restart count climbing every minute or two confirms the loop. The official Pod lifecycle documentation explains how the kubelet drives these restart states.
Diagnose the Root Cause
Work through these three signals in order. One of them almost always points at the cause.
Container logs
Start with the current and previous container logs. The previous logs are critical because the crashed instance is already gone and the restarted one may not have logged anything useful yet:
kubectl logs <pod-name>
kubectl logs <pod-name> --previous
kubectl logs <pod-name> -c <container-name> --previous
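The --previous flag shows output from the last terminated instance of the container. Use -c when the Pod has more than one container; without it, kubectl falls back to a default container (normally the first one listed), which may not be the one that is crashing.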
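The log commands above can be wrapped in a small function that prefers the crashed instance and falls back to the current one. This is only a sketch: `crash_logs` is a hypothetical helper, and it assumes kubectl is on the PATH and already pointed at the right cluster and namespace.

```shell
# crash_logs POD [CONTAINER]: print logs from the last crashed instance,
# falling back to the current instance when no previous one exists.
# Hypothetical helper; it uses only `kubectl logs`, `-c`, and `--previous`.
crash_logs() {
  pod=$1
  container=${2:-}
  if [ -n "$container" ]; then
    kubectl logs "$pod" -c "$container" --previous 2>/dev/null ||
      kubectl logs "$pod" -c "$container"
  else
    kubectl logs "$pod" --previous 2>/dev/null ||
      kubectl logs "$pod"
  fi
}
```

Call it as `crash_logs <pod-name>` or `crash_logs <pod-name> <container-name>`.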
Pod events and state
kubectl describe surfaces scheduling problems, image pull failures, probe failures, and the exact exit reason:
kubectl describe pod <pod-name>
Read the Events section at the bottom and the Last State block under the container status. Last State shows the Reason (for example Error or OOMKilled) and the Exit Code.
Exit codes
The exit code narrows the cause quickly:
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].lastState.terminated.exitCode}'
Common values:
- 0: the process exited cleanly. The container ran a short task and finished. Use a Job, not a Deployment, or keep the main process running.
- 1: a generic application error. Check the logs.
- 137: the process was killed by SIGKILL (128 + 9), usually because it was OOMKilled or because it did not shut down within its grace period after a failed liveness probe.
- 139: a segmentation fault (SIGSEGV, 128 + 11) inside the binary.
- 143: terminated by SIGTERM (128 + 15) during shutdown.
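Exit codes above 128 follow the shell convention of 128 plus the signal number, so unlisted values can be decoded the same way. The function below is a small sketch of that lookup; `explain_exit_code` is a hypothetical name and the wording of the messages is illustrative.

```shell
# explain_exit_code CODE: print a short interpretation of a container exit code.
# Codes above 128 are treated as 128 + signal number.
explain_exit_code() {
  code=$1
  case "$code" in
    0)   echo "exited cleanly" ;;
    1)   echo "generic application error" ;;
    137) echo "killed by SIGKILL (128 + 9)" ;;
    139) echo "segmentation fault, SIGSEGV (128 + 11)" ;;
    143) echo "terminated by SIGTERM (128 + 15)" ;;
    *)
      if [ "$code" -gt 128 ]; then
        echo "likely killed by signal $((code - 128))"
      else
        echo "application-defined exit code"
      fi
      ;;
  esac
}
```

Feed it the value returned by the jsonpath query, for example `explain_exit_code 137`.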
Common Causes and Fixes
Bad command or entrypoint
If there are no application logs at all, the container often never ran your code. A wrong command, args, or a binary that is not on the image PATH produces an immediate exit. Check what the manifest overrides:
kubectl get pod <pod-name> -o jsonpath='{.spec.containers[0].command}'
Fix the command and args in the manifest, or correct the ENTRYPOINT/CMD in the Dockerfile, then rebuild.
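One way to catch a binary that is missing from the PATH is to resolve it with `command -v` before deploying. The sketch below assumes the image ships a POSIX shell; `check_command` is a hypothetical helper name.

```shell
# check_command NAME: succeed and print the resolved path if NAME is on PATH,
# otherwise print a message to stderr and return 1.
check_command() {
  if path=$(command -v "$1"); then
    echo "found: $path"
  else
    echo "not found on PATH: $1" >&2
    return 1
  fi
}
```

To run the same check inside the image, something like `docker run --rm --entrypoint sh <image> -c 'command -v <binary>'` should work, provided the image contains `sh`.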
Missing config or secret
A container that crashes on startup looking for an environment variable or mounted file is usually missing a ConfigMap or Secret. If the reference itself is broken (the named ConfigMap, Secret, or key does not exist), the container never starts and the Pod reports CreateContainerConfigError instead of CrashLoopBackOff:
kubectl get configmap,secret -n <namespace>
kubectl describe pod <pod-name> | grep -A5 Events
Create the missing object or fix the name in envFrom, valueFrom, or the volume reference.
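When a Pod references several objects, checking them one by one is tedious. The following sketch loops over a list of references and reports which ones are missing. `check_refs` and the object names in the usage line are hypothetical; the only kubectl call it relies on is `kubectl get <kind>/<name> -n <namespace>`.

```shell
# check_refs NAMESPACE KIND/NAME...: report which referenced objects exist.
# Returns 1 if at least one object is missing, 0 otherwise.
check_refs() {
  ns=$1
  shift
  missing=0
  for ref in "$@"; do
    if kubectl get "$ref" -n "$ns" >/dev/null 2>&1; then
      echo "ok      $ref"
    else
      echo "MISSING $ref"
      missing=1
    fi
  done
  return "$missing"
}
```

Example: `check_refs <namespace> configmap/app-config secret/db-credentials`.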
Failing liveness or readiness probe
A liveness probe that fails repeatedly restarts the container, which looks identical to a crash loop. A failing readiness probe, by contrast, only removes the Pod from Service endpoints and does not restart anything. Look for Liveness probe failed in the events. The usual causes are a probe path that does not exist, a port mismatch, or an initialDelaySeconds that is too short for a slow-starting app.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
  failureThreshold: 3
For apps with long warmups, add a startupProbe so the liveness probe does not fire until startup completes.
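A startup probe gives the app up to failureThreshold × periodSeconds to come up before liveness checks begin. The sketch below derives a failureThreshold from a measured warmup time and prints a matching fragment. `startup_probe_yaml` is a hypothetical helper, and the /healthz path and port 8080 are simply reused from the liveness example as assumptions.

```shell
# startup_probe_yaml WARMUP_SECONDS [PERIOD_SECONDS]: print a startupProbe
# fragment whose budget (failureThreshold * periodSeconds) covers the warmup.
startup_probe_yaml() {
  warmup=$1
  period=${2:-10}
  # Ceiling division so the budget is never shorter than the warmup.
  threshold=$(( (warmup + period - 1) / period ))
  printf 'startupProbe:\n'
  printf '  httpGet:\n'
  printf '    path: /healthz\n'
  printf '    port: 8080\n'
  printf '  periodSeconds: %s\n' "$period"
  printf '  failureThreshold: %s\n' "$threshold"
}

startup_probe_yaml 120 10
```

For a 120 second warmup checked every 10 seconds this yields failureThreshold: 12. In practice you would likely add some headroom on top of the measured value.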
OOMKilled or resource limits
Exit code 137 with Reason: OOMKilled means the container exceeded its memory limit and the kernel killed it. The diagnosis flow is close enough to a standalone OOM kill that the step-by-step OOMKilled guide is worth following when memory is the trigger:
kubectl describe pod <pod-name> | grep -i oom
Raise resources.limits.memory, or fix the leak in the app. Set a requests value too so the scheduler places the Pod on a node with enough memory:
resources:
  requests:
    memory: "256Mi"
  limits:
    memory: "512Mi"
Dependency not ready
If the container exits because a database or upstream service is unreachable at boot, do not let it crash. Use an initContainer to wait for the dependency, or add retry logic in the app. An init container that blocks until the dependency answers keeps the main container from looping:
initContainers:
  - name: wait-for-db
    image: busybox:1.36
    command: ['sh', '-c', 'until nc -z db 5432; do sleep 2; done']
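For the retry-logic option, a bounded retry in an entrypoint wrapper script is one possible shape. The sketch below retries an arbitrary check a fixed number of times and then gives up, so a real outage still surfaces instead of hanging forever. `wait_for` is a hypothetical helper, and the attempt count and delay are values you would tune.

```shell
# wait_for MAX_ATTEMPTS DELAY_SECONDS COMMAND...: run COMMAND until it succeeds,
# sleeping DELAY_SECONDS between tries; return 1 after MAX_ATTEMPTS failures.
wait_for() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "gave up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

With the same check as the init container, usage might look like `wait_for 30 2 nc -z db 5432 && exec <your-app>`.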
Image issues
A wrong tag, a private registry without credentials, or a corrupt image shows as ImagePullBackOff or ErrImagePull rather than CrashLoopBackOff, but the two often get confused. Verify the image and pull secret:
kubectl describe pod <pod-name> | grep -i image
Fix the tag, or attach an imagePullSecrets entry to the Pod or ServiceAccount.
Verify the Fix
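After editing the manifest, apply it and wait for the rollout to finish. The rollout restart step is only needed when the Pod template itself did not change, for example after fixing just a ConfigMap or Secret: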
kubectl apply -f deployment.yaml
kubectl rollout restart deployment/<deployment-name>
kubectl rollout status deployment/<deployment-name>
Watch the new Pods reach Running and confirm the restart count stays flat:
kubectl get pods -w
kubectl get pod <pod-name> -o jsonpath='{.status.containerStatuses[0].restartCount}'
A restart count that holds steady for several minutes means the loop is broken. For a deeper walkthrough of the root causes behind a stuck container, see the CrashLoopBackOff root-cause breakdown.
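The "holds steady" check can be scripted by sampling the restart count twice. This is a rough sketch: `restarts_stable` is a hypothetical helper, it looks only at the first container in the Pod, and the interval you pass is an assumption about how long is long enough.

```shell
# restarts_stable POD INTERVAL_SECONDS: sample restartCount twice, INTERVAL
# seconds apart. Succeeds if unchanged, returns 1 if it grew, 2 on kubectl errors.
restarts_stable() {
  pod=$1
  interval=$2
  jp='{.status.containerStatuses[0].restartCount}'
  before=$(kubectl get pod "$pod" -o jsonpath="$jp") || return 2
  sleep "$interval"
  after=$(kubectl get pod "$pod" -o jsonpath="$jp") || return 2
  echo "restartCount: $before -> $after"
  [ "$before" = "$after" ]
}
```

For example, `restarts_stable <pod-name> 300` waits five minutes between samples, which is enough to span one full backoff cycle at the cap.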
Prevent It
- Set both requests and limits for CPU and memory so the scheduler and kernel behave predictably.
- Add a startupProbe for slow-starting apps and keep liveness probes lenient.
- Test the image locally with docker run before deploying so entrypoint and command errors surface early.
- Use initContainers to gate startup on real dependencies instead of crashing.
- Ship a real /healthz endpoint that reflects actual readiness, not just process liveness.
- Pin image tags to digests in production so a re-tagged image cannot break a running Deployment.