Kubernetes Troubleshooting Guide: Diagnosing Pod Scheduling Failures

Quick Fix Summary

TL;DR

Check node resources, taints, and affinity rules using `kubectl describe pod` and `kubectl describe node`.

FailedScheduling occurs when the Kubernetes scheduler cannot find a suitable node on which to place a pod. The pod stays in Pending until scheduling succeeds, so resolving it requires investigating both node conditions and the pod's own requirements.

Diagnosis & Causes

  • Insufficient CPU or memory resources on all nodes
  • Node taints that pod tolerations don't match
  • Pod node affinity/anti-affinity rules too restrictive
  • No nodes with requested persistent volume access modes
  • Node selector labels don't match any available nodes
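
Before working through the recovery steps, a quick triage pass over these causes is to list the Pending pods and the scheduler events that explain them. A minimal sketch, using the same `<namespace>` placeholder as the rest of this guide:

```bash
# List pods the scheduler has not placed yet
kubectl get pods -n <namespace> --field-selector=status.phase=Pending

# Show the scheduler's stated reason for each failure
kubectl get events -n <namespace> --field-selector=reason=FailedScheduling --sort-by='.lastTimestamp'
```
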
Recovery Steps

Step 1: Get Detailed Scheduling Failure Reason

Use `kubectl describe` to see the exact scheduling failure message from the scheduler.

```bash
kubectl describe pod <pod-name> -n <namespace>
kubectl get events --field-selector involvedObject.name=<pod-name> --sort-by='.lastTimestamp'
```
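
To pull out only the scheduler's message, the sketch below filters events by reason. A typical message looks something like `0/3 nodes are available: 3 Insufficient cpu.`, though the exact wording and counts vary by cluster and Kubernetes version.

```bash
# Print just the FailedScheduling messages recorded for the pod
kubectl get events -n <namespace> \
  --field-selector involvedObject.name=<pod-name>,reason=FailedScheduling \
  -o jsonpath='{range .items[*]}{.message}{"\n"}{end}'
```
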
Step 2: Check Node Resource Availability

Examine node capacity and allocatable resources to identify resource constraints.

```bash
kubectl describe nodes
kubectl get nodes -o custom-columns='NAME:.metadata.name,CPU_ALLOCATABLE:.status.allocatable.cpu,MEM_ALLOCATABLE:.status.allocatable.memory,CPU_CAPACITY:.status.capacity.cpu,MEM_CAPACITY:.status.capacity.memory'
```
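
Allocatable capacity alone does not show how much is already committed. The scheduler compares a pod's resource requests against what remains unrequested on each node, so it also helps to look at the "Allocated resources" section of the node description:

```bash
# Requests and limits already committed on each node;
# compare the request totals against the allocatable values above
kubectl describe nodes | grep -A 8 "Allocated resources"
```

Keep in mind the scheduler works from requests, not live usage, so a node that looks idle in monitoring can still reject pods.
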
Step 3: Inspect Node Taints and Pod Tolerations

Compare node taints against pod tolerations to identify mismatches.

```bash
kubectl describe node <node-name> | grep -A 10 Taints
kubectl describe pod <pod-name> | grep -A 10 Tolerations
```
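
If a taint is the blocker, either remove it from the node or give the pod a matching toleration. The sketch below uses a hypothetical `dedicated=gpu:NoSchedule` taint purely for illustration:

```bash
# Option A: remove the taint from the node (the trailing "-" deletes it)
kubectl taint nodes <node-name> dedicated=gpu:NoSchedule-

# Option B: add a matching toleration to the pod spec instead:
#   tolerations:
#   - key: "dedicated"
#     operator: "Equal"
#     value: "gpu"
#     effect: "NoSchedule"
```
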
Step 4: Verify Node Selectors and Affinity Rules

Check whether the pod's nodeSelector or nodeAffinity rules match any available node.

```bash
kubectl get nodes --show-labels
kubectl describe pod <pod-name> | grep -A 20 -B 5 'Node-Selectors\|Affinity'
```
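
When no node carries the label a selector demands, either relax the selector or label a suitable node. `disktype=ssd` below is a hypothetical key/value standing in for whatever label the pod expects:

```bash
# Label a node so it satisfies the pod's nodeSelector (hypothetical label)
kubectl label nodes <node-name> disktype=ssd

# Confirm which nodes now match
kubectl get nodes -l disktype=ssd
```
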
Step 5: Check Persistent Volume Constraints

Verify persistent volume access modes and node availability for volume mounting.

```bash
kubectl get pv
kubectl get pvc -n <namespace>
kubectl describe storageclass
```
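
If a claim shows as Pending, its own events usually say why nothing has bound to it:

```bash
# The Events section at the bottom explains why the claim is still unbound
kubectl describe pvc <pvc-name> -n <namespace>
```
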
Step 6: Examine Scheduler Logs for Advanced Debugging

Access scheduler pod logs for detailed scheduling decision information.

```bash
kubectl logs -n kube-system -l component=kube-scheduler --tail=100
kubectl logs -n kube-system -l component=kube-scheduler --tail=100 | grep -i "<pod-name>"
```
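
The `component=kube-scheduler` label is typical of kubeadm-style clusters; managed distributions may label the scheduler differently or not expose it at all. If the selector returns nothing, look the pod up by name first:

```bash
# Locate the scheduler pod(s) when the label selector matches nothing
kubectl get pods -n kube-system | grep -i scheduler
kubectl logs -n kube-system <scheduler-pod-name> --tail=100
```
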
Step 7: Use kubectl debug to Simulate Scheduling

Create an ephemeral debugging pod on a specific node, or launch a plain test pod, to exercise scheduling constraints directly.

```bash
kubectl debug node/<node-name> -it --image=busybox
kubectl run test-pod --image=nginx --dry-run=client -o yaml | kubectl apply -f -
```
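
To isolate a single suspect constraint, you can also apply a minimal throwaway pod that carries only that constraint and see whether it reproduces the same FailedScheduling event. The pod name, the `disktype: ssd` selector, and the pause image below are all arbitrary placeholders for this sketch:

```bash
# Minimal test pod carrying one suspected constraint; adjust to your case
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: sched-test
spec:
  nodeSelector:
    disktype: ssd        # hypothetical label; substitute the selector under test
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
EOF

# Watch whether it schedules, then clean up
kubectl get pod sched-test -o wide
kubectl delete pod sched-test
```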

Architect's Pro Tip

"FailedScheduling with 'pod has unbound immediate PersistentVolumeClaims' means the pod references a PVC that uses Immediate volume binding and has not been bound yet, usually because no matching PersistentVolume exists or dynamic provisioning has not finished. Describe the PVC to find out why it is unbound, and only switch the StorageClass to volumeBindingMode: WaitForFirstConsumer if you deliberately want binding delayed until the pod is scheduled."

Frequently Asked Questions

What's the difference between FailedScheduling and Pending status?

Pending means the API server has accepted the pod but its containers are not yet running. FailedScheduling is a specific event recorded while the pod is Pending, indicating the scheduler could not find a suitable node for it.

How do I temporarily force a pod to schedule on a specific node?

Set `spec.nodeName: <target-node>` directly in the pod manifest and recreate the pod; nodeName cannot be patched onto an existing pod. This bypasses the scheduler entirely and should only be used for debugging.
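
A minimal sketch, assuming a throwaway pod named `pinned-test` and the pause image (both arbitrary choices):

```bash
# Debug-only: pin a pod to a node by name, skipping the scheduler
kubectl apply -f - <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: pinned-test
spec:
  nodeName: <target-node>   # replace with the exact node name
  containers:
  - name: pause
    image: registry.k8s.io/pause:3.9
EOF
```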

Can FailedScheduling be caused by pod security policies?

Not directly. PodSecurityPolicy (removed in Kubernetes 1.25 in favor of Pod Security Admission) rejected pods at admission time, before they ever reached the scheduler, so violations surface as creation failures, for example FailedCreate events on the owning ReplicaSet, rather than FailedScheduling. Check the controller's events and the kube-apiserver logs for the rejection reason.
