Kubernetes Pod Scheduling Troubleshooting Guide: Node Affinity & Taints
Quick Fix Summary
TL;DR: Check node affinity rules, taint tolerations, and resource availability on worker nodes.
Diagnosis & Causes
FailedScheduling occurs when the Kubernetes scheduler cannot find a suitable node for a pod. This is typically due to node selector mismatches, taint conflicts, or insufficient resources.
Recovery Steps
Step 1: Diagnose with kubectl describe pod
Examine the pod's events to see the scheduler's specific rejection reason.
kubectl describe pod <pod-name> -n <namespace>
# Look for 'Events:' section with 'FailedScheduling' message
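A typical rejection message looks like this (illustrative; node counts and reasons vary by cluster and Kubernetes version):
Warning  FailedScheduling  default-scheduler  0/3 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }, 2 node(s) didn't match Pod's node affinity/selector.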
Step 2: Check Node Affinity and Selector Conflicts
Verify that the pod's nodeSelector/affinity rules match actual node labels.
kubectl get pods <pod-name> -n <namespace> -o yaml | grep -A 10 -B 5 'nodeSelector\|affinity'
kubectl get nodes --show-labels
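If the labels don't line up, either relabel the node or fix the pod spec. A minimal sketch, assuming a hypothetical disktype=ssd label:
# Label a node so the selector below can match it
kubectl label nodes <node-name> disktype=ssd
# Pod spec fragment: schedule only onto nodes carrying that label
spec:
  nodeSelector:
    disktype: ssd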
Step 3: Inspect Node Taints and Pod Tolerations
Ensure the pod has tolerations for any taints on candidate nodes.
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 5 -B 2 tolerations
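A toleration must match the taint's key, value, and effect to neutralize it. A sketch for a hypothetical dedicated=gpu:NoSchedule taint:
# Pod spec fragment: tolerate the taint so tainted nodes become candidates again
spec:
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"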
Step 4: Verify Node Resource Availability
Check whether nodes have enough allocatable CPU/memory for the pod's requests.
kubectl describe nodes | grep -A 5 'Allocatable:'
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 3 'resources:'
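Note that the scheduler compares the pod's requests, not its live usage, against each node's allocatable capacity, so oversized requests alone can block scheduling. A sketch with hypothetical values:
# Pod spec fragment: the scheduler must fit these requests onto a single node
spec:
  containers:
  - name: app            # hypothetical container name
    image: nginx         # hypothetical image
    resources:
      requests:
        cpu: "500m"
        memory: "256Mi"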
Step 5: Examine Scheduler Logs for Detailed Reasoning
Access the kube-scheduler logs for granular scheduling decision details.
kubectl logs -n kube-system -l component=kube-scheduler --tail=50
# Filter for your pod's name or namespace
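On managed clusters where control-plane logs are not reachable, the scheduler's reasoning is still recorded as events, which you can query directly:
# List only FailedScheduling events in the namespace
kubectl get events -n <namespace> --field-selector reason=FailedScheduling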
Step 6: Use kubectl explain for Validation
Understand the exact schema for affinity and toleration fields.
kubectl explain pod.spec.affinity
kubectl explain pod.spec.tolerations
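To dump the full nested schema at once, --recursive expands every subfield:
kubectl explain pod.spec.affinity.nodeAffinity --recursive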
"For complex affinity rules, test with `kubectl apply --dry-run=server -f pod.yaml` and check scheduler logs immediately to see evaluation results."
Frequently Asked Questions
What's the difference between nodeSelector and nodeAffinity?
nodeSelector uses simple key-value matching, while nodeAffinity offers advanced operators (In, NotIn, Exists, DoesNotExist, Gt, Lt) and both requiredDuringScheduling and preferredDuringScheduling rules.
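For example, a disktype=ssd nodeSelector (labels hypothetical, as in Step 2) rewritten as nodeAffinity with a hard requirement plus a soft zone preference:
# Pod spec fragment: require disktype=ssd, prefer zone us-east-1a
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: disktype
            operator: In
            values: ["ssd"]
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: topology.kubernetes.io/zone
            operator: In
            values: ["us-east-1a"]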
Can taints and tolerations work with node affinity?
Yes, they complement each other. Taints first filter out nodes whose taints the pod does not tolerate; node affinity/selector rules then narrow the remaining candidate nodes.
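The common "dedicated nodes" pattern uses both; a sketch with a hypothetical dedicated=gpu key, where the taint keeps other pods off the node and a label gives the workload something to target:
# Repel untolerated pods from the node, then label it for targeting
kubectl taint nodes <node-name> dedicated=gpu:NoSchedule
kubectl label nodes <node-name> dedicated=gpu
# The pod needs both the toleration from Step 3 and a matching
# nodeSelector/affinity; a toleration alone does not attract it to the node.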
How do I temporarily force a pod to schedule?
Use `kubectl taint nodes <node-name> key=value:NoSchedule-` (note the trailing `-`) to remove a taint, or add matching tolerations to the pod spec. Never bypass the scheduler with `nodeName` in production; it skips the scheduler's affinity, taint, and resource checks entirely.
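For temporary debugging only, a blanket toleration matches every taint (a sketch; remove it once the root cause is fixed):
# Pod spec fragment: an empty key with operator Exists tolerates all taints
spec:
  tolerations:
  - operator: "Exists"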