Kubernetes Pod Scheduling Troubleshooting Guide: Node Affinity & Taints

Quick Fix Summary

TL;DR

Check node affinity rules, taint tolerations, and resource availability on worker nodes.

FailedScheduling occurs when the Kubernetes scheduler cannot find a suitable node for a pod. This is typically due to node selector mismatches, taint conflicts, or insufficient resources.

Diagnosis & Causes

  • Node affinity/selector rules don't match any node labels.
  • Pod lacks tolerations for node taints.
  • Insufficient CPU or memory on available nodes.
  • NodeSelector or nodeName specifies a non-existent node.
  • All nodes are cordoned or otherwise marked unschedulable (e.g., during a drain).

Recovery Steps

Step 1: Diagnose with kubectl describe pod

Examine the pod's events to see the scheduler's specific rejection reason.

```bash
kubectl describe pod <pod-name> -n <namespace>
# Look for the 'Events:' section with a 'FailedScheduling' message
```

Step 2: Check Node Affinity and Selector Conflicts

Verify that the pod's nodeSelector/affinity rules match actual node labels.

```bash
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 -B 5 'nodeSelector\|affinity'
kubectl get nodes --show-labels
```
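As a reference point, here is a minimal sketch of a pod spec using nodeSelector. The `disktype: ssd` label is an assumption for illustration; at least one node must actually carry it (e.g., applied with `kubectl label nodes <node-name> disktype=ssd`) or the pod will stay Pending:

```yaml
# Hypothetical example: this pod can only schedule onto nodes labeled disktype=ssd.
apiVersion: v1
kind: Pod
metadata:
  name: nginx-ssd
spec:
  nodeSelector:
    disktype: ssd        # must match a label shown by 'kubectl get nodes --show-labels'
  containers:
  - name: nginx
    image: nginx
```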

Step 3: Inspect Node Taints and Pod Tolerations

Ensure the pod has tolerations for any taints on candidate nodes.

```bash
kubectl get nodes -o custom-columns=NAME:.metadata.name,TAINTS:.spec.taints
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 5 -B 2 tolerations
```
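For example, if a candidate node carries the taint `dedicated=gpu:NoSchedule` (an assumed taint, for illustration only), the pod spec needs a matching toleration along these lines:

```yaml
# Hypothetical toleration matching the taint dedicated=gpu:NoSchedule.
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"
```

Note that a toleration only allows the pod onto the tainted node; it does not require it. Combine it with node affinity if you also want to steer the pod there.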

Step 4: Verify Node Resource Availability

Check whether nodes have enough allocatable CPU and memory for the pod's requests.

```bash
kubectl describe nodes | grep -A 5 'Allocatable:'
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 3 'resources:'
```
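If the pod's requests exceed what any node can allocate, lower the requests or add capacity. A sketch of a modest container resource block follows; the values are illustrative, not recommendations (the scheduler considers only `requests`, not `limits`):

```yaml
# Illustrative resource block; FailedScheduling is driven by 'requests' alone.
resources:
  requests:
    cpu: "250m"
    memory: "256Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```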

Step 5: Examine Scheduler Logs for Detailed Reasoning

Access the kube-scheduler logs for granular scheduling decision details.

```bash
kubectl logs -n kube-system -l component=kube-scheduler --tail=50
# Filter for your pod's name or namespace
```

Step 6: Use kubectl explain for Validation

Understand the exact schema for affinity and toleration fields.

```bash
kubectl explain pod.spec.affinity
kubectl explain pod.spec.tolerations
```

Architect's Pro Tip

"For complex affinity rules, validate the spec first with `kubectl apply --dry-run=server -f pod.yaml`; a server-side dry run does not actually schedule the pod, so apply it afterwards and check the pod's events and the scheduler logs to see how the rules were evaluated."

Frequently Asked Questions

What's the difference between nodeSelector and nodeAffinity?

nodeSelector uses simple key-value matching, while nodeAffinity offers advanced operators (In, NotIn, Exists, DoesNotExist, Gt, Lt) and both requiredDuringScheduling and preferredDuringScheduling rules.
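For instance, a simple `nodeSelector` of `disktype: ssd` can be rewritten as a required nodeAffinity term, with an optional preferred term layered on top. The labels and zone below are assumptions for illustration:

```yaml
# Hypothetical nodeAffinity: require disktype=ssd, prefer zone us-east-1a.
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values: ["ssd"]
    preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 1
      preference:
        matchExpressions:
        - key: topology.kubernetes.io/zone
          operator: In
          values: ["us-east-1a"]
```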

Can taints and tolerations work with node affinity?

Yes, they work together. Taints first repel any pod that lacks a matching toleration; node affinity/selector rules are then applied to the remaining candidate nodes.

How do I temporarily force a pod to schedule?

Use `kubectl taint nodes <node-name> key=value:NoSchedule-` to remove a taint, or add matching tolerations to the pod spec. Never bypass the scheduler with `nodeName` in production.
