AWS EKS: Fix Intermittent Pod Evictions due to Resource Exhaustion in Multi-Tenant Clusters
Quick Fix Summary
TL;DR: Scale up the affected node group and cordon/drain the impacted nodes.
Diagnosis & Causes
Intermittent pod evictions in multi-tenant EKS clusters are typically caused by resource contention (memory or CPU) at the node level, often due to misconfigured resource requests/limits, noisy neighbors, or insufficient node capacity.
Recovery Steps
Step 1: Diagnose the Root Cause
Identify evicted pods and check node resource pressure. Determine if it's OOM (memory) or CPU throttling.
kubectl get pods --all-namespaces --field-selector=status.phase=Failed -o wide
kubectl describe node <node-name> | grep -A 10 -B 5 "MemoryPressure\|DiskPressure\|PIDPressure"
kubectl top nodes
kubectl top pods --all-namespaces --containers
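It can also help to list recent eviction events cluster-wide and check whether a specific container was OOM-killed (pod and namespace names are placeholders):
kubectl get events --all-namespaces --field-selector reason=Evicted --sort-by=.lastTimestamp
kubectl describe pod <pod-name> -n <namespace> | grep -i -A 3 "last state"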
Step 2: Check Kubelet Logs on Affected Node
SSH into the problematic worker node (if using managed node groups, use SSM Session Manager) and examine kubelet logs for eviction messages.
aws ssm start-session --target <ec2-instance-id>
sudo journalctl -u kubelet --since "1 hour ago" | grep -i evict
Step 3: Analyze Pod Resource Configuration
Review resource requests and limits for pods on the affected node. Look for pods with no limits or unrealistic requests.
kubectl describe pod <pod-name> -n <namespace> | grep -A 5 -B 5 "Limits\|Requests"
kubectl get pods -n <namespace> -o jsonpath="{range .items[*]}{.metadata.name}{'\t'}{.spec.containers[*].resources}{'\n'}{end}"
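To flag offenders quickly, the following one-liner lists pods in a namespace where at least one container has no memory limit (assumes jq is installed; the namespace is a placeholder):
kubectl get pods -n <namespace> -o json | jq -r '.items[] | select(any(.spec.containers[]; .resources.limits.memory == null)) | .metadata.name'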
Step 4: Immediate Mitigation - Cordon and Drain Node
Prevent new pods from scheduling onto the pressured node and safely evict existing workloads.
kubectl cordon <node-name>
kubectl drain <node-name> --ignore-daemonsets --delete-emptydir-data
Step 5: Scale Node Group for Immediate Capacity
Increase the desired count of your Amazon EKS managed node group to add healthy nodes.
aws eks update-nodegroup-config --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> --scaling-config desiredSize=<new-size>
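The new desiredSize must fall within the node group's minSize and maxSize; you can confirm the current scaling configuration first:
aws eks describe-nodegroup --cluster-name <cluster-name> --nodegroup-name <nodegroup-name> --query 'nodegroup.scalingConfig'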
Step 6: Implement Resource Quotas and LimitRanges
Enforce resource constraints at the namespace level to prevent any single tenant from consuming all node resources.
kubectl create quota <quota-name> --hard=requests.cpu=2,requests.memory=4Gi,limits.cpu=4,limits.memory=8Gi -n <namespace>
Unlike quotas, a LimitRange has no kubectl create shortcut; apply it from a manifest, as sketched below.
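A minimal LimitRange manifest along those lines (the object name and namespace are placeholders; the default, defaultRequest, and max values are examples to tune per tenant):
kubectl apply -n <namespace> -f - <<EOF
apiVersion: v1
kind: LimitRange
metadata:
  name: <limitrange-name>
spec:
  limits:
  - type: Container
    default:            # limit applied when a container sets none
      cpu: 500m
      memory: 1Gi
    defaultRequest:     # request applied when a container sets none
      cpu: 200m
      memory: 512Mi
    max:                # hard per-container ceiling
      cpu: "2"
      memory: 4Gi
EOF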
Step 7: Configure Pod Disruption Budgets (PDBs)
Protect critical applications from excessive disruption during node drain operations.
kubectl apply -f - <<EOF
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: <pdb-name>
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: <your-app-label>
EOF
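Once applied, confirm the budget can actually be satisfied before draining; ALLOWED DISRUPTIONS should be at least 1:
kubectl get pdb <pdb-name>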
Step 8: Enable and Review Cluster Autoscaler Logs
Ensure the Cluster Autoscaler is functioning correctly and scaling proactively, not reactively.
kubectl logs -f deployment/cluster-autoscaler -n kube-system | grep -E "ScaleUp|NotTriggerScaleUp"
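The autoscaler also publishes a status ConfigMap (named cluster-autoscaler-status in kube-system by default), which is often quicker to scan than the log stream:
kubectl describe configmap cluster-autoscaler-status -n kube-system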
Step 9: Implement Vertical Pod Autoscaler (VPA)
For dynamic workloads, use VPA to automatically adjust pod requests/limits based on historical consumption (use in recommendation mode for production).
git clone https://github.com/kubernetes/autoscaler.git
cd autoscaler/vertical-pod-autoscaler
./hack/vpa-up.sh
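A minimal sketch of a VPA object in recommendation-only mode, targeting a hypothetical Deployment (name and namespace are placeholders):
kubectl apply -n <namespace> -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: <vpa-name>
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: <deployment-name>
  updatePolicy:
    updateMode: "Off"   # recommendation-only; the updater never restarts pods
EOF
Recommendations then appear in the object's status, e.g. via kubectl describe vpa <vpa-name> -n <namespace>.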
"The most common cause is not setting memory limits. A pod without a memory limit can consume all node memory, triggering the kubelet to evict other pods. Always set limits, and ensure 'requests' are based on the 95th percentile of actual usage, not peak."
Frequently Asked Questions
Should I always set CPU/Memory limits?
Yes, especially in multi-tenant clusters. Limits prevent a single pod from starving others. For memory, they are critical to prevent node-level OOM events. For CPU, limits cap bursting (usage above the limit is throttled rather than causing evictions) and can be set higher than requests for burstable workloads.
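Requests and limits also determine a pod's QoS class, which factors into eviction ordering under node pressure; BestEffort pods (no requests or limits) are typically reclaimed first. To check a pod's class (pod and namespace are placeholders):
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.status.qosClass}'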
Why does draining a node sometimes fail?
Draining fails if pods are not backed by a controller (e.g., bare Pods), have a PodDisruptionBudget that cannot be satisfied, or use local storage (emptyDir) without the `--delete-emptydir-data` flag. Always check PDBs and pod ownership first.
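To see what is blocking a drain, list the budgets and check what controls the pod (names are placeholders):
kubectl get pdb --all-namespaces
kubectl describe pod <pod-name> -n <namespace> | grep "Controlled By"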
Cluster Autoscaler isn't scaling out fast enough during a spike. What can I do?
The autoscaler acts on unschedulable pods, which means capacity is only added after pods are already pending. To get ahead of spikes, reserve headroom with low-priority over-provisioning pods, or use Karpenter for faster, more responsive node provisioning driven by pending pod requirements.
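A common over-provisioning pattern is a low-priority placeholder deployment that reserves headroom and is preempted as soon as real workloads need the space; a minimal sketch (replica count and request sizes are illustrative):
kubectl apply -f - <<EOF
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10                # lower than the default (0), so real workloads preempt these pods
globalDefault: false
description: "Placeholder pods reserving spare capacity"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      app: overprovisioning
  template:
    metadata:
      labels:
        app: overprovisioning
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: reserve
        image: registry.k8s.io/pause:3.9
        resources:
          requests:       # size these to the headroom you want per replica
            cpu: 500m
            memory: 1Gi
EOF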