Fix GCP GKE Node Taint Errors
On Google Kubernetes Engine (GKE), a node pool can be created with a taint that none of your pods tolerate, which leaves its nodes unused. This guide explains that common mistake and shows the fix.
A Common Mistake
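The mistake is adding a taint to a node pool without adding a matching toleration to the pods meant to run there. The scheduler then refuses to place any pod on the tainted nodes.
The command that creates the tainted pool (valid on its own, but incomplete without tolerations on the pods):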
gcloud container node-pools create gpu-pool \
  --cluster=my-cluster \
  --zone=us-central1-a \
  --machine-type=n1-standard-4 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --node-taints=accelerator=nvidia-tesla-t4:NoSchedule
Result:
The node pool is created with the taint applied. The command succeeds and gives no warning that nothing tolerates the taint.
Pods are not scheduled on these nodes:
kubectl get pods -o wide
All pods are on other node pools. The GPU nodes sit idle because no pods have tolerations for the taint `accelerator=nvidia-tesla-t4:NoSchedule`.
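To confirm that the taint is what keeps pods away, it can help to list the taints on the nodes of the pool. The following is a minimal sketch, not an official GKE tool: the function name list_pool_taints is made up for illustration, and it assumes kubectl is already pointed at the cluster and that the nodes carry the cloud.google.com/gke-nodepool label that GKE sets on node-pool members.

```shell
# List each node in a GKE node pool with its taint keys and effects.
# Usage: list_pool_taints <node-pool-name>
# list_pool_taints is a hypothetical helper name, not a kubectl feature.
list_pool_taints() {
  kubectl get nodes \
    -l "cloud.google.com/gke-nodepool=$1" \
    -o custom-columns='NAME:.metadata.name,TAINT_KEYS:.spec.taints[*].key,EFFECTS:.spec.taints[*].effect'
}
```

Calling list_pool_taints gpu-pool should show the accelerator key with the NoSchedule effect on every node of the pool.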
The Correct Approach
Add a toleration that matches the taint to each pod that should run on the GPU nodes:
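kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: gpu-job
spec:
  tolerations:
  # Must match the node pool taint accelerator=nvidia-tesla-t4:NoSchedule
  - key: accelerator
    operator: Equal
    value: nvidia-tesla-t4
    effect: NoSchedule
  containers:
  - name: gpu
    image: nvidia/cuda:11.0-base
    resources:
      limits:
        nvidia.com/gpu: 1  # request one GPU so the pod is placed on a GPU node
EOF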
Successful result:
pod/gpu-job created
kubectl get pods -o wide
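The NODE column shows gpu-job on a node in gpu-pool. The toleration matches the node taint, so the scheduler is allowed to place the pod there. Note that a toleration only permits scheduling on tainted nodes; it does not by itself force the pod onto them.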
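The placement can also be checked without reading the wide output by eye. The sketch below looks up the node a pod was bound to and prints that node's pool label. The helper name pod_node_pool is hypothetical, and the sketch assumes the pod is in the current namespace and that the node carries the cloud.google.com/gke-nodepool label.

```shell
# Print the GKE node pool that a pod was scheduled into.
# Usage: pod_node_pool <pod-name>
# pod_node_pool is a hypothetical helper name used for illustration.
pod_node_pool() {
  # spec.nodeName is empty until the scheduler has bound the pod to a node.
  node=$(kubectl get pod "$1" -o jsonpath='{.spec.nodeName}') || return 1
  if [ -z "$node" ]; then
    echo "pod $1 is not scheduled yet" >&2
    return 1
  fi
  # Dots inside the label key must be escaped in kubectl JSONPath.
  kubectl get node "$node" \
    -o jsonpath='{.metadata.labels.cloud\.google\.com/gke-nodepool}'
  echo
}
```

For the pod above, pod_node_pool gpu-job is expected to print gpu-pool once the pod has been scheduled.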
How to Prevent This
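Always add tolerations to pods that need to run on tainted nodes. Run kubectl describe node <node-name> and check the Taints field to see which taints a node carries. The taint effects are NoSchedule (new pods without a matching toleration are not scheduled), PreferNoSchedule (a soft version: the scheduler avoids the node but may still use it), and NoExecute (running pods without a matching toleration are evicted as well). Taints only keep pods away, so combine them with node affinity to steer the intended pods onto the tainted nodes.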
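As an illustration of combining the two mechanisms, the sketch below writes a Deployment manifest in which node affinity selects the pool and the toleration lets the pods past the taint. Several names are assumptions made for the example and do not come from the setup above: the file name gpu-deployment.yaml, the Deployment name and label gpu-worker, and the sleep command used as a placeholder workload. It also assumes the pool is still named gpu-pool.

```shell
# Write a Deployment manifest that pairs node affinity (attract to gpu-pool)
# with a toleration (permit scheduling despite the taint).
# File name, Deployment name, labels, and command are illustrative placeholders.
cat > gpu-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gpu-worker
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-worker
  template:
    metadata:
      labels:
        app: gpu-worker
    spec:
      affinity:
        nodeAffinity:
          # Hard requirement: only nodes from gpu-pool are eligible.
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                - gpu-pool
      tolerations:
      # Matches the taint accelerator=nvidia-tesla-t4:NoSchedule.
      - key: accelerator
        operator: Equal
        value: nvidia-tesla-t4
        effect: NoSchedule
      containers:
      - name: gpu
        image: nvidia/cuda:11.0-base
        command: ["sleep", "infinity"]  # placeholder workload
        resources:
          limits:
            nvidia.com/gpu: 1
EOF
```

The manifest can then be applied with kubectl apply -f gpu-deployment.yaml. With the affinity rule in place, the pods stay Pending rather than landing on another pool if gpu-pool has no free capacity, which may or may not be the behavior you want.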
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Secure your cloud with DodaTech.