Fix Azure AKS Upgrade Cordon Errors
When you upgrade a node pool in Azure AKS, each node is cordoned and drained, and workloads that are not protected can go down while this happens. This guide explains the most common mistake with upgrade cordon and drain and shows the fix.
A Common Mistake
The most common mistake is upgrading a node pool while critical workloads have no PodDisruptionBudgets (PDBs), so draining a node disrupts them during node maintenance.
The problematic command, run with no PDBs in place:
az aks nodepool upgrade --cluster-name my-aks --resource-group my-rg --name pool1 --max-surge 1
Result:
Node pool upgrade with default drain behavior.
Pods without PDBs are evicted immediately. Multiple pods of the same application may be evicted simultaneously. If the app has only 2 replicas and both are evicted, downtime occurs.
The Correct Approach
The right way to configure upgrade cordon in Azure AKS:
az aks nodepool upgrade --cluster-name my-aks --resource-group my-rg --name pool1 --max-surge 1 --drain-timeout 30
# Ensure PDBs are configured for all critical workloads
Successful result:
Upgrade with PDB-aware draining.
kubectl get pdb
NAME MIN AVAILABLE CURRENT ALLOWED
my-app-pdb 2 3
Drain evicts one pod at a time respecting the PDB. At least 2 pods remain available throughout the upgrade.
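The arithmetic behind this can be sketched in shell. For a PDB with an integer minAvailable, the number of pods that may be evicted at once is the count of healthy pods minus minAvailable, and never less than zero. The function name allowed_disruptions is hypothetical and used only for illustration; it is not part of kubectl or the az CLI.

```shell
# allowed_disruptions HEALTHY_PODS MIN_AVAILABLE
# Hypothetical helper: prints how many pods may be evicted at once
# under a PDB with an integer minAvailable.
allowed_disruptions() {
  allowed=$(( $1 - $2 ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0
  fi
  echo "$allowed"
}

allowed_disruptions 3 2   # the my-app-pdb case above: prints 1
allowed_disruptions 2 2   # 2 replicas with minAvailable 2: prints 0
```

The second call shows a PDB that leaves no room for eviction. In that situation the drain has to wait, which is where --drain-timeout becomes relevant.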
How to Prevent This
Create PDBs for all production workloads. Set --max-surge to control how many extra nodes are created during the upgrade (for example, 1 to 3). Set --drain-timeout for long-running workloads. Monitor pod eviction events with kubectl get events. Test the upgrade process in staging.
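As a minimal sketch of the first step, the script below writes a PDB manifest that matches the my-app-pdb example. The file name my-app-pdb.yaml and the label app: my-app are assumptions made for illustration; use the labels your own pods carry.

```shell
# Write a PodDisruptionBudget manifest to a local file.
# Assumption: the application's pods are labeled app=my-app.
cat > my-app-pdb.yaml <<'EOF'
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: my-app-pdb
spec:
  minAvailable: 2        # keep at least 2 pods running during a drain
  selector:
    matchLabels:
      app: my-app        # assumed label; must match the pod template labels
EOF

# Show the manifest for review before applying it.
cat my-app-pdb.yaml
```

Once the manifest looks right, apply it with kubectl apply -f my-app-pdb.yaml and confirm it with kubectl get pdb my-app-pdb before starting the upgrade.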
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Secure your cloud with DodaTech.