
Fix Azure AKS Upgrade Cordon Errors

DodaTech Updated 2026-06-26 2 min read

When you upgrade an Azure AKS node pool, each node is cordoned and drained before it is replaced. If your workloads have no PodDisruptionBudgets (PDBs), that drain can take down every replica of an application at once. This guide explains that mistake and shows the fix.

A Common Mistake

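The most common mistake is draining nodes while the workloads on them have no PodDisruptionBudgets, which disrupts those workloads during node maintenance.

The command itself is valid; the problem is running it while critical workloads have no PDBs: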

az aks nodepool upgrade --cluster-name my-aks --resource-group my-rg --name pool1 --max-surge 1

What happens:

The node pool upgrade runs with the default drain behavior. Pods without PDBs are evicted immediately, and multiple pods of the same application may be evicted simultaneously. If the app has only 2 replicas and both are evicted, downtime occurs.

The Correct Approach

Configure PDBs for your critical workloads first, then run the upgrade with an explicit surge and drain timeout:

az aks nodepool upgrade --cluster-name my-aks --resource-group my-rg --name pool1 --max-surge 1 --drain-timeout 30
# Ensure PDBs are configured for all critical workloads
# --drain-timeout is given in minutes
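The PDB requirement noted in the comment above can be met from the command line. The following is a minimal sketch, assuming a workload labeled app=my-app (a hypothetical label) and a floor of 2 available pods; create_pdb is an illustrative helper name, not part of kubectl.

```shell
# create_pdb NAME SELECTOR MIN_AVAILABLE
# Hypothetical helper wrapping the "kubectl create poddisruptionbudget" command.
create_pdb() {
  kubectl create poddisruptionbudget "$1" \
    --selector="$2" \
    --min-available="$3"
}

# Example call (the label and the replica floor are assumptions):
#   create_pdb my-app-pdb app=my-app 2
```

A min-available of 2 only helps if the workload normally runs more than 2 replicas; otherwise the PDB permits no evictions and the drain cannot make progress.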

Successful result:

Upgrade with PDB-aware draining.
kubectl get pdb
NAME         MIN AVAILABLE   CURRENT ALLOWED
my-app-pdb   2               3
The drain evicts pods only as fast as the PDB allows, one at a time here, so at least 2 pods remain available throughout the upgrade.
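Before starting an upgrade, it can help to confirm that a PDB currently permits at least one eviction, since a PDB that allows none will stall the drain. A rough sketch, assuming the PDB name from the output above; pdb_allows_eviction is a hypothetical helper that reads the PDB's status.disruptionsAllowed field.

```shell
# pdb_allows_eviction NAME [NAMESPACE]
# Exits 0 when the PDB currently permits at least one eviction,
# 1 when it permits none, and 2 when kubectl itself fails.
pdb_allows_eviction() {
  allowed=$(kubectl get pdb "$1" -n "${2:-default}" \
    -o jsonpath='{.status.disruptionsAllowed}') || return 2
  [ "${allowed:-0}" -ge 1 ]
}

# Example call:
#   pdb_allows_eviction my-app-pdb && echo "safe to drain"
```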

How to Prevent This

Create PDBs for all production workloads. Set --max-surge to control how many extra nodes are created during the upgrade (1-3). Set --drain-timeout (in minutes) for long-running workloads. Monitor pod eviction events with kubectl get events. Test the upgrade process in staging.
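For the monitoring step, one possible approach is to list pod-related events in time order while the upgrade runs. This is only a sketch; pod_events is a hypothetical helper, and the namespace argument defaults to "default".

```shell
# pod_events [NAMESPACE]
# Lists events whose involved object is a Pod, oldest first.
pod_events() {
  kubectl get events -n "${1:-default}" \
    --field-selector involvedObject.kind=Pod \
    --sort-by=.lastTimestamp
}

# Example call:
#   pod_events default
```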

FAQ

Why does my upgrade cordon configuration fail in Azure AKS?

Configuration failures in Azure often stem from missing role assignments, incorrect resource IDs, region availability issues, or ARM template parameter errors. Run az aks nodepool upgrade --help to verify command syntax and parameter names. Check the Azure Activity Log for detailed error traces.

How do I debug upgrade cordon issues in Azure?

Use az monitor activity-log list to audit operations. For resource issues, use az resource show. For networking, use Network Watcher diagnostics. For role issues, check az role assignment list. Enable diagnostic settings for detailed logging. Use az rest to call Azure REST APIs directly for debugging.
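As an illustration of the activity-log audit, the sketch below lists failed operations in a resource group over a recent window. failed_operations is a hypothetical helper, the 2h default window is an assumption, and the JMESPath filter relies on the status.value, eventTimestamp, and operationName.value fields of activity log entries.

```shell
# failed_operations RESOURCE_GROUP [OFFSET]
# Shows failed Activity Log entries from the last OFFSET (default 2h).
failed_operations() {
  az monitor activity-log list \
    --resource-group "$1" \
    --offset "${2:-2h}" \
    --query "[?status.value=='Failed'].{time:eventTimestamp, operation:operationName.value}" \
    --output table
}

# Example call:
#   failed_operations my-rg 6h
```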

What are the best practices for upgrade cordon in Azure?

Use infrastructure-as-code (ARM, Terraform, Bicep) for all configurations. Tag resources for cost tracking and management. Use Azure Policy for governance. Enable diagnostic logs and monitoring. Follow the principle of least privilege for RBAC. Test in a non-production environment first. Review Azure Advisor recommendations regularly.


Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Secure your cloud with DodaTech.
