Node Tuning & Performance Optimization
In this tutorial, you'll learn about Node Tuning & Performance Optimization. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.
Node tuning optimizes Kubernetes worker nodes for specific workloads through CPU pinning, huge pages, NUMA topology awareness, and kernel parameter tuning for maximum performance.
What You'll Learn
This tutorial covers CPU Manager policies for pinning, huge pages allocation, Topology Manager for NUMA alignment, Node Tuning Operator, and kernel parameter optimization.
Why It Matters
Untuned nodes can leave as much as 30 to 50 percent of performance on the table, depending on the workload. For latency-sensitive applications like financial trading or real-time video processing, tuning is essential for consistent performance.
Real-World Use
Bloomberg tunes Kubernetes nodes with CPU pinning and huge pages for their financial data platform, achieving sub-millisecond processing latency. Netflix uses tuned nodes for video encoding workloads.
CPU Manager Policies
The CPU Manager controls CPU affinity for containers.
Static Policy
The static policy pins containers to specific CPU cores.
# kubelet configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
cpuManagerPolicy: static
cpuManagerReconcilePeriod: 5s
reservedSystemCPUs: "0,1"
kubeReserved:
  cpu: "2"
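# Verify the CPU Manager policy from the kubelet's live configuration
kubectl get --raw "/api/v1/nodes/worker-1/proxy/configz" | grep -o '"cpuManagerPolicy":"[^"]*"'
# Check CPU assignments for a pod (cgroup v1 path)
kubectl exec pod-name -- cat /sys/fs/cgroup/cpuset/cpuset.cpus
# On cgroup v2 nodes, read the effective cpuset instead
kubectl exec pod-name -- cat /sys/fs/cgroup/cpuset.cpus.effective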
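The cpuset file holds a compact CPU list such as "2-3" or "0-1,8". As a minimal sketch, the helper below expands that list and compares its size with the number of CPUs the container requested. The function names are illustrative and not part of any Kubernetes tooling, and the sample strings are made up.

```python
def parse_cpu_list(text: str) -> list[int]:
    """Expand a kernel CPU list such as "2-3,8" into [2, 3, 8]."""
    cpus: set[int] = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file lists no CPUs
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def looks_pinned(cpuset: str, requested_cpus: int) -> bool:
    """True when the cpuset holds exactly as many CPUs as were requested.

    Assumption: an unpinned container sees the node's whole shared pool,
    which is normally larger than its own request.
    """
    return len(parse_cpu_list(cpuset)) == requested_cpus


if __name__ == "__main__":
    print(parse_cpu_list("2-3"))      # CPUs reported for a pinned container
    print(looks_pinned("0-15", 2))    # a shared-pool cpuset is not pinned
```

Paste the output of the cat command into parse_cpu_list to see exactly which cores the container may run on.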
Guaranteed QoS for CPU Pinning
Pods must be in the Guaranteed QoS class for their containers to receive pinned CPUs.
apiVersion: v1
kind: Pod
metadata:
  name: latency-critical
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        cpu: "2"
        memory: "1Gi"
      limits:
        cpu: "2"
        memory: "1Gi"
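When every container's CPU and memory requests equal its limits, the pod is in the Guaranteed QoS class. A container in such a pod that requests a whole number of CPUs receives dedicated cores; one with a fractional CPU request runs in the shared pool.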
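The eligibility rule can be sketched as a small check on a container's resources block. This is a simplified illustration, not the kubelet's code: the function names are hypothetical, memory quantities are compared as plain strings, and a pod is only Guaranteed when every one of its containers passes the same test.

```python
from fractions import Fraction


def parse_cpu(quantity: str) -> Fraction:
    """Convert a Kubernetes CPU quantity ("2", "1.5", "500m") to cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return Fraction(int(quantity[:-1]), 1000)  # millicores
    return Fraction(quantity)


def gets_exclusive_cpus(resources: dict) -> bool:
    """Check one container: requests == limits and a whole number of CPUs."""
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    if "cpu" not in limits or "memory" not in limits:
        return False
    # A missing request defaults to the limit.
    cpu_request = parse_cpu(requests.get("cpu", limits["cpu"]))
    cpu_limit = parse_cpu(limits["cpu"])
    if cpu_request != cpu_limit:
        return False
    # Simplification: memory is compared as a string, so "1Gi" != "1024Mi".
    if requests.get("memory", limits["memory"]) != limits["memory"]:
        return False
    return cpu_limit >= 1 and cpu_limit.denominator == 1


if __name__ == "__main__":
    pod_resources = {
        "requests": {"cpu": "2", "memory": "1Gi"},
        "limits": {"cpu": "2", "memory": "1Gi"},
    }
    print(gets_exclusive_cpus(pod_resources))
```

The resources dictionary mirrors the latency-critical pod above, so the same check can be run against any container spec loaded from a manifest.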
Huge Pages
Huge pages reduce TLB misses for memory-intensive workloads.
Configuring Huge Pages on Nodes
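# Configure 2MB huge pages (run as root)
echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
# Configure 1GB huge pages (may allocate fewer at runtime if memory is fragmented)
echo 4 > /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
# Make persistent across reboots (vm.nr_hugepages covers the default page size only)
echo 'vm.nr_hugepages=1024' >> /etc/sysctl.conf
# For 1GB pages, reserve at boot with kernel parameters instead:
# hugepagesz=1G hugepages=4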
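The page counts above follow from simple arithmetic: 1024 pages of 2 MiB make a 2 GiB pool, and 4 pages of 1 GiB make a 4 GiB pool. A minimal sketch of that calculation, with illustrative helper names, is:

```python
# Binary suffixes used by Kubernetes quantities.
UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def to_bytes(quantity: str) -> int:
    """Convert a quantity such as "512Mi" or "1Gi" to bytes."""
    for suffix, factor in UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # a bare number is taken as bytes


def pages_needed(pool_size: str, page_size: str) -> int:
    """Number of pages for a pool, rounded up so the pool is never short."""
    pool, page = to_bytes(pool_size), to_bytes(page_size)
    return -(-pool // page)  # ceiling division on integers


def sysfs_dir(page_size: str) -> str:
    """Name of the per-size directory under /sys/kernel/mm/hugepages."""
    return f"hugepages-{to_bytes(page_size) // 1024}kB"


if __name__ == "__main__":
    for pool, page in (("2Gi", "2Mi"), ("4Gi", "1Gi")):
        print(sysfs_dir(page), pages_needed(pool, page))
```

Size the pool from the sum of the huge page requests you expect to schedule on the node, since pages reserved here are no longer available as ordinary memory.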
Requesting Huge Pages in Pods
apiVersion: v1
kind: Pod
metadata:
  name: huge-page-app
spec:
  containers:
  - name: app
    image: myapp:1.0
    resources:
      requests:
        memory: "1Gi"
        hugepages-2Mi: "512Mi"
      limits:
        hugepages-2Mi: "512Mi"
    volumeMounts:
    - mountPath: /hugepages
      name: hugepage
  volumes:
  - name: hugepage
    emptyDir:
      medium: HugePages
Topology Manager
The Topology Manager aligns CPU, memory, and device assignments to NUMA nodes.
# kubelet configuration
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
topologyManagerPolicy: single-numa-node
topologyManagerScope: container
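The single-numa-node policy admits a pod only when all of its requested resources can be allocated from one NUMA node, which avoids cross-NUMA memory access latency; a pod that cannot be aligned is rejected. CPU alignment takes effect only when the static CPU Manager policy is also enabled.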
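To make the alignment idea concrete, here is a toy model of the decision. It is a deliberate simplification: the real Topology Manager merges hints from several resource managers, whereas this sketch looks at CPUs only. The two-node topology and the function names are hypothetical.

```python
from typing import Dict, List, Optional, Set

# Hypothetical two-socket node: NUMA node ID -> CPUs that belong to it.
TOPOLOGY: Dict[int, Set[int]] = {0: set(range(0, 16)), 1: set(range(16, 32))}


def numa_nodes_spanned(cpus: List[int], topology: Dict[int, Set[int]]) -> List[int]:
    """NUMA nodes that hold at least one of the given CPUs."""
    wanted = set(cpus)
    return sorted(node for node, members in topology.items() if members & wanted)


def pick_single_node(cpus_needed: int, free: Dict[int, Set[int]]) -> Optional[int]:
    """Lowest-numbered NUMA node with enough free CPUs, or None to reject."""
    for node in sorted(free):
        if len(free[node]) >= cpus_needed:
            return node
    return None


if __name__ == "__main__":
    # CPUs 15 and 16 sit on different nodes, so this assignment is not aligned.
    print(numa_nodes_spanned([15, 16], TOPOLOGY))
    # Only node 1 has four free CPUs left in this made-up state.
    print(pick_single_node(4, {0: {0, 1}, 1: {16, 17, 18, 19}}))
```

numa_nodes_spanned is also handy for checking a running container: feed it the CPUs from the container's cpuset and the node layout reported by lscpu.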
Node Tuning Operator
The Node Tuning Operator, which ships with OpenShift, manages node-level sysctls and kernel settings through TuneD profiles.
apiVersion: tuned.openshift.io/v1
kind: Tuned
metadata:
  name: latency-performance
  namespace: openshift-cluster-node-tuning-operator
spec:
  profile:
  - data: |
      [main]
      summary=Optimize for latency performance
      include=latency-performance
      [sysctl]
      vm.dirty_ratio=10
      vm.dirty_background_ratio=5
      vm.swappiness=10
      kernel.numa_balancing=0
      net.core.busy_read=50
      net.core.busy_poll=50
    # Illustrative name; it must differ from the included built-in profile
    name: custom-latency
  recommend:
  - match:
    - label: tuned.openshift.io/latency
      value: "true"
    priority: 20
    profile: custom-latency
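Once a profile is applied, it helps to confirm that the sysctls actually took effect on the node. The sketch below parses the [sysctl] section of the profile data with Python's configparser and compares each value with what /proc/sys reports. The helper names are illustrative, and the reader function is passed in so the comparison can be tried without touching a real node.

```python
import configparser

PROFILE_DATA = """\
[main]
summary=Optimize for latency performance
include=latency-performance
[sysctl]
vm.dirty_ratio=10
vm.dirty_background_ratio=5
vm.swappiness=10
kernel.numa_balancing=0
net.core.busy_read=50
net.core.busy_poll=50
"""


def sysctl_settings(profile_data: str) -> dict:
    """Return the key/value pairs from the profile's [sysctl] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(profile_data)
    return dict(parser["sysctl"]) if parser.has_section("sysctl") else {}


def sysctl_path(key: str) -> str:
    """Map a sysctl name to its file, e.g. vm.swappiness -> /proc/sys/vm/swappiness."""
    return "/proc/sys/" + key.replace(".", "/")


def read_proc(path: str):
    """Read one value from the filesystem, or None if it is unavailable."""
    try:
        with open(path) as handle:
            return handle.read().strip()
    except OSError:
        return None


def find_drift(expected: dict, read_value=read_proc) -> dict:
    """Return {key: (expected, actual)} for every setting that differs."""
    mismatches = {}
    for key, want in expected.items():
        got = read_value(sysctl_path(key))
        if got != want:
            mismatches[key] = (want, got)
    return mismatches


if __name__ == "__main__":
    for key, (want, got) in find_drift(sysctl_settings(PROFILE_DATA)).items():
        print(f"{key}: expected {want}, found {got}")
```

Run it on the node itself (or in a privileged debug pod) so that /proc/sys reflects the host's settings rather than a container's view.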
Node-level Tuning Commands
# Check NUMA topology
lscpu | grep -i numa
# View CPU frequency governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Set performance governor (writing to sysfs requires root)
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# Check huge page allocation
grep -i huge /proc/meminfo
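The huge page lines in /proc/meminfo describe the default page size only; other sizes are reported under /sys/kernel/mm/hugepages. As a small sketch, the parser below turns those lines into numbers and works out how much of the pool is in use. The sample text is made up for illustration, and the function names are not from any library.

```python
# Illustrative sample, shaped like the relevant lines of /proc/meminfo.
SAMPLE_MEMINFO = """\
MemTotal:       65843212 kB
AnonHugePages:    206848 kB
HugePages_Total:    1024
HugePages_Free:      768
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
"""


def hugepage_stats(meminfo_text: str) -> dict:
    """Extract the HugePages_* counters and the page size (in kB)."""
    stats = {}
    for line in meminfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.startswith("HugePages_") or key == "Hugepagesize":
            stats[key] = int(value.split()[0])
    return stats


def bytes_in_use(stats: dict) -> int:
    """Bytes held by allocated (non-free) huge pages of the default size."""
    used_pages = stats["HugePages_Total"] - stats["HugePages_Free"]
    return used_pages * stats["Hugepagesize"] * 1024


if __name__ == "__main__":
    stats = hugepage_stats(SAMPLE_MEMINFO)
    print(stats)
    print(bytes_in_use(stats) // 1024**2, "MiB in use")
```

On a real node, pass the contents of /proc/meminfo instead of the sample string.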
Practice Questions
What does the static CPU Manager policy do? It gives containers in Guaranteed QoS pods that request whole CPUs exclusive use of specific cores, which reduces CPU migrations and contention from other workloads.
Why use huge pages for databases or data processing? Large page sizes reduce TLB misses, improving memory access performance for large working sets.
What is the purpose of the Topology Manager? It aligns CPU, memory, and device allocations to the same NUMA node for optimal performance.
How do you verify CPU pinning is working for a pod? Read the cpuset from inside the container: /sys/fs/cgroup/cpuset/cpuset.cpus on cgroup v1, or /sys/fs/cgroup/cpuset.cpus.effective on cgroup v2. A pinned container lists only its dedicated cores.
What kernel parameter reduces swap usage on Kubernetes nodes? vm.swappiness; setting it to 10 reduces the kernel's tendency to swap under memory pressure.