Kubernetes Security Contexts Guide — Pod and Container Security
This tutorial covers Kubernetes security contexts: the key concepts, practical examples, and best practices you need to apply them effectively.
Kubernetes Security Contexts define privilege and access control settings for pods and containers, including user IDs, Linux capabilities, SELinux labels, and seccomp profiles.
What You'll Learn
You'll master security contexts — configuring runAsUser and fsGroup, dropping and adding Linux capabilities, enabling seccomp and AppArmor profiles, preventing privilege escalation, and enforcing PodSecurity standards.
Why This Problem Matters
Many container images run as root by default. If an attacker exploits a vulnerability in your application, they get root access to the container and potentially the host. Security contexts restrict what a container can do, limiting the blast radius and making container breakout harder.
Real-World Use
Durga Antivirus Pro runs scanning containers with strict security contexts: read-only root filesystem, non-root user (UID 1001), all capabilities dropped except NET_BIND_SERVICE, and a seccomp profile that blocks 80% of system calls.
Security Context Levels
flowchart TB
    Pod[Pod Security Context] --> PSC[Pod-level settings]
    Pod --> CSC[Container-level settings<br/>Overrides pod-level]
    subgraph PodSettings
        PU[runAsUser]
        PG[runAsGroup]
        FS[fsGroup]
        SEL[SELinux]
        NS[Namespace options]
    end
    subgraph ContainerSettings
        CU[runAsUser]
        CG[runAsGroup]
        CAP[Capabilities<br/>Add/Drop]
        PE[privileged]
        ES[allowPrivilegeEscalation]
        RO[readOnlyRootFilesystem]
        SEC[seccompProfile]
        APP[AppArmor]
    end
    Pod --> PodSettings
    Pod --> ContainerSettings
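The override rule in the diagram can be sketched in a few lines of Python. This is only an illustration of the merge logic, not how the kubelet implements it: the helper name effective_security_context and the SHARED_FIELDS list are assumptions made for this example, and the list is not exhaustive.

```python
from typing import Any, Dict

# Fields that exist at both levels; the container-level value wins when set.
# Illustrative only, not a complete list of shared fields.
SHARED_FIELDS = ("runAsUser", "runAsGroup", "runAsNonRoot",
                 "seLinuxOptions", "seccompProfile")


def effective_security_context(pod_sc: Dict[str, Any],
                               container_sc: Dict[str, Any]) -> Dict[str, Any]:
    """Return the settings one container ends up with (hypothetical helper)."""
    # Start from the pod-level values that containers can inherit.
    effective = {k: v for k, v in pod_sc.items() if k in SHARED_FIELDS}
    # Container-level values override inherited ones and add their own.
    effective.update(container_sc)
    return effective


pod_sc = {"runAsUser": 1001, "runAsGroup": 3001, "fsGroup": 2001}
container_sc = {"runAsUser": 2000, "allowPrivilegeEscalation": False}
print(effective_security_context(pod_sc, container_sc))
# {'runAsUser': 2000, 'runAsGroup': 3001, 'allowPrivilegeEscalation': False}
```

Note that fsGroup does not appear in the result: it is a pod-only setting that affects volume ownership rather than a per-container value.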
Non-Root User
apiVersion: v1
kind: Pod
metadata:
  name: non-root-pod
spec:
  securityContext:
    runAsUser: 1001
    runAsGroup: 3001
    fsGroup: 2001
  containers:
  - name: app
    image: nginx:alpine
    securityContext:
      runAsUser: 1001
      runAsGroup: 3001
      allowPrivilegeEscalation: false
    ports:
    - containerPort: 80
kubectl apply -f non-root-pod.yaml
kubectl exec non-root-pod -- whoami
kubectl exec non-root-pod -- id
Expected output:
whoami: unknown uid 1001
uid=1001 gid=3001 groups=2001,3001
Linux Capabilities
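apiVersion: apps/v1
kind: Deployment
metadata:
  name: minimal-capabilities
spec:
  selector:                        # required for apps/v1 Deployments
    matchLabels:
      app: minimal-capabilities    # label chosen for this example
  template:
    metadata:
      labels:
        app: minimal-capabilities
    spec:
      containers:
      - name: nginx
        image: nginx:alpine
        securityContext:
          capabilities:
            drop:
            - ALL
            add:
            - NET_BIND_SERVICE
            - CHOWN
            - SETGID
            - SETUID
        ports:
        - containerPort: 80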
import subprocess


def check_container_capabilities(pod_name: str, namespace: str = "default") -> list:
    """Read the capability bounding set of PID 1 inside a running pod."""
    cmd = [
        "kubectl", "exec", pod_name, "-n", namespace, "--",
        "cat", "/proc/1/status",
    ]
    result = subprocess.run(cmd, capture_output=True, text=True)
    for line in result.stdout.split("\n"):
        if line.startswith("CapBnd"):
            # The value after the tab is a hexadecimal bitmask.
            cap_mask = line.split("\t")[1]
            return parse_capabilities(int(cap_mask, 16))
    return []


def parse_capabilities(mask: int) -> list:
    """Decode a capability bitmask into names.

    Only bits 0-13 are listed, so higher-numbered capabilities are not reported.
    """
    caps = {
        0: "CHOWN", 1: "DAC_OVERRIDE", 2: "DAC_READ_SEARCH",
        3: "FOWNER", 4: "FSETID", 5: "KILL", 6: "SETGID",
        7: "SETUID", 8: "SETPCAP", 9: "LINUX_IMMUTABLE",
        10: "NET_BIND_SERVICE", 11: "NET_BROADCAST",
        12: "NET_ADMIN", 13: "NET_RAW",
    }
    active = []
    for i, name in caps.items():
        if mask & (1 << i):
            active.append(name)
    return active


print("Running as root with all capabilities...")
all_caps = parse_capabilities(0xFFFFFFFFFFFFFFFF)
print(f" {len(all_caps)} capabilities: {', '.join(all_caps[:5])}...")
print("Running as non-root with NET_BIND_SERVICE only...")
min_caps = parse_capabilities(1 << 10)
print(f" {len(min_caps)} capabilities: {', '.join(min_caps)}")
Expected output:
Running as root with all capabilities...
14 capabilities: CHOWN, DAC_OVERRIDE, DAC_READ_SEARCH, FOWNER, FSETID...
Running as non-root with NET_BIND_SERVICE only...
1 capabilities: NET_BIND_SERVICE
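To see why the manifest above drops ALL before adding anything, it can help to model how drop and add combine. The sketch below is a rough approximation: DEFAULT_CAPS assumes the 14 capabilities that Docker-style runtimes typically grant (the real set depends on your runtime and its configuration), and effective_caps is a hypothetical helper, not a Kubernetes API.

```python
# Assumed default set for Docker-style runtimes; your runtime may differ.
DEFAULT_CAPS = frozenset({
    "AUDIT_WRITE", "CHOWN", "DAC_OVERRIDE", "FOWNER", "FSETID", "KILL",
    "MKNOD", "NET_BIND_SERVICE", "NET_RAW", "SETFCAP", "SETGID",
    "SETPCAP", "SETUID", "SYS_CHROOT",
})


def effective_caps(drop=(), add=()):
    """Approximate the capability set after applying drop, then add."""
    # Dropping ALL empties the set; otherwise remove the named capabilities.
    caps = set() if "ALL" in drop else set(DEFAULT_CAPS) - set(drop)
    return sorted(caps | set(add))


# Adding without dropping keeps every default capability.
print(len(effective_caps(add=["NET_ADMIN"])))                  # 15
# Dropping ALL first leaves only what was added back.
print(effective_caps(drop=["ALL"], add=["NET_BIND_SERVICE"]))  # ['NET_BIND_SERVICE']
```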
Read-Only Root Filesystem
apiVersion: v1
kind: Pod
metadata:
  name: read-only-pod
spec:
  containers:
  - name: app
    image: nginx:alpine
    securityContext:
      readOnlyRootFilesystem: true
    volumeMounts:
    - name: tmp
      mountPath: /tmp
    - name: nginx-run
      mountPath: /var/run
    - name: nginx-cache
      mountPath: /var/cache/nginx
  volumes:
  - name: tmp
    emptyDir: {}
  - name: nginx-run
    emptyDir: {}
  - name: nginx-cache
    emptyDir: {}
kubectl exec read-only-pod -- touch /test.txt
Expected output:
touch: /test.txt: Read-only file system
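With a read-only root filesystem, the only writable locations are the ones you mount explicitly. A small sketch like the following can list them from a pod spec so you can review what stays writable. The function name writable_paths is an assumption for this example, and it only considers emptyDir volumes, as in the manifest above.

```python
def writable_paths(pod_spec: dict, container_name: str) -> list:
    """List mount paths backed by emptyDir volumes that are not read-only."""
    empty_dirs = {v["name"] for v in pod_spec.get("volumes", [])
                  if "emptyDir" in v}
    for container in pod_spec.get("containers", []):
        if container["name"] == container_name:
            return sorted(
                m["mountPath"] for m in container.get("volumeMounts", [])
                if m["name"] in empty_dirs and not m.get("readOnly", False)
            )
    return []


# The spec of read-only-pod, written as a Python dict.
spec = {
    "containers": [{
        "name": "app",
        "securityContext": {"readOnlyRootFilesystem": True},
        "volumeMounts": [
            {"name": "tmp", "mountPath": "/tmp"},
            {"name": "nginx-run", "mountPath": "/var/run"},
            {"name": "nginx-cache", "mountPath": "/var/cache/nginx"},
        ],
    }],
    "volumes": [
        {"name": "tmp", "emptyDir": {}},
        {"name": "nginx-run", "emptyDir": {}},
        {"name": "nginx-cache", "emptyDir": {}},
    ],
}
print(writable_paths(spec, "app"))
# ['/tmp', '/var/cache/nginx', '/var/run']
```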
Seccomp Profile
apiVersion: v1
kind: Pod
metadata:
  name: seccomp-pod
spec:
  securityContext:
    seccompProfile:
      type: Localhost
      localhostProfile: profiles/audit.json
  containers:
  - name: app
    image: nginx:alpine
    securityContext:
      seccompProfile:
        type: RuntimeDefault # Uses container runtime's default
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {
      "names": ["read", "write", "open", "close", "stat",
                "mmap", "munmap", "brk", "exit_group"],
      "action": "SCMP_ACT_ALLOW"
    }
  ]
}
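The profile above is an allowlist: anything not named falls through to defaultAction. The following sketch shows that lookup in Python. It is deliberately simplified (it ignores argument filters and architecture handling), and seccomp_action is a hypothetical helper rather than part of any seccomp library.

```python
def seccomp_action(profile: dict, syscall: str) -> str:
    """Return the action a profile assigns to a syscall name.

    Simplified: ignores per-argument filters and architecture handling.
    """
    for rule in profile.get("syscalls", []):
        if syscall in rule.get("names", []):
            return rule["action"]
    # No rule matched, so the default applies.
    return profile["defaultAction"]


profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "architectures": ["SCMP_ARCH_X86_64"],
    "syscalls": [{
        "names": ["read", "write", "open", "close", "stat",
                  "mmap", "munmap", "brk", "exit_group"],
        "action": "SCMP_ACT_ALLOW",
    }],
}
print(seccomp_action(profile, "read"))    # SCMP_ACT_ALLOW
print(seccomp_action(profile, "ptrace"))  # SCMP_ACT_ERRNO
```

Running a lookup like this against the syscalls your application actually makes is one way to spot an allowlist that is too short before deploying it.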
AppArmor Profile
apiVersion: v1
kind: Pod
metadata:
  name: apparmor-pod
  annotations:
    container.apparmor.security.beta.kubernetes.io/app: localhost/k8s-apparmor-profile
spec:
  containers:
  - name: app
    image: nginx:alpine
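Newer Kubernetes releases (v1.30 and later) also accept an appArmorProfile field inside securityContext, which is meant to replace the beta annotation used above. As a rough sketch of how the two forms correspond, the hypothetical helper below translates an annotation value into the field's structure. Check the documentation for your cluster version before relying on either form.

```python
def apparmor_field(annotation_value: str) -> dict:
    """Translate a beta AppArmor annotation value into appArmorProfile form."""
    if annotation_value == "runtime/default":
        return {"type": "RuntimeDefault"}
    if annotation_value == "unconfined":
        return {"type": "Unconfined"}
    prefix = "localhost/"
    if annotation_value.startswith(prefix):
        return {"type": "Localhost",
                "localhostProfile": annotation_value[len(prefix):]}
    raise ValueError(f"unrecognized AppArmor annotation: {annotation_value!r}")


print(apparmor_field("localhost/k8s-apparmor-profile"))
# {'type': 'Localhost', 'localhostProfile': 'k8s-apparmor-profile'}
```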
PodSecurity Admission (PSA)
# PodSecurity admission is built in and enabled by default since Kubernetes v1.23
# (stable in v1.25). To enable it explicitly, add to kube-apiserver args:
# --enable-admission-plugins=MutatingAdmissionWebhook,ValidatingAdmissionWebhook,PodSecurity
# Label namespace with security level
kubectl label namespace production pod-security.kubernetes.io/enforce=restricted
# Check which existing pods would violate a level, without changing any labels
kubectl label --dry-run=server --overwrite namespace --all pod-security.kubernetes.io/enforce=baseline
# Pod Security Standards
# Privileged: unrestricted (applies when no level is set)
# Baseline: minimally restrictive (prevents known escalations)
# Restricted: heavily restrictive (follows pod hardening best practices)
apiVersion: v1
kind: Namespace
metadata:
  name: production
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/audit: baseline
    pod-security.kubernetes.io/warn: baseline
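If you generate namespace manifests from scripts, a typo in a mode or level is easy to miss. The sketch below builds the label set shown above and rejects unknown values. The helper name psa_labels is an assumption for this example; the modes and levels are the ones listed in this section.

```python
LEVELS = ("privileged", "baseline", "restricted")
MODES = ("enforce", "audit", "warn")


def psa_labels(**modes: str) -> dict:
    """Build pod-security.kubernetes.io labels, rejecting unknown values."""
    labels = {}
    for mode, level in modes.items():
        if mode not in MODES:
            raise ValueError(f"unknown mode: {mode}")
        if level not in LEVELS:
            raise ValueError(f"unknown level: {level}")
        labels[f"pod-security.kubernetes.io/{mode}"] = level
    return labels


for key, value in psa_labels(enforce="restricted", audit="baseline",
                             warn="baseline").items():
    print(f"{key}: {value}")
# pod-security.kubernetes.io/enforce: restricted
# pod-security.kubernetes.io/audit: baseline
# pod-security.kubernetes.io/warn: baseline
```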
Security Context Validation
class SecurityContextValidator:
    def __init__(self):
        self.warnings = []
        self.errors = []

    def validate_pod(self, pod_manifest: dict):
        # Reset findings so one validator can be reused across manifests.
        self.warnings = []
        self.errors = []
        spec = pod_manifest.get("spec", {})
        # Check pod-level security context.
        # An unset runAsUser is treated as root here; the real UID would
        # then depend on the image.
        sc = spec.get("securityContext", {})
        run_as = sc.get("runAsUser", 0)
        if run_as == 0:
            self.errors.append("Pod runs as root (runAsUser: 0)")
        # Check containers
        for container in spec.get("containers", []):
            csc = container.get("securityContext", {})
            if csc.get("privileged", False):
                self.errors.append(
                    f"Container {container['name']} is privileged"
                )
            if csc.get("allowPrivilegeEscalation") is not False:
                self.warnings.append(
                    f"Container {container['name']}: "
                    "allowPrivilegeEscalation not set to false"
                )
            caps = csc.get("capabilities", {})
            dropped = caps.get("drop", [])
            if "ALL" not in dropped:
                self.warnings.append(
                    f"Container {container['name']}: "
                    "not dropping ALL capabilities"
                )
            if not csc.get("readOnlyRootFilesystem", False):
                self.warnings.append(
                    f"Container {container['name']}: "
                    "readOnlyRootFilesystem not set"
                )
        return {
            "pass": len(self.errors) == 0,
            "errors": self.errors,
            "warnings": self.warnings
        }


validator = SecurityContextValidator()
bad_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {
        "securityContext": {"runAsUser": 0},
        "containers": [{
            "name": "web",
            "image": "nginx",
            "securityContext": {
                "privileged": True
            }
        }]
    }
}
good_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "spec": {
        "securityContext": {"runAsUser": 1001},
        "containers": [{
            "name": "web",
            "image": "nginx",
            "securityContext": {
                "runAsNonRoot": True,
                "allowPrivilegeEscalation": False,
                "capabilities": {"drop": ["ALL"], "add": ["NET_BIND_SERVICE"]},
                "readOnlyRootFilesystem": True,
                "seccompProfile": {"type": "RuntimeDefault"}
            }
        }]
    }
}
print("Bad pod validation:")
result = validator.validate_pod(bad_pod)
for err in result["errors"]:
    print(f" ERROR: {err}")
for warn in result["warnings"]:
    print(f" WARN: {warn}")
print("\nGood pod validation:")
result = validator.validate_pod(good_pod)
for err in result["errors"]:
    print(f" ERROR: {err}")
for warn in result["warnings"]:
    print(f" WARN: {warn}")
Expected output:
Bad pod validation:
 ERROR: Pod runs as root (runAsUser: 0)
 ERROR: Container web is privileged
 WARN: Container web: allowPrivilegeEscalation not set to false
 WARN: Container web: not dropping ALL capabilities
 WARN: Container web: readOnlyRootFilesystem not set

Good pod validation:
(no errors or warnings)
Common Mistakes
1. Setting runAsUser Without runAsGroup
Setting only runAsUser leaves the group as root. Files created by the container are owned by the user but the group remains root (0). Always set runAsGroup and fsGroup together.
2. Privileged Container for Debugging
Running a container as privileged (privileged: true) gives it access to all host devices and capabilities. Use kubectl debug or ephemeral containers for debugging instead.
3. Not Dropping All Capabilities First
capabilities.add: ["NET_ADMIN"] without dropping ALL first keeps all default capabilities plus NET_ADMIN. Always drop: ["ALL"] first, then add only what's needed.
4. Forgetting fsGroup for Volume Access
When using persistentVolumeClaim, the container may not have write permission to the mounted volume. Set fsGroup to the group that owns the volume's mount path.
5. Seccomp Profile Too Restrictive
A custom seccomp profile may block system calls required by the container runtime or application. Start with the RuntimeDefault profile (provided by containerd and CRI-O) and customize only if needed.
6. No PodSecurity Admission
Without PSA, there's no enforcement of security standards across namespaces. Developers can accidentally deploy privileged pods. Enforce at least baseline level in production namespaces.
7. Running Init Containers as Root
Init containers often run as root to set up permissions. Keep that privilege confined to the init container: set runAsUser: 0 only in its container-level securityContext, and make sure the main containers still run as non-root.
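Several of these mistakes come down to not knowing which containers, including init containers, end up running as root. The sketch below reports every container whose resolved runAsUser is 0, applying the pod-level value when the container sets none. The function name is hypothetical, and containers with no UID set anywhere are not flagged because their user then depends on the image.

```python
def containers_running_as_root(pod_spec: dict) -> list:
    """Return (kind, name) pairs for containers whose resolved runAsUser is 0."""
    pod_uid = pod_spec.get("securityContext", {}).get("runAsUser")
    found = []
    for kind in ("initContainers", "containers"):
        for container in pod_spec.get(kind, []):
            # The container-level value overrides the pod-level one.
            uid = container.get("securityContext", {}).get("runAsUser", pod_uid)
            if uid == 0:
                found.append((kind, container["name"]))
    return found


spec = {
    "securityContext": {"runAsUser": 1001},
    "initContainers": [
        {"name": "fix-perms", "securityContext": {"runAsUser": 0}},
    ],
    "containers": [{"name": "app"}],
}
print(containers_running_as_root(spec))
# [('initContainers', 'fix-perms')]
```

Here only the init container is reported, which matches the intended setup: root for the permission fix, non-root for the application.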
Practice Questions
1. What is the difference between pod-level and container-level security contexts?
Pod-level security context applies to all containers in the pod and to volume permissions (fsGroup). Container-level overrides the pod-level for specific containers. Use pod-level for shared settings (fsGroup, SELinux) and container-level for per-container settings (capabilities).
2. Why drop ALL capabilities and add only what's needed?
By default, Docker-style runtimes grant containers 14 capabilities. Most applications need only one or two: NET_BIND_SERVICE to bind to ports below 1024, and CHOWN/SETGID/SETUID for file permissions. Dropping unnecessary capabilities reduces the attack surface.
3. What does readOnlyRootFilesystem protect against?
It prevents the application from writing to the container's root filesystem, which blocks common attack vectors such as dropping malicious scripts next to the application or modifying binaries. Only directories explicitly mounted as volumes, such as emptyDir, remain writable.
4. How does allowPrivilegeEscalation=false affect a container?
It prevents a process in the container from gaining more privileges than its parent process by setting the no_new_privs flag. This blocks escalation from non-root to root through setuid binaries or file capabilities within the container.
5. Challenge: Design security contexts for a CI/CD runner pod.
A CI/CD runner needs to build Docker images (requires /var/run/docker.sock), clone git repositories, install packages (apt), and run tests. Design the minimum set of capabilities and security context settings needed, explaining the security trade-offs.
Mini Project: Pod Security Linter
import os

import yaml  # PyYAML


class PodSecurityLinter:
    def __init__(self):
        self.findings = []

    def lint_file(self, filepath: str):
        with open(filepath) as f:
            docs = list(yaml.safe_load_all(f))
        for doc in docs:
            if not doc or doc.get("kind") not in ("Pod", "Deployment",
                                                  "StatefulSet", "DaemonSet"):
                continue
            self._lint_pod_spec(doc.get("spec", {}), filepath)

    def _lint_pod_spec(self, spec: dict, filepath: str):
        # Check template spec for Deployment/StatefulSet/DaemonSet
        template_spec = spec.get("template", {}).get("spec", spec)
        sc = template_spec.get("securityContext", {})
        if sc.get("runAsUser") == 0:
            self.findings.append({
                "file": filepath,
                "severity": "ERROR",
                "message": "Pod/container runs as root"
            })
        for container in template_spec.get("containers", []):
            csc = container.get("securityContext", {})
            if csc.get("privileged"):
                self.findings.append({
                    "file": filepath,
                    "severity": "ERROR",
                    "message": f"Container {container['name']} is privileged"
                })
            if csc.get("allowPrivilegeEscalation") is not False:
                self.findings.append({
                    "file": filepath,
                    "severity": "WARN",
                    "message": f"Container {container['name']}: "
                               "allowPrivilegeEscalation not false"
                })

    def report(self):
        print(f"{'Severity':>8} | {'File':<35} | Message")
        print("-" * 85)
        for f in self.findings:
            print(f"{f['severity']:>8} | {f['file']:<35} | {f['message']}")


linter = PodSecurityLinter()
test_manifests = [
    {"apiVersion": "v1", "kind": "Pod",
     "spec": {"securityContext": {"runAsUser": 0},
              "containers": [{"name": "web", "image": "nginx",
                              "securityContext": {"privileged": True}}]}},
    {"apiVersion": "v1", "kind": "Pod",
     "spec": {"securityContext": {"runAsUser": 1001},
              "containers": [{"name": "web", "image": "nginx",
                              "securityContext": {
                                  "allowPrivilegeEscalation": False,
                                  "capabilities": {"drop": ["ALL"]},
                                  "readOnlyRootFilesystem": True
                              }}]}},
]
# Write each manifest to a temporary file, lint it, then clean up.
for i, manifest in enumerate(test_manifests):
    path = f"test-manifest-{i}.yaml"
    with open(path, "w") as f:
        yaml.dump(manifest, f)
    linter.lint_file(path)
    os.remove(path)
linter.report()
Expected output:
Severity | File                                | Message
-------------------------------------------------------------------------------------
   ERROR | test-manifest-0.yaml                | Pod/container runs as root
   ERROR | test-manifest-0.yaml                | Container web is privileged
    WARN | test-manifest-0.yaml                | Container web: allowPrivilegeEscalation not false
What's Next
Congratulations on completing this security contexts guide! Here's where to go from here:
- Practice daily — Add security contexts to every pod you run
- Build a project — Set up PodSecurity admission and fix violations across namespaces
- Explore related topics — SELinux policies, seccomp profile generation, Pod Security Standards, OPA Gatekeeper
- Join the community — Share your security configurations and get feedback
Remember: every expert was once a beginner. Keep securing!
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro