Kubernetes CRDs Guide — Custom Resource Definitions
This guide covers the key concepts behind Kubernetes Custom Resource Definitions, with practical examples and best practices for applying them.
Custom Resource Definitions (CRDs) extend the Kubernetes API by letting you define your own resource types, which Kubernetes treats as first-class citizens with full API server support.
What You'll Learn
You'll learn to define custom resources with OpenAPI v3 schemas, validation, subresources (status and scale), versioning with conversion, and additional printer columns for kubectl output.
Why This Problem Matters
Kubernetes ships with built-in resources (Pods, Services, Deployments), but every application has unique configuration needs. CRDs let you model your application's domain concepts as Kubernetes-native resources with validation, defaulting, and API conventions.
Real-World Use
Doda Browser uses CRDs to define browser extension configurations, malware signature update policies, and antivirus scan schedules — all managed through kubectl with full validation and status reporting.
CRD Structure
flowchart TB
    subgraph CRDDefinition
        Group[apiGroup: dodatech.io]
        Version[v1, v2beta1]
        Kind[MalwareScan]
        Scope[Namespaced]
    end
    subgraph OpenAPIv3Schema
        Spec[spec]
        Status[status]
        Props[properties]
        Valid[validation rules]
    end
    subgraph APIBehavior
        Create[POST /apis/dodatech.io/v1/...]
        List[GET ...]
        Watch[Watch]
        StatusSub[Status Subresource]
    end
    CRDDefinition --> OpenAPIv3Schema
    CRDDefinition --> APIBehavior
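The API behavior in the diagram follows from the group, version, and plural name of the CRD. The sketch below builds the REST paths the API server would expose for such a resource. The plural name malwarescans and the helper resource_path are assumptions made for illustration; the diagram only names the kind.

```python
from typing import Optional


def resource_path(group: str, version: str, plural: str,
                  namespace: Optional[str] = None,
                  name: Optional[str] = None,
                  subresource: Optional[str] = None) -> str:
    """Build the API server path for a custom resource (hypothetical helper)."""
    parts = ["/apis", group, version]
    if namespace is not None:
        # Namespaced resources live under /namespaces/<namespace>/
        parts += ["namespaces", namespace]
    parts.append(plural)
    if name is not None:
        parts.append(name)
        if subresource is not None:
            # For example "status" or "scale"
            parts.append(subresource)
    return "/".join(parts)


# Collection path: POST creates, GET lists, GET with ?watch=true watches.
print(resource_path("dodatech.io", "v1", "malwarescans", "default"))
# Status subresource of a single object (the name "nightly" is made up).
print(resource_path("dodatech.io", "v1", "malwarescans", "default",
                    "nightly", "status"))
```

Leaving out the namespace gives the path that lists the resource across all namespaces.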
Basic CRD
# crd-backup.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.storage.dodatech.io
spec:
  group: storage.dodatech.io
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                sourcePVC:
                  type: string
                  pattern: '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
                  minLength: 1
                  maxLength: 253
                schedule:
                  type: string
                  pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
                retention:
                  type: integer
                  minimum: 1
                  maximum: 365
                  default: 30
                namespace:
                  type: string
            status:
              type: object
              properties:
                lastBackup:
                  type: string
                  format: date-time
                nextBackup:
                  type: string
                  format: date-time
                backupCount:
                  type: integer
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                        enum: ["True", "False", "Unknown"]
                      # Declared so the API server does not prune it
                      lastTransitionTime:
                        type: string
                        format: date-time
      # Serves /status as a separate endpoint; status patches require it
      subresources:
        status: {}
      additionalPrinterColumns:
        - name: Schedule
          type: string
          jsonPath: .spec.schedule
        - name: Retention
          type: integer
          jsonPath: .spec.retention
        - name: Last Backup
          type: date
          jsonPath: .status.lastBackup
        # Custom columns replace the default Age column, so add it back
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
    shortNames:
      - bk
kubectl apply -f crd-backup.yaml
kubectl get crd
kubectl get backups # Now available as a resource
Expected output:
NAME CREATED AT
backups.storage.dodatech.io 2026-06-24T10:00:00Z
No resources found in default namespace.
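To see what the schema's constraints accept before applying a manifest, the checks can be approximated locally. This is a minimal sketch, assuming Python's re module behaves like the API server's regex engine for these two patterns; check_backup_spec is a hypothetical helper, not part of any Kubernetes client.

```python
import re

# Patterns copied from the CRD schema above.
SOURCE_PVC = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")
SCHEDULE = re.compile(r"^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$")


def check_backup_spec(spec: dict) -> list:
    """Return the names of spec fields that would violate the schema."""
    problems = []
    pvc = spec.get("sourcePVC")
    if pvc is not None:
        if not (1 <= len(pvc) <= 253 and SOURCE_PVC.fullmatch(pvc)):
            problems.append("sourcePVC")
    schedule = spec.get("schedule")
    if schedule is not None and not SCHEDULE.fullmatch(schedule):
        problems.append("schedule")
    # The schema defaults retention to 30 when the field is omitted.
    retention = spec.get("retention", 30)
    if not 1 <= retention <= 365:
        problems.append("retention")
    return problems


print(check_backup_spec({"sourcePVC": "postgres-data",
                         "schedule": "0 2 * * *", "retention": 30}))
# The schedule pattern is stricter than cron: ranges such as 1-5 fail.
print(check_backup_spec({"schedule": "0 2 * * 1-5"}))
```

Note that the schedule pattern accepts step values such as */5 but rejects cron ranges and lists, which may or may not be the intent.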
Creating a Custom Resource
# backup-instance.yaml
apiVersion: storage.dodatech.io/v1
kind: Backup
metadata:
name: postgres-prod-backup
spec:
sourcePVC: postgres-data
schedule: "0 2 * * *"
retention: 30
namespace: production
kubectl apply -f backup-instance.yaml
kubectl get backups
kubectl describe backup postgres-prod-backup
Expected output:
NAME SCHEDULE RETENTION LAST BACKUP AGE
postgres-prod-backup 0 2 * * * 30 <none> 10s
Advanced Validation
CRDs support OpenAPI v3 validation for individual fields, and x-kubernetes-validations adds rules written as CEL expressions for cross-field checks:
# crd-validated.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.dodatech.io
spec:
  group: dodatech.io
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum: [postgres, mysql, mongodb]
                version:
                  type: string
                storage:
                  type: string
                  pattern: '^[0-9]+(Gi|Ti)$'
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
                backup:
                  type: object
                  properties:
                    enabled:
                      type: boolean
                    schedule:
                      type: string
                    retention:
                      type: integer
                      minimum: 1
                      maximum: 365
                  required: [enabled]
              required: [engine, version, storage]
              x-kubernetes-validations:
                # Passes when storage is below 100Gi, or when replicas is set and >= 3.
                # Ti values never satisfy the first clause, so they always need 3 replicas.
                - rule: "(self.storage.endsWith('Gi') && int(self.storage.replace('Gi', '')) < 100) || (has(self.replicas) && self.replicas >= 3)"
                  message: "Databases with 100+ Gi storage must have at least 3 replicas"
                - rule: "self.engine == 'postgres' ? self.version.startsWith('16') || self.version.startsWith('15') : true"
                  message: "PostgreSQL must be version 15.x or 16.x"
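The intent stated in the two messages can be mirrored in plain Python, which makes it easier to test edge cases before encoding them as CEL. This is a sketch under assumptions: check_database_spec and storage_gi are hypothetical helpers, and a missing replicas field is treated as 1.

```python
import re


def storage_gi(storage: str) -> int:
    """Convert a value such as '200Gi' or '1Ti' to a number of Gi."""
    match = re.fullmatch(r"([0-9]+)(Gi|Ti)", storage)
    if match is None:
        raise ValueError(f"unsupported storage value: {storage!r}")
    value = int(match.group(1))
    return value * 1024 if match.group(2) == "Ti" else value


def check_database_spec(spec: dict) -> list:
    """Return the messages of the cross-field rules that spec violates."""
    errors = []
    # Assumption: an omitted replicas field counts as a single replica.
    replicas = spec.get("replicas", 1)
    if storage_gi(spec["storage"]) >= 100 and replicas < 3:
        errors.append(
            "Databases with 100+ Gi storage must have at least 3 replicas")
    if spec["engine"] == "postgres" and not spec["version"].startswith(
            ("15.", "16.")):
        errors.append("PostgreSQL must be version 15.x or 16.x")
    return errors


print(check_database_spec({"engine": "postgres", "version": "16.2",
                           "storage": "200Gi", "replicas": 3}))
print(check_database_spec({"engine": "mysql", "version": "8.4",
                           "storage": "1Ti", "replicas": 2}))
```

Comparing the parsed number, rather than testing a string prefix, is what lets values such as 200Gi or 1Ti trigger the replica requirement.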
CRD with Status Subresource
from typing import Optional

from kubernetes import client, config


class BackupStatusUpdater:
    def __init__(self):
        # Works only inside a pod; use config.load_kube_config() when running locally
        config.load_incluster_config()
        self.custom_api = client.CustomObjectsApi()

    def update_status(self, namespace: str, name: str,
                      last_backup: str, backup_count: int, healthy: bool):
        # Every field here must be declared in the CRD schema;
        # undeclared fields are pruned by the API server.
        status_patch = {
            "status": {
                "lastBackup": last_backup,
                "backupCount": backup_count,
                "conditions": [{
                    "type": "Healthy",
                    "status": "True" if healthy else "False",
                    "lastTransitionTime": last_backup
                }]
            }
        }
        # Targets the /status endpoint, which exists only when the CRD
        # enables subresources.status
        self.custom_api.patch_namespaced_custom_object_status(
            "storage.dodatech.io", "v1", namespace,
            "backups", name, status_patch
        )

    def get_backups_due(self, namespace: Optional[str] = None) -> list:
        # Returns every Backup in the namespace; deciding which ones are
        # actually due is left to the caller.
        backups = self.custom_api.list_namespaced_custom_object(
            "storage.dodatech.io", "v1", namespace or "default",
            "backups"
        )
        return backups.get("items", [])


updater = BackupStatusUpdater()
backups = updater.get_backups_due("default")
print(f"Found {len(backups)} backups")
for bk in backups:
    spec = bk.get("spec", {})
    status = bk.get("status", {})
    print(f"  {bk['metadata']['name']}: "
          f"schedule={spec.get('schedule')}, "
          f"last={status.get('lastBackup', 'never')}")
Expected output:
Found 1 backups
postgres-prod-backup: schedule=0 2 * * *, last=never
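By convention, a condition's lastTransitionTime changes only when its status flips, not on every write. The helper below is a sketch of that bookkeeping on plain dictionaries; set_condition is a hypothetical function, and the resulting list would be sent in a status patch like the one above.

```python
from datetime import datetime, timezone
from typing import Optional


def set_condition(conditions: list, cond_type: str, status: str,
                  now: Optional[str] = None) -> list:
    """Return a new conditions list with cond_type set to status."""
    if now is None:
        now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    updated = []
    found = False
    for cond in conditions:
        if cond.get("type") != cond_type:
            updated.append(cond)  # Other condition types are left untouched
            continue
        found = True
        if cond.get("status") == status:
            updated.append(cond)  # No flip: keep the old transition time
        else:
            updated.append({**cond, "status": status,
                            "lastTransitionTime": now})
    if not found:
        updated.append({"type": cond_type, "status": status,
                        "lastTransitionTime": now})
    return updated


conds = set_condition([], "Healthy", "True", now="2026-06-24T02:00:00Z")
# Same status a day later: the transition time does not move.
conds = set_condition(conds, "Healthy", "True", now="2026-06-25T02:00:00Z")
print(conds)
```

Returning a new list instead of mutating the input keeps the helper easy to test without a cluster.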
Versioning and Conversion
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.dodatech.io
spec:
  group: dodatech.io
  # names and scope omitted for brevity
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine: { type: string }
                storageGB: { type: integer }
    - name: v1beta1
      served: true
      storage: false
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine: { type: string }
                storage: { type: string }
  # The schemas differ, so strategy None (which only rewrites apiVersion)
  # is not enough; a conversion webhook is required.
  conversion:
    strategy: Webhook
    webhook:
      # Versions of the ConversionReview API the webhook understands
      conversionReviewVersions: ["v1", "v1beta1"]
      clientConfig:
        service:
          name: crd-conversion
          namespace: kube-system
          path: /convert
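The webhook behind /convert receives a ConversionReview and returns the same objects in the requested version. The sketch below shows only the conversion logic, with no HTTP server or TLS. It assumes v1beta1 storage values always end in Gi and maps them directly onto storageGB, which glosses over the GiB versus GB difference; both function names are hypothetical.

```python
import copy


def convert_object(obj: dict, desired_api_version: str) -> dict:
    """Convert one Database object between v1beta1 and v1."""
    out = copy.deepcopy(obj)
    spec = out.get("spec", {})
    version = desired_api_version.split("/")[-1]
    if version == "v1" and "storage" in spec:
        raw = spec.pop("storage")
        if not raw.endswith("Gi"):
            # Assumption: only Gi values need converting in this sketch
            raise ValueError(f"cannot convert storage value {raw!r}")
        spec["storageGB"] = int(raw[:-2])
    elif version == "v1beta1" and "storageGB" in spec:
        spec["storage"] = f"{spec.pop('storageGB')}Gi"
    out["apiVersion"] = desired_api_version
    return out


def handle_conversion_review(review: dict) -> dict:
    """Build the ConversionReview response for a request."""
    request = review["request"]
    converted = [convert_object(obj, request["desiredAPIVersion"])
                 for obj in request["objects"]]
    return {
        "apiVersion": review["apiVersion"],
        "kind": "ConversionReview",
        "response": {
            "uid": request["uid"],  # Must echo the request uid
            "result": {"status": "Success"},
            "convertedObjects": converted,
        },
    }


review = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "ConversionReview",
    "request": {
        "uid": "example-uid",
        "desiredAPIVersion": "dodatech.io/v1",
        "objects": [{
            "apiVersion": "dodatech.io/v1beta1",
            "kind": "Database",
            "metadata": {"name": "my-postgres"},
            "spec": {"engine": "postgres", "storage": "100Gi"},
        }],
    },
}
print(handle_conversion_review(review)["response"]["convertedObjects"])
```

A production webhook would also return a Failed result with a message instead of raising, and must convert in both directions without losing data.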
CRD Operations in Python
from typing import Optional

from kubernetes import client, config


class CRDClient:
    def __init__(self):
        # Works only inside a pod; use config.load_kube_config() when running locally
        config.load_incluster_config()
        self.api = client.CustomObjectsApi()

    def list_resources(self, group: str, version: str,
                       plural: str, namespace: Optional[str] = None):
        if namespace:
            return self.api.list_namespaced_custom_object(
                group, version, namespace, plural
            )
        # Without a namespace, list across the whole cluster
        return self.api.list_cluster_custom_object(
            group, version, plural
        )

    def create_resource(self, group: str, version: str,
                        namespace: str, plural: str,
                        body: dict):
        return self.api.create_namespaced_custom_object(
            group, version, namespace, plural, body
        )

    def patch_resource(self, group: str, version: str,
                       namespace: str, plural: str,
                       name: str, patch: dict):
        return self.api.patch_namespaced_custom_object(
            group, version, namespace, plural, name, patch
        )

    def delete_resource(self, group: str, version: str,
                        namespace: str, plural: str,
                        name: str):
        return self.api.delete_namespaced_custom_object(
            group, version, namespace, plural, name
        )


# Named crd_client so it does not shadow the imported kubernetes client module
crd_client = CRDClient()
backup_body = {
    "apiVersion": "storage.dodatech.io/v1",
    "kind": "Backup",
    "metadata": {"name": "test-backup"},
    "spec": {
        "sourcePVC": "test-data",
        "schedule": "0 3 * * *",
        "retention": 14
    }
}
result = crd_client.create_resource(
    "storage.dodatech.io", "v1", "default",
    "backups", backup_body
)
print(f"Created: {result['metadata']['name']}")
Expected output:
Created: test-backup
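Printer Columns
Additional printer columns control what kubectl get shows for a custom resource. Defining them replaces the default Age column, so add it back explicitly: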
additionalPrinterColumns:
  - name: Engine
    type: string
    jsonPath: .spec.engine
  - name: Version
    type: string
    jsonPath: .spec.version
  - name: Replicas
    type: integer
    jsonPath: .spec.replicas
  - name: Status
    type: string
    jsonPath: .status.conditions[0].status
  - name: Age
    type: date
    jsonPath: .metadata.creationTimestamp
kubectl get databases
Expected output:
NAME ENGINE VERSION REPLICAS STATUS AGE
my-postgres postgres 16.2 3 True 2d
my-mysql mysql 8.4 2 False 1d
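Each column value comes from evaluating the column's jsonPath against the stored object. The sketch below is a rough model that handles only dotted field names and numeric indexes, which is enough for the paths above; resolve_column is a hypothetical helper and not how kubectl or the API server implements JSONPath.

```python
import re

# Matches ".field" or "[index]" segments of a simple path.
SEGMENT = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")


def resolve_column(obj: dict, json_path: str):
    """Resolve a simple printer-column path, or return '<none>'."""
    current = obj
    for key, index in SEGMENT.findall(json_path):
        try:
            current = current[key] if key else current[int(index)]
        except (KeyError, IndexError, TypeError):
            # A missing field or empty list renders as <none>
            return "<none>"
    return current


database = {
    "spec": {"engine": "postgres", "version": "16.2", "replicas": 3},
    "status": {"conditions": [{"type": "Healthy", "status": "True"}]},
}
print(resolve_column(database, ".spec.engine"))
print(resolve_column(database, ".status.conditions[0].status"))
# An object whose status has not been written yet
print(resolve_column({"spec": {}}, ".status.conditions[0].status"))
```

The last call illustrates why a freshly created resource shows an empty value in status-based columns until the operator writes its first status.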
Common Mistakes
1. Not Using Status Subresource
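Without subresources.status, status is just another field of the object: any client that can update the resource can overwrite it, and a spec update can clobber status written by the operator. Enable the status subresource so that status is written only through the /status endpoint, then use RBAC to grant that endpoint to the operator alone.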
2. Missing Validation Rules
Without schema validation, users can create invalid CRs that the operator must handle defensively. Every field should have type, pattern, or enum constraints.
3. Schema Breaking Changes
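Removing a field from the CRD schema does not make existing CRs fail validation; the API server silently prunes the field, so its data is lost. Tightening validation, on the other hand, can cause later updates to existing CRs to be rejected. For incompatible changes, add a new API version instead and use a conversion webhook to translate between versions.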
4. Not Setting Short Names
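Without shortNames, users must type the plural or singular name in full, for example kubectl get backups. Short names (bk) improve UX: kubectl get bk. The fully qualified form, kubectl get backups.storage.dodatech.io, is only needed when names collide with another resource.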
5. Forgetting Scope
If scope is Cluster, the CR is cluster-scoped and doesn't belong to a namespace. If scope is Namespaced, every CR lives in a namespace, and the client's current namespace is used when none is given. Choose based on resource semantics.
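6. Forgetting That Unknown Fields Are Pruned
By default, Kubernetes prunes fields that the schema does not declare, so a misspelled or undeclared field disappears silently. If you need to preserve unknown fields (e.g., for a generic config CRD), set x-kubernetes-preserve-unknown-fields: true on the relevant part of the schema.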
7. Overly Complex Nested Schemas
Deeply nested OpenAPI schemas (5+ levels) become unreadable and hard to validate. Flatten where possible and use x-kubernetes-validations for cross-field rules.
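The pruning behavior behind mistake 6 can be modeled in a few lines. This is a simplified sketch, assuming a schema made only of properties and items; the real API server also handles additionalProperties and other keywords, and prune is a hypothetical helper.

```python
def prune(value, schema: dict):
    """Drop fields that the schema does not declare (simplified model)."""
    if schema.get("x-kubernetes-preserve-unknown-fields"):
        # Simplification: keep the whole subtree as it is
        return value
    if isinstance(value, dict):
        props = schema.get("properties", {})
        return {key: prune(val, props[key])
                for key, val in value.items() if key in props}
    if isinstance(value, list):
        items = schema.get("items", {})
        return [prune(item, items) for item in value]
    return value  # Scalars have nothing to prune


spec_schema = {
    "type": "object",
    "properties": {
        "engine": {"type": "string"},
        "extra": {
            "type": "object",
            "x-kubernetes-preserve-unknown-fields": True,
        },
    },
}
submitted = {"engine": "postgres", "replcas": 3, "extra": {"anything": 1}}
# The misspelled "replcas" field vanishes without an error.
print(prune(submitted, spec_schema))
```

The misspelled field is dropped without any error, which is why a typo in a manifest can go unnoticed until the operator behaves unexpectedly.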
Practice Questions
1. What is the difference between a CRD and an aggregated API server?
CRDs extend the Kubernetes API without writing a separate API server — only OpenAPI schema is needed. Aggregated API servers (APIServices) run a separate HTTPS server, offer full control over storage and business logic, but require significantly more code to implement.
2. How does CRD validation work?
Each version's openAPIV3Schema defines the expected structure. The API server validates all create and update requests against this schema. Invalid requests are rejected before persistence.
3. What happens when a CRD is deleted?
Deleting a CRD also deletes every custom resource of that type: the API server removes the stored objects from etcd along with the API endpoint. The data is permanently lost unless it was backed up beforehand.
4. Why use subresources for status and scale?
Subresources separate user-facing data (spec) from system-managed data (status). This lets operators write status without users accidentally overwriting it. The scale subresource integrates with HPA.
5. Challenge: Design CRDs for a multi-tenant SaaS platform.
Each tenant has databases, caches, and worker queues. Design CRDs that model a Tenant, Database, Cache, and WorkerPool with validation rules ensuring:
- Tenant names are DNS-1123 labels
- Database engine must be postgres or mysql
- Cache memory must be a multiple of 256Mi
- Worker pool min < max replicas
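Before encoding the four rules in a schema, it can help to pin them down as small predicates and test them. This is one possible reading of the rules, not a reference solution; all function names are hypothetical, and cache memory is assumed to be expressed in Mi only.

```python
import re

# DNS-1123 label: lowercase alphanumerics and '-', alphanumeric at both
# ends, at most 63 characters.
DNS_1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def valid_tenant_name(name: str) -> bool:
    return len(name) <= 63 and DNS_1123_LABEL.fullmatch(name) is not None


def valid_engine(engine: str) -> bool:
    return engine in ("postgres", "mysql")


def valid_cache_memory(memory: str) -> bool:
    # Assumption: values look like "512Mi"; Gi values are not handled
    match = re.fullmatch(r"([0-9]+)Mi", memory)
    if match is None:
        return False
    amount = int(match.group(1))
    return amount > 0 and amount % 256 == 0


def valid_worker_pool(min_replicas: int, max_replicas: int) -> bool:
    return min_replicas < max_replicas


print(valid_tenant_name("acme-prod"), valid_tenant_name("Acme_Prod"))
print(valid_cache_memory("512Mi"), valid_cache_memory("300Mi"))
print(valid_worker_pool(1, 5), valid_worker_pool(5, 5))
```

The first two rules map naturally onto pattern and enum constraints, while the last two compare or compute values and are candidates for x-kubernetes-validations.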
Mini Project: CRD Validator Script
import re


class CRDValidator:
    def __init__(self, crd_schema: dict):
        self.schema = crd_schema

    def validate_resource(self, resource: dict) -> list:
        errors = []
        spec = resource.get("spec", {})
        spec_schema = self._get_spec_schema()
        # OpenAPI lists required fields on the parent object, not per property
        required = spec_schema.get("required", [])
        for field, rules in spec_schema.get("properties", {}).items():
            value = spec.get(field)
            if value is None:
                if field in required:
                    errors.append(f"{field}: required")
                # Optional fields that are absent need no further checks
                continue
            if rules.get("pattern"):
                if not re.search(rules["pattern"], str(value)):
                    errors.append(
                        f"{field}: must match {rules['pattern']} "
                        f"(got {value})"
                    )
            if rules.get("minimum") is not None:
                if value < rules["minimum"]:
                    errors.append(
                        f"{field}: minimum {rules['minimum']} "
                        f"(got {value})"
                    )
            if rules.get("maximum") is not None:
                if value > rules["maximum"]:
                    errors.append(
                        f"{field}: maximum {rules['maximum']} "
                        f"(got {value})"
                    )
            if rules.get("enum") and value not in rules["enum"]:
                errors.append(
                    f"{field}: must be one of {rules['enum']}"
                )
        return errors

    def _get_spec_schema(self) -> dict:
        crd_spec = self.schema.get("spec", {})
        # Simplified layout used below: spec.openAPIV3Schema
        root = crd_spec.get("openAPIV3Schema")
        if root is None:
            # Layout of a real CRD manifest:
            # spec.versions[0].schema.openAPIV3Schema
            try:
                root = crd_spec["versions"][0]["schema"]["openAPIV3Schema"]
            except (KeyError, IndexError):
                return {}
        return root.get("properties", {}).get("spec", {})


crd_schema = {
    "spec": {
        "openAPIV3Schema": {
            "properties": {
                "spec": {
                    "properties": {
                        "engine": {
                            "type": "string",
                            "enum": ["postgres", "mysql"]
                        },
                        "replicas": {
                            "type": "integer",
                            "minimum": 1,
                            "maximum": 10
                        },
                        "schedule": {
                            "type": "string",
                            "pattern": r"^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$"
                        }
                    }
                }
            }
        }
    }
}

validator = CRDValidator(crd_schema)
tests = [
    {"spec": {"engine": "postgres", "replicas": 3, "schedule": "0 2 * * *"}},
    {"spec": {"engine": "sqlite", "replicas": 3}},
    {"spec": {"engine": "postgres", "replicas": 15}},
]
for t in tests:
    errs = validator.validate_resource(t)
    if errs:
        print(f"FAIL: {t['spec']} -> {errs}")
    else:
        print(f"PASS: {t['spec']}")
Expected output:
PASS: {'engine': 'postgres', 'replicas': 3, 'schedule': '0 2 * * *'}
FAIL: {'engine': 'sqlite', 'replicas': 3} -> ["engine: must be one of ['postgres', 'mysql']"]
FAIL: {'engine': 'postgres', 'replicas': 15} -> ['replicas: maximum 10 (got 15)']
What's Next
Congratulations on completing this CRDs guide! Here's where to go from here:
- Practice daily — Define a CRD for a configuration that you manage manually
- Build a project — Write an operator that reconciles your CRD
- Explore related topics — Admission Webhooks, conversion Webhooks, CRD composition, API aggregation
- Join the community — Share your CRD designs and get feedback
Remember: every expert was once a beginner. Keep extending!
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro