Kubernetes CRDs Guide — Custom Resource Definitions

DodaTech Updated 2026-06-24 9 min read

This tutorial covers Kubernetes Custom Resource Definitions (CRDs): the key concepts, practical examples, and best practices for defining and managing your own resource types.

Custom Resource Definitions (CRDs) extend the Kubernetes API by letting you define your own resource types, which Kubernetes treats as first-class citizens with full API server support.

What You'll Learn

You'll master CRDs — defining custom resources with OpenAPI v3 schemas, validation, subresources (status/scale), versioning with conversion, and printing columns for kubectl output.

Why This Problem Matters

Kubernetes ships with built-in resources (Pods, Services, Deployments), but every application has unique configuration needs. CRDs let you model your application's domain concepts as Kubernetes-native resources with validation, defaulting, and API conventions.

Real-World Use

Doda Browser uses CRDs to define browser extension configurations, malware signature update policies, and antivirus scan schedules — all managed through kubectl with full validation and status reporting.

CRD Structure

flowchart TB
  subgraph CRDDefinition
    Group[apiGroup: dodatech.io]
    Version[v1, v2beta1]
    Kind[MalwareScan]
    Scope[Namespaced]
  end
  subgraph OpenAPIv3Schema
    Spec[spec]
    Status[status]
    Props[properties]
    Valid[validation rules]
  end
  subgraph APIBehavior
    Create[POST /apis/dodatech.io/v1/...]
    List[GET ...]
    Watch[Watch]
    StatusSub[Status Subresource]
  end
  CRDDefinition --> OpenAPIv3Schema
  CRDDefinition --> APIBehavior

Basic CRD

# crd-backup.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.storage.dodatech.io
spec:
  group: storage.dodatech.io
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                sourcePVC:
                  type: string
                  pattern: '^[a-z0-9]([a-z0-9-]*[a-z0-9])?$'
                  minLength: 1
                  maxLength: 253
                schedule:
                  type: string
                  pattern: '^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$'
                retention:
                  type: integer
                  minimum: 1
                  maximum: 365
                  default: 30
                namespace:
                  type: string
            status:
              type: object
              properties:
                lastBackup:
                  type: string
                  format: date-time
                nextBackup:
                  type: string
                  format: date-time
                backupCount:
                  type: integer
                conditions:
                  type: array
                  items:
                    type: object
                    properties:
                      type:
                        type: string
                      status:
                        type: string
                        enum: ["True", "False", "Unknown"]
      additionalPrinterColumns:
        - name: Schedule
          type: string
          jsonPath: .spec.schedule
        - name: Retention
          type: integer
          jsonPath: .spec.retention
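        - name: Last Backup
          type: date
          jsonPath: .status.lastBackup
        # Declare Age explicitly so it appears alongside the custom columns
        - name: Age
          type: date
          jsonPath: .metadata.creationTimestamp
      # Enables the /status endpoint that the status updater in this guide patches
      subresources:
        status: {}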
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
    shortNames:
      - bk
kubectl apply -f crd-backup.yaml
kubectl get crd
kubectl get backups  # Now available as a resource

Expected output:

NAME                              CREATED AT
backups.storage.dodatech.io       2026-06-24T10:00:00Z
No resources found in default namespace.
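
To get a feel for what the two pattern constraints in the schema above accept, the following sketch replays them with Python's re module. It is only an approximation: the API server evaluates patterns with its own regex engine, and the sample values are made up for illustration.

```python
import re

# Patterns copied from the Backup CRD schema above.
SOURCE_PVC = re.compile(r"^[a-z0-9]([a-z0-9-]*[a-z0-9])?$")
SCHEDULE = re.compile(r"^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$")


def check(pattern: re.Pattern, value: str) -> bool:
    """Return True when the value matches the schema pattern."""
    return pattern.search(value) is not None


# Hypothetical sample values, not taken from a real cluster.
samples = {
    "postgres-data": check(SOURCE_PVC, "postgres-data"),
    "Postgres_Data": check(SOURCE_PVC, "Postgres_Data"),
    "0 2 * * *": check(SCHEDULE, "0 2 * * *"),
    "*/15 * * * *": check(SCHEDULE, "*/15 * * * *"),
    "0 2 * * 1-5": check(SCHEDULE, "0 2 * * 1-5"),
}
for value, ok in samples.items():
    print(f"{value!r}: {'accepted' if ok else 'rejected'}")
```

Note that the schedule pattern rejects ranges and lists such as 1-5, so an expression that is valid cron may still fail this schema.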

Creating a Custom Resource

# backup-instance.yaml
apiVersion: storage.dodatech.io/v1
kind: Backup
metadata:
  name: postgres-prod-backup
spec:
  sourcePVC: postgres-data
  schedule: "0 2 * * *"
  retention: 30
  namespace: production
kubectl apply -f backup-instance.yaml
kubectl get backups
kubectl describe backup postgres-prod-backup

Expected output:

NAME                  SCHEDULE     RETENTION   LAST BACKUP   AGE
postgres-prod-backup   0 2 * * *   30          <none>        10s
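
Behind kubectl, each custom resource is served at a REST path derived from the CRD's group, version, scope, and plural name. The helper below is a sketch of that layout; the function name resource_path is made up for illustration and is not part of any client library.

```python
from typing import Optional


def resource_path(group: str, version: str, plural: str,
                  namespace: Optional[str] = None,
                  name: Optional[str] = None) -> str:
    """Build the API server path for a custom resource (hypothetical helper)."""
    parts = ["/apis", group, version]
    if namespace:
        # Namespaced resources live under /namespaces/<namespace>/.
        parts += ["namespaces", namespace]
    parts.append(plural)
    if name:
        parts.append(name)
    return "/".join(parts)


backup_path = resource_path("storage.dodatech.io", "v1", "backups",
                            namespace="default", name="postgres-prod-backup")
print(backup_path)
```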

Advanced Validation

CRD schemas support OpenAPI v3 validation for individual fields, and x-kubernetes-validations adds CEL rules for cross-field checks:

# crd-validated.yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.dodatech.io
spec:
  group: dodatech.io
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                  enum: [postgres, mysql, mongodb]
                version:
                  type: string
                storage:
                  type: string
                  pattern: '^[0-9]+(Gi|Ti)$'
                replicas:
                  type: integer
                  minimum: 1
                  maximum: 10
                backup:
                  type: object
                  properties:
                    enabled:
                      type: boolean
                    schedule:
                      type: string
                    retention:
                      type: integer
                      minimum: 1
                      maximum: 365
                  required: [enabled]
              required: [engine, version, storage]
              x-kubernetes-validations:
                # Any Ti value counts as 100Gi or more; replicas is optional, so has() guards it
                - rule: "!(self.storage.endsWith('Ti') || int(self.storage.replace('Gi', '')) >= 100) || (has(self.replicas) && self.replicas >= 3)"
                  message: "Databases with 100Gi or more storage must have at least 3 replicas"
                - rule: "self.engine != 'postgres' || self.version.startsWith('15.') || self.version.startsWith('16.')"
                  message: "PostgreSQL must be version 15.x or 16.x"
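
The intent stated in the two rule messages above can be mirrored in plain Python to see which specs would pass. This is a sketch for reasoning about the rules, not how the API server evaluates CEL; the helper names and the choice to treat a missing replicas value as 1 are assumptions.

```python
import re


def storage_gi(storage: str) -> int:
    """Convert a value such as '100Gi' or '2Ti' to Gi (1 Ti = 1024 Gi)."""
    match = re.fullmatch(r"([0-9]+)(Gi|Ti)", storage)
    if match is None:
        raise ValueError(f"unsupported storage value: {storage}")
    amount, unit = int(match.group(1)), match.group(2)
    return amount * 1024 if unit == "Ti" else amount


def check_spec(spec: dict) -> list:
    """Return the messages of the cross-field rules that the spec violates."""
    errors = []
    # replicas is optional in the schema; defaulting to 1 is an assumption.
    replicas = spec.get("replicas", 1)
    if storage_gi(spec["storage"]) >= 100 and replicas < 3:
        errors.append("large databases need at least 3 replicas")
    is_postgres = spec["engine"] == "postgres"
    if is_postgres and not spec["version"].startswith(("15.", "16.")):
        errors.append("PostgreSQL must be version 15.x or 16.x")
    return errors


# Hypothetical specs for illustration.
print(check_spec({"engine": "postgres", "version": "16.2",
                  "storage": "200Gi", "replicas": 1}))
print(check_spec({"engine": "mysql", "version": "8.4", "storage": "50Gi"}))
```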

CRD with Status Subresource

from kubernetes import client, config

class BackupStatusUpdater:
    def __init__(self):
        config.load_incluster_config()
        self.custom_api = client.CustomObjectsApi()

    def update_status(self, namespace: str, name: str,
                      last_backup: str, backup_count: int, healthy: bool):
        status_patch = {
            "status": {
                "lastBackup": last_backup,
                "backupCount": backup_count,
                "conditions": [{
                    "type": "Healthy",
                    "status": "True" if healthy else "False",
                    "lastTransitionTime": last_backup
                }]
            }
        }
        self.custom_api.patch_namespaced_custom_object_status(
            "storage.dodatech.io", "v1", namespace,
            "backups", name, status_patch
        )

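    def get_backups_due(self, namespace: str = None) -> list:
        # Lists every Backup; filtering by due date is left to the caller.
        if namespace:
            backups = self.custom_api.list_namespaced_custom_object(
                "storage.dodatech.io", "v1", namespace, "backups"
            )
        else:
            # No namespace given: list across all namespaces.
            backups = self.custom_api.list_cluster_custom_object(
                "storage.dodatech.io", "v1", "backups"
            )
        return backups.get("items", [])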

updater = BackupStatusUpdater()
backups = updater.get_backups_due("default")
print(f"Found {len(backups)} backups")
for bk in backups:
    spec = bk.get("spec", {})
    status = bk.get("status", {})
    print(f"  {bk['metadata']['name']}: "
          f"schedule={spec.get('schedule')}, "
          f"last={status.get('lastBackup', 'never')}")

Expected output:

Found 1 backups
  postgres-prod-backup: schedule=0 2 * * *, last=never
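
Controllers commonly keep one entry per condition type and only move lastTransitionTime when the status value actually changes. Below is a sketch of that bookkeeping, independent of any client library. The helper name set_condition is hypothetical, and lastTransitionTime would also need to be declared in the CRD schema, since undeclared fields are pruned.

```python
from datetime import datetime, timezone


def set_condition(conditions: list, cond_type: str, status: str,
                  now: str) -> list:
    """Return a new conditions list with cond_type set to status."""
    updated = []
    found = False
    for cond in conditions:
        if cond.get("type") != cond_type:
            updated.append(dict(cond))
            continue
        found = True
        if cond.get("status") == status:
            # Status unchanged: keep the previous transition time.
            transition = cond.get("lastTransitionTime", now)
        else:
            transition = now
        updated.append({"type": cond_type, "status": status,
                        "lastTransitionTime": transition})
    if not found:
        updated.append({"type": cond_type, "status": status,
                        "lastTransitionTime": now})
    return updated


now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
conditions = set_condition([], "Healthy", "True", "2026-06-24T02:00:00Z")
# Same status again: the original timestamp is preserved.
conditions = set_condition(conditions, "Healthy", "True", now)
print(conditions)
```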

Versioning and Conversion

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.dodatech.io
spec:
  group: dodatech.io
  # scope and names are omitted here for brevity
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                engine: { type: string }
                storageGB: { type: integer }
    - name: v1beta1
      served: true
      storage: false
      schema:
        openAPIV3Schema:
          type: object
          properties:
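            spec:
              type: object
              properties:
                engine: { type: string }
                storage: { type: string }
  # The two versions represent storage differently (storageGB vs storage),
  # so the default None strategy is not enough; a webhook must convert them.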
  conversion:
    strategy: Webhook
    webhook:
      conversionReviewVersions: ["v1", "v1beta1"]
      clientConfig:
        service:
          name: crd-conversion
          namespace: kube-system
          path: /convert
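
The conversion webhook receives a ConversionReview request and must return the same objects in the desired version. The sketch below shows only the translation logic, without the HTTPS server. How storage maps to storageGB is not stated in the CRD above, so the rule used here (strip a Gi suffix, keep the number) is an assumption, as are the function names.

```python
import copy

GROUP = "dodatech.io"


def convert(obj: dict, desired: str) -> dict:
    """Convert one Database object between v1beta1 and v1 (sketch)."""
    out = copy.deepcopy(obj)
    if out.get("apiVersion") == desired:
        return out
    spec = out.setdefault("spec", {})
    if desired == f"{GROUP}/v1" and "storage" in spec:
        # Assumption: v1beta1 stores "100Gi"; v1 keeps only the number.
        # str.removesuffix needs Python 3.9 or newer.
        spec["storageGB"] = int(spec.pop("storage").removesuffix("Gi"))
    elif desired == f"{GROUP}/v1beta1" and "storageGB" in spec:
        spec["storage"] = f"{spec.pop('storageGB')}Gi"
    out["apiVersion"] = desired
    return out


def handle_review(review: dict) -> dict:
    """Build the ConversionReview response for a request."""
    request = review["request"]
    desired = request["desiredAPIVersion"]
    return {
        "apiVersion": review["apiVersion"],
        "kind": "ConversionReview",
        "response": {
            "uid": request["uid"],
            "result": {"status": "Success"},
            "convertedObjects": [convert(o, desired)
                                 for o in request["objects"]],
        },
    }


# Hypothetical request body for illustration.
review = {
    "apiVersion": "apiextensions.k8s.io/v1",
    "kind": "ConversionReview",
    "request": {
        "uid": "example-uid",
        "desiredAPIVersion": "dodatech.io/v1",
        "objects": [{
            "apiVersion": "dodatech.io/v1beta1",
            "kind": "Database",
            "metadata": {"name": "my-postgres"},
            "spec": {"engine": "postgres", "storage": "100Gi"},
        }],
    },
}
response = handle_review(review)
print(response["response"]["convertedObjects"][0]["spec"])
```

A real webhook would also need to handle values such as Ti and report a failed result instead of raising when an object cannot be converted.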

CRD Operations in Python

from kubernetes import client, config

class CRDClient:
    def __init__(self):
        config.load_incluster_config()
        self.api = client.CustomObjectsApi()

    def list_resources(self, group: str, version: str,
                       plural: str, namespace: str = None):
        if namespace:
            return self.api.list_namespaced_custom_object(
                group, version, namespace, plural
            )
        return self.api.list_cluster_custom_object(
            group, version, plural
        )

    def create_resource(self, group: str, version: str,
                        namespace: str, plural: str,
                        body: dict):
        return self.api.create_namespaced_custom_object(
            group, version, namespace, plural, body
        )

    def patch_resource(self, group: str, version: str,
                       namespace: str, plural: str,
                       name: str, patch: dict):
        return self.api.patch_namespaced_custom_object(
            group, version, namespace, plural, name, patch
        )

    def delete_resource(self, group: str, version: str,
                        namespace: str, plural: str,
                        name: str):
        return self.api.delete_namespaced_custom_object(
            group, version, namespace, plural, name
        )

# Use a distinct name so the imported kubernetes `client` module is not shadowed.
crd_client = CRDClient()
backup_body = {
    "apiVersion": "storage.dodatech.io/v1",
    "kind": "Backup",
    "metadata": {"name": "test-backup"},
    "spec": {
        "sourcePVC": "test-data",
        "schedule": "0 3 * * *",
        "retention": 14
    }
}
result = crd_client.create_resource(
    "storage.dodatech.io", "v1", "default",
    "backups", backup_body
)
print(f"Created: {result['metadata']['name']}")

Expected output:

Created: test-backup

Printing Columns

Make CRD output readable with kubectl:

additionalPrinterColumns:
  - name: Engine
    type: string
    jsonPath: .spec.engine
  - name: Version
    type: string
    jsonPath: .spec.version
  - name: Replicas
    type: integer
    jsonPath: .spec.replicas
  - name: Status
    type: string
    jsonPath: .status.conditions[0].status
  - name: Age
    type: date
    jsonPath: .metadata.creationTimestamp
kubectl get databases

Expected output:

NAME            ENGINE    VERSION   REPLICAS   STATUS   AGE
my-postgres     postgres  16.2      3          True     2d
my-mysql        mysql     8.4       2          False    1d
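
Each column is simply a path evaluated against the stored object. The sketch below resolves the simple subset of paths used above (dotted fields and numeric indexes) to show where each cell comes from. It is not a JSONPath implementation, the resolve helper is hypothetical, and the sample object is made up.

```python
import re


def resolve(obj, json_path: str):
    """Resolve a simple path like .status.conditions[0].status (sketch)."""
    current = obj
    # Each token is either a .field name or an [index].
    tokens = re.findall(r"\.([A-Za-z_][\w-]*)|\[(\d+)\]", json_path)
    for field, index in tokens:
        try:
            current = current[field] if field else current[int(index)]
        except (KeyError, IndexError, TypeError):
            # Missing values are shown as <none>, as kubectl does.
            return "<none>"
    return current


# Hypothetical Database object for illustration.
database = {
    "metadata": {"name": "my-postgres"},
    "spec": {"engine": "postgres", "version": "16.2", "replicas": 3},
    "status": {"conditions": [{"type": "Ready", "status": "True"}]},
}
columns = {
    "Engine": ".spec.engine",
    "Replicas": ".spec.replicas",
    "Status": ".status.conditions[0].status",
}
row = {name: resolve(database, path) for name, path in columns.items()}
print(row)
```

Indexing conditions by position depends on the order in which the controller writes them, so a column tied to a specific condition type is usually more robust.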

Common Mistakes

1. Not Using Status Subresource

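Without subresources.status, status is just another field on the main resource: any client that can update the resource can also change its status, and a controller's status write can collide with a user's spec edit. With the status subresource enabled, writes to the main endpoint ignore status changes, and writes to /status ignore everything except status. RBAC can then limit /status to the operator.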

2. Missing Validation Rules

Without schema validation, users can create invalid CRs that the operator must handle defensively. Every field should have type, pattern, or enum constraints.

3. Schema Breaking Changes

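Removing a field from the CRD schema does not fail loudly. Because unknown fields are pruned, the stored value is silently dropped, and clients that still send the field may get warnings or rejections. Tightening validation on an existing field can likewise make stored objects invalid on their next update. Add a new API version instead, and use a conversion webhook to translate between versions.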

4. Not Setting Short Names

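Without shortNames, users must type the plural or singular name (kubectl get backups), or the fully qualified backups.storage.dodatech.io when the name clashes with another resource. Short names (bk) improve UX: kubectl get bk.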

5. Forgetting Scope

If scope is Cluster, the CR is cluster-scoped and doesn't belong to a namespace. If scope is Namespaced, every CR lives in a namespace (kubectl falls back to the current context's namespace when none is given). Choose based on resource semantics.

6. No Pruning for Unknown Fields

By default, Kubernetes prunes fields that the schema doesn't declare. If you need to preserve unknown fields (e.g., for a generic config CRD), set x-kubernetes-preserve-unknown-fields: true on that part of the schema.

7. Overly Complex Nested Schemas

Deeply nested OpenAPI schemas (5+ levels) become unreadable and hard to validate. Flatten where possible and use x-kubernetes-validations for cross-field rules.
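
To make the pruning behavior from mistake 6 concrete, here is a simplified sketch of how undeclared fields disappear while a subtree marked x-kubernetes-preserve-unknown-fields is kept. The real API server logic covers more cases (for example additionalProperties and metadata), so treat this as an illustration only; the prune function and sample data are hypothetical.

```python
def prune(value, schema: dict):
    """Drop fields that the schema does not declare (simplified sketch)."""
    if schema.get("x-kubernetes-preserve-unknown-fields"):
        # Simplification: keep the whole subtree untouched.
        return value
    if isinstance(value, dict):
        props = schema.get("properties", {})
        return {key: prune(item, props[key])
                for key, item in value.items() if key in props}
    if isinstance(value, list):
        return [prune(item, schema.get("items", {})) for item in value]
    return value


schema = {
    "type": "object",
    "properties": {
        "spec": {
            "type": "object",
            "properties": {
                "engine": {"type": "string"},
                "extra": {
                    "type": "object",
                    "x-kubernetes-preserve-unknown-fields": True,
                },
            },
        },
    },
}
resource = {
    "spec": {"engine": "postgres", "typo": 1, "extra": {"anything": "kept"}},
}
pruned = prune(resource, schema)
print(pruned)
```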

Practice Questions

1. What is the difference between a CRD and an aggregated API server?

CRDs extend the Kubernetes API without a separate API server; you only supply an OpenAPI schema. Aggregated API servers (APIServices) run a separate HTTPS server and offer full control over storage and business logic, but require significantly more code to implement.

2. How does CRD validation work?

Each version's openAPIV3Schema defines the expected structure. The API server validates all create and update requests against this schema. Invalid requests are rejected before persistence.

3. What happens when a CRD is deleted?

Deleting a CRD also deletes every custom resource of that type: the API server removes all instances from etcd before the definition itself disappears, and the endpoints stop being served. The data is permanently lost unless it was backed up, and re-creating the CRD does not bring it back.

4. Why use subresources for status and scale?

Subresources separate user-facing data (spec) from system-managed data (status). This lets operators write status without users accidentally overwriting it. The scale subresource integrates with HPA.

5. Challenge: Design CRDs for a multi-tenant SaaS platform.

Each tenant has databases, caches, and worker queues. Design CRDs that model a Tenant, Database, Cache, and WorkerPool with validation rules ensuring:

  • Tenant names are DNS-1123 labels
  • Database engine must be postgres or mysql
  • Cache memory must be a multiple of 256Mi
  • Worker pool min < max replicas
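
As one possible starting point for the challenge, the four constraints can first be prototyped as plain Python predicates before being translated into schema patterns and x-kubernetes-validations rules. The function names and the assumption that cache memory is written in Mi or Gi are illustrative choices.

```python
import re

# DNS-1123 label: lowercase alphanumerics and '-', alphanumeric at both
# ends, at most 63 characters.
DNS_1123_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def valid_tenant_name(name: str) -> bool:
    return len(name) <= 63 and DNS_1123_LABEL.fullmatch(name) is not None


def valid_engine(engine: str) -> bool:
    return engine in ("postgres", "mysql")


def valid_cache_memory(memory: str) -> bool:
    # Assumption: memory is written in Mi or Gi, e.g. "512Mi" or "1Gi".
    match = re.fullmatch(r"([0-9]+)(Mi|Gi)", memory)
    if match is None:
        return False
    factor = 1024 if match.group(2) == "Gi" else 1
    mebibytes = int(match.group(1)) * factor
    return mebibytes > 0 and mebibytes % 256 == 0


def valid_worker_pool(min_replicas: int, max_replicas: int) -> bool:
    return min_replicas < max_replicas


print(valid_tenant_name("acme-corp"), valid_cache_memory("300Mi"))
```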

Mini Project: CRD Validator Script

import re

class CRDValidator:
    def __init__(self, crd_schema: dict):
        self.schema = crd_schema

    def validate_resource(self, resource: dict) -> list:
        errors = []
        spec = resource.get("spec", {})

        for field, rules in self._get_spec_rules().items():
            value = spec.get(field)

            if rules.get("required") and value is None:
                errors.append(f"{field}: required")
                continue

            if rules.get("pattern") and value:
                if not re.match(rules["pattern"], str(value)):
                    errors.append(
                        f"{field}: must match {rules['pattern']} "
                        f"(got {value})"
                    )

            if rules.get("minimum") is not None and value is not None:
                if value < rules["minimum"]:
                    errors.append(
                        f"{field}: minimum {rules['minimum']} "
                        f"(got {value})"
                    )

            if rules.get("maximum") is not None and value is not None:
                if value > rules["maximum"]:
                    errors.append(
                        f"{field}: maximum {rules['maximum']} "
                        f"(got {value})"
                    )

            # Skip the enum check for fields that are not set.
            if (rules.get("enum") and value is not None
                    and value not in rules["enum"]):
                errors.append(
                    f"{field}: must be one of {rules['enum']}"
                )

        return errors

    def _get_spec_rules(self) -> dict:
        # Accept either the simplified layout used below or a full CRD
        # manifest (spec.versions[0].schema.openAPIV3Schema).
        try:
            root = self.schema["spec"]["openAPIV3Schema"]
        except KeyError:
            try:
                root = self.schema["spec"]["versions"][0][
                    "schema"]["openAPIV3Schema"]
            except (KeyError, IndexError):
                return {}
        try:
            return root["properties"]["spec"]["properties"]
        except KeyError:
            return {}

crd_schema = {
    "spec": {
        "openAPIV3Schema": {
            "properties": {
                "spec": {
                    "properties": {
                        "engine": {
                            "type": "string",
                            "enum": ["postgres", "mysql"]
                        },
                        "replicas": {
                            "type": "integer",
                            "minimum": 1,
                            "maximum": 10
                        },
                        "schedule": {
                            "type": "string",
                            "pattern": r"^(\d+|\*)(/\d+)?(\s+(\d+|\*)(/\d+)?){4}$"
                        }
                    }
                }
            }
        }
    }
}

validator = CRDValidator(crd_schema)
tests = [
    {"spec": {"engine": "postgres", "replicas": 3, "schedule": "0 2 * * *"}},
    {"spec": {"engine": "sqlite", "replicas": 3}},
    {"spec": {"engine": "postgres", "replicas": 15}},
]

for t in tests:
    errs = validator.validate_resource(t)
    if errs:
        print(f"FAIL: {t['spec']} -> {errs}")
    else:
        print(f"PASS: {t['spec']}")

Expected output:

PASS: {'engine': 'postgres', 'replicas': 3, 'schedule': '0 2 * * *'}
FAIL: {'engine': 'sqlite', 'replicas': 3} -> ["engine: must be one of ['postgres', 'mysql']"]
FAIL: {'engine': 'postgres', 'replicas': 15} -> ['replicas: maximum 10 (got 15)']

FAQ

Can CRDs define behavior like default values?

CRDs support default in OpenAPI v3 schema for simple defaults. For complex defaulting or mutation, use a MutatingAdmissionWebhook. Default values are applied when the resource is created if the field is not set.
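
The effect of schema defaults can be sketched as a recursive fill-in of missing fields. This mirrors the idea rather than the API server's implementation; the apply_defaults helper is hypothetical, and the schema fragment reuses the retention default of 30 from the Backup CRD.

```python
import copy


def apply_defaults(value, schema: dict):
    """Fill in schema defaults for fields that are not set (sketch)."""
    if not isinstance(value, dict):
        return value
    result = dict(value)
    for name, prop in schema.get("properties", {}).items():
        if name not in result and "default" in prop:
            result[name] = copy.deepcopy(prop["default"])
        if name in result:
            # Recurse so nested objects receive their defaults too.
            result[name] = apply_defaults(result[name], prop)
    return result


schema = {"properties": {"spec": {"properties": {
    "schedule": {"type": "string"},
    "retention": {"type": "integer", "default": 30},
}}}}

defaulted = apply_defaults({"spec": {"schedule": "0 2 * * *"}}, schema)
explicit = apply_defaults({"spec": {"retention": 14}}, schema)
print(defaulted)
print(explicit)
```

In this sketch, a nested default is only applied when its parent object is present, so an object without spec is returned unchanged.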

How do I rename or remove a CRD field?

Avoid removing or renaming fields in a published CRD version: data stored under the old field is pruned, and existing clients break. Create a new version with the changed field, use a conversion webhook to translate between versions, and migrate resources gradually.

Can CRDs be used without an operator?

Yes. CRDs can store configuration data that other controllers or applications read. But without an operator, no automated reconciliation happens — the CR is just data. Use a CRD alone for static configuration that tools or scripts consume.

What's Next

Kubernetes Storage Classes Guide
Kubernetes Operators Guide
Helm Charts Tutorial

Congratulations on completing this CRDs guide! Here's where to go from here:

  • Practice daily — Define a CRD for a configuration that you manage manually
  • Build a project — Write an operator that reconciles your CRD
  • Explore related topics — Admission Webhooks, conversion Webhooks, CRD composition, API aggregation
  • Join the community — Share your CRD designs and get feedback

Remember: every expert was once a beginner. Keep extending!
