
LitmusChaos Guide — Cloud-Native Chaos Engineering for Kubernetes

DodaTech Updated 2026-06-23 5 min read

This tutorial introduces LitmusChaos. It covers the key concepts, practical examples, and best practices you need to apply it effectively.

Litmus is an open-source Chaos Engineering platform designed for cloud-native environments. It extends Kubernetes with workflow orchestration, a ChaosHub experiment marketplace, GitOps integration, and automated resilience scoring that helps teams track their progress over time.

What You Will Learn

This tutorial teaches you how to install LitmusChaos, browse and run experiments from ChaosHub, create multi-step chaos workflows, integrate with CI/CD pipelines, and interpret resilience scores.

Why It Matters

LitmusChaos turns Chaos Engineering into a continuous practice that integrates with your existing development workflows. Instead of running manual experiments, you define chaos workflows as code, execute them automatically after deployments, and track resilience improvements through quantitative scores.

Real-World Use

DodaTech integrated LitmusChaos into the deployment pipeline for Durga Antivirus Pro. Every canary deployment triggers a chaos workflow that must pass before the release reaches more than 10 percent of users. This catches resilience regressions before they affect customers.

Prerequisites

Before starting, you should understand:

  • Kubernetes operations and custom resource definitions
  • Chaos Engineering fundamentals (hypothesis, steady state, blast radius)
  • Basic CI/CD pipeline concepts
  • Helm package manager

Step 1: Install LitmusChaos

Install LitmusChaos using Helm with the Litmus Portal for experiment management.

# Add the LitmusChaos Helm repository
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm
helm repo update

# Install LitmusChaos infrastructure
helm install litmus litmuschaos/litmus \
  --namespace litmus \
  --create-namespace \
  --set portal.frontend.service.type=NodePort

# Verify all pods are running
kubectl get pods -n litmus

Expected output (pod names and hash suffixes will differ in your cluster):

NAME                                     READY   STATUS
litmus-frontend-7d9f8c6b4f-abc1         1/1     Running
litmus-server-5b7c8d9e4f-def2           1/1     Running
mongo-0                                  1/1     Running
chaos-exporter-c6b7d8e9f-ghi3           1/1     Running
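Checking that table by eye is fine once, but a script is handier when the install runs unattended. The following is a minimal sketch of one way to do it: a helper (the name all_pods_ready is made up for this example) that reads kubectl get pods output on standard input and succeeds only when every listed pod is Running with all containers ready. It assumes the default column layout shown above.

```shell
#!/usr/bin/env bash
# Reads `kubectl get pods` output on stdin and succeeds only if every pod
# is in the Running state with all of its containers ready.
# Assumes the default columns: NAME, READY, STATUS, ...
all_pods_ready() {
  awk '
    NR == 1 { next }                      # skip the header row
    NF < 3  { next }                      # ignore blank or malformed lines
    {
      seen++
      split($2, ready, "/")               # READY column looks like "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) {
        print "not ready: " $1 > "/dev/stderr"
        bad++
      }
    }
    END { if (seen == 0 || bad > 0) exit 1 }
  '
}

# Usage against a live cluster (requires kubectl access):
#   kubectl get pods -n litmus | all_pods_ready && echo "Litmus is up"
```

Note that this treats an empty pod list as a failure, and it would also flag pods from finished Jobs (status Completed), so filter those out first if your namespace contains any.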

Step 2: Access the Litmus Portal

Port-forward to access the Litmus web UI and create your first project.

kubectl port-forward svc/litmus-frontend-service 9091:9091 -n litmus

# Expected output:
# Forwarding from 127.0.0.1:9091 -> 9091

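Open http://localhost:9091 in your browser. Log in as the admin user (recent Litmus releases ship a default admin account and prompt you to change its password on first login; check the documentation for your version), then create a project with a name that matches your team or service.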

Step 3: Run an Experiment from ChaosHub

ChaosHub is Litmus's built-in marketplace of pre-built chaos experiments. Browse and execute experiments without writing YAML from scratch.

# List available experiments using litmusctl
# (subcommand names differ between litmusctl releases; run `litmusctl get --help` to confirm)
litmusctl get experiments

# Expected output:
# EXPERIMENT NAME                    CHAOSHUB
# pod-delete                         litmuschaos
# node-cpu-hog                       litmuschaos
# network-loss                       litmuschaos
# kubelet-service-kill               litmuschaos
# pod-autoscaler                     litmuschaos

# Run a pod-delete experiment on a target deployment
litmusctl create experiment pod-delete \
  --target-namespace default \
  --app-label app=nginx \
  --duration 30s

# Expected output:
# Experiment pod-delete scheduled successfully

Step 4: Create a Multi-Step Chaos Workflow

LitmusChaos supports workflows that chain multiple experiments sequentially or in parallel.

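# chaos-workflow.yaml
# Simplified schema for illustration. Litmus stores workflows as Argo Workflow
# manifests, so validate the field names against the CRDs installed in your cluster.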
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosWorkflow
metadata:
  name: post-deployment-workflow
  namespace: litmus
spec:
  workflow:
    steps:
      - name: pod-delete-test
        template: pod-delete
      - name: network-loss-test
        template: network-loss
        dependsOn:
          - pod-delete-test
      - name: cpu-hog-test
        template: cpu-hog
        dependsOn:
          - network-loss-test
  templates:
    - name: pod-delete
      experiment: pod-delete
      spec:
        duration: 30s
    - name: network-loss
      experiment: network-loss
      spec:
        duration: 45s
        lossPercent: "30"
    - name: cpu-hog
      experiment: cpu-hog
      spec:
        duration: 60s
        cpuPercent: "80"
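Because the three steps above are chained with dependsOn, they run one after another, so the workflow takes at least the sum of the individual durations (30 + 45 + 60 = 135 seconds here). That number is useful when choosing a timeout for the pipeline job that waits on the workflow. A small sketch that computes it from the file follows; the helper name total_chaos_seconds is made up for this example, and it only understands unquoted whole-second values such as 30s.

```shell
#!/usr/bin/env bash
# Sums every `duration: <N>s` value in a workflow file and prints the total
# in seconds. Only unquoted whole-second values with an "s" suffix are handled.
total_chaos_seconds() {
  awk '
    $1 == "duration:" {
      value = $2
      sub(/s$/, "", value)                # "30s" -> "30"
      total += value
    }
    END { print total + 0 }               # prints 0 when no durations are found
  ' "$1"
}

# Usage: print the lower bound on runtime for the workflow file.
#   total_chaos_seconds chaos-workflow.yaml
```

The result is a lower bound only. Pod scheduling, probe checks, and cleanup add time on top, so leave some headroom when turning it into a timeout.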

Step 5: Integrate with CI/CD Pipeline

Add chaos experiments to your CI/CD pipeline to automatically test resilience after every deployment.

# .github/workflows/chaos-pipeline.yml
name: Chaos Engineering Pipeline
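on:
  deployment_status:
jobs:
  chaos-test:
    # deployment_status has no activity types, so filter on the state here
    if: github.event.deployment_status.state == 'success'
    runs-on: ubuntu-latest
    steps:
      - name: Check out the repository (provides chaos-workflow.yaml)
        uses: actions/checkout@v4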
      - name: Install litmusctl
        run: |
          curl -LO https://litmusctl.litmuschaos.io/latest/linux/litmusctl
          chmod +x litmusctl
          sudo mv litmusctl /usr/local/bin/
      - name: Run chaos workflow
        run: |
          litmusctl create workflow \
            --file chaos-workflow.yaml \
            --project-id ${{ secrets.LITMUS_PROJECT_ID }}
      - name: Check resilience score
        run: |
          litmusctl get resilience-score \
            --workflow post-deployment-workflow

Expected resilience score output:

Resilience Score: 92/100
Result: PASS (threshold: 80/100)
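For the pipeline to actually block a release, the job has to exit with a non-zero status when the score is too low. If your tooling only prints the score, a small gate like the sketch below can do the comparison. It assumes the output format shown above (a line starting with "Resilience Score: N/100"); the function name check_resilience_score is made up for this example.

```shell
#!/usr/bin/env bash
# Reads text containing a line like "Resilience Score: 92/100" on stdin and
# succeeds only if the score meets the threshold (first argument, default 80).
# Exit codes: 0 = pass, 1 = below threshold, 2 = no score found in the input.
check_resilience_score() {
  local threshold="${1:-80}"
  local score
  score=$(sed -n 's/^Resilience Score: \([0-9][0-9]*\)\/100.*$/\1/p' | head -n 1)
  if [ -z "$score" ]; then
    echo "no resilience score found in input" >&2
    return 2
  fi
  if [ "$score" -ge "$threshold" ]; then
    echo "PASS: $score >= $threshold"
  else
    echo "FAIL: $score < $threshold" >&2
    return 1
  fi
}

# Usage in the "Check resilience score" step, piping in the command from Step 5:
#   litmusctl get resilience-score --workflow post-deployment-workflow \
#     | check_resilience_score 80
```

Returning a distinct code when no score is found keeps a failed or empty lookup from being mistaken for a passing run.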

Learning Path

flowchart LR
  A[Chaos Mesh] --> B[LitmusChaos]
  B --> C[Gremlin]
  C --> D[AWS Fault Injection]
  D --> E[Azure Chaos Studio]
  style B fill:#f90,color:#fff

Common Errors

  1. Not connecting a chaos infrastructure agent before running experiments: The agent must be installed on the target cluster. Without it, experiments fail with infrastructure unavailable errors.
  2. Skipping probe configuration in experiments: Without probes, LitmusChaos cannot verify steady state during the experiment. Always configure HTTP or command probes.
  3. Running workflows without sequential dependencies: When multiple experiments run simultaneously, it is hard to attribute degradation to a specific fault. Use dependsOn for sequential execution.
  4. Forgetting to label namespaces as chaos targets: LitmusChaos uses namespace labels to identify safe targets. Label namespaces with litmuschaos.io/chaos: enabled.
  5. Overlooking the ChaosHub experiment version: ChaosHub experiments receive updates. Pin specific versions in your workflow to avoid unexpected changes.
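To make error 2 concrete, here is a rough sketch of what an HTTP probe can look like when attached to a pod-delete experiment through a ChaosEngine resource. The helper render_probe_engine, the engine name, the service account name, and the example URL are all placeholders invented for this example. The probe field names follow the ChaosEngine probe schema as best recalled, and some of them (notably the runProperties fields) have changed between Litmus versions, so treat this as a starting point and validate it against your cluster.

```shell
#!/usr/bin/env bash
# Prints a ChaosEngine manifest for pod-delete with one continuous HTTP probe.
# Arguments: target namespace, app label, URL to probe.
# Field names are assumptions to verify against your installed Litmus version.
render_probe_engine() {
  local ns="$1" label="$2" url="$3"
  cat <<EOF
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: pod-delete-with-probe
  namespace: ${ns}
spec:
  engineState: active
  appinfo:
    appns: ${ns}
    applabel: ${label}
    appkind: deployment
  chaosServiceAccount: pod-delete-sa   # placeholder service account name
  experiments:
    - name: pod-delete
      spec:
        probe:
          - name: service-health
            type: httpProbe
            mode: Continuous           # check throughout the chaos window
            httpProbe/inputs:
              url: ${url}
              method:
                get:
                  criteria: "=="
                  responseCode: "200"
            runProperties:
              probeTimeout: 5s
              interval: 2s
              attempt: 1
EOF
}

# Usage: review the manifest first, then validate it without applying:
#   render_probe_engine default app=nginx http://nginx.default.svc:80
#   render_probe_engine default app=nginx http://nginx.default.svc:80 \
#     | kubectl apply --dry-run=server -f -
```

Generating the manifest from a function keeps the namespace, label, and URL in one place, which makes it easier to reuse the same probe across the experiments in a workflow.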

Practice Questions

  1. What is ChaosHub and how does it simplify experiment creation?
  2. How do you create a sequential chaos workflow in LitmusChaos?
  3. What is a resilience score and how is it calculated?
  4. How do you integrate LitmusChaos with a GitHub Actions CI/CD pipeline?
  5. What are probes and why are they critical for experiment safety?

Challenge

Set up a LitmusChaos workflow that runs three sequential experiments after a deployment: pod-delete, network-loss with 30 percent packet loss, and cpu-hog at 80 percent utilization. Configure HTTP probes to verify service health throughout. Each experiment should only proceed if the previous one passes. Integrate the workflow into a GitHub Actions pipeline and verify the resilience score.

FAQ

What is LitmusChaos?

LitmusChaos is an open-source, cloud-native Chaos Engineering platform featuring workflow orchestration, a ChaosHub experiment marketplace, GitOps integration, and automated resilience scoring.

How does LitmusChaos compare to Chaos Mesh?

LitmusChaos focuses on workflow orchestration and CI/CD integration, while Chaos Mesh provides finer-grained fault types. Some teams use both together for broader coverage.

Can LitmusChaos run experiments on non-Kubernetes targets?

Yes. LitmusChaos supports Linux machines, AWS instances, and Azure VMs through its infrastructure agents, extending chaos experiments beyond Kubernetes.

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
