LitmusChaos — Cloud-Native Chaos Engineering
In this tutorial, you'll learn about LitmusChaos. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.
Litmus is an open-source Chaos Engineering platform built for cloud-native environments. It extends Kubernetes with chaos workflows, automated resilience scores, and GitOps integration for running experiments as part of your deployment pipeline.
What You Will Learn
This tutorial teaches you how to install LitmusChaos, create chaos experiments using ChaosHubs, and automate resilience testing in your CI/CD workflows.
Why It Matters
LitmusChaos brings Chaos Engineering into the development lifecycle. Instead of running ad-hoc experiments, you define resilience tests as code and execute them automatically on every deployment. This catches resilience regressions before they reach production.
Real-World Use
DodaTech integrates LitmusChaos into the deployment pipeline for Durga Antivirus Pro. Every canary deployment triggers a set of chaos experiments, all of which must pass before the release rolls out to more than 10 percent of users.
Prerequisites
Before starting, you should understand:
- Kubernetes operations and custom resource definitions
- Chaos Engineering fundamentals (steady state, hypothesis, blast radius)
- Basic CI/CD pipeline concepts
Step 1: Install LitmusChaos
Install the platform with the Litmus Helm chart:
# Install LitmusChaos using Helm
helm repo add litmuschaos https://litmuschaos.github.io/litmus-helm
helm install litmus litmuschaos/litmus \
  --namespace litmus \
  --create-namespace \
  --set portal.frontend.service.type=NodePort
# Check installation status
kubectl get pods -n litmus
# Expected output:
# NAME READY STATUS
# litmus-frontend-7d9f8c6b4f-abc1 1/1 Running
# litmus-server-5b7c8d9e4f-def2 1/1 Running
# mongo-0 1/1 Running
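In a script or CI job, nobody is watching the pod list, so the readiness check needs an exit status. A minimal sketch, assuming the litmus namespace from the install command above and an illustrative five-minute default timeout, wraps kubectl wait in a function. The name wait_for_litmus is a hypothetical helper, not part of Litmus:

```shell
# Sketch: block until every pod in the Litmus namespace reports Ready.
# wait_for_litmus is a hypothetical helper; the 300s default is illustrative.
wait_for_litmus() {
  # $1 = namespace (defaults to the one used by the Helm install)
  # $2 = timeout in a form kubectl wait accepts, e.g. 300s
  kubectl wait --for=condition=Ready pods --all \
    --namespace="${1:-litmus}" \
    --timeout="${2:-300s}"
}
```

Calling wait_for_litmus litmus 120s returns a non-zero exit status if any pod is still not Ready when the timeout expires, which is enough to fail a pipeline step.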
Step 2: Access the Litmus Portal
Port-forward to access the Litmus web UI:
kubectl port-forward svc/litmus-frontend-service 9091:9091 -n litmus
# Expected output:
# Forwarding from 127.0.0.1:9091 -> 9091
Open http://localhost:9091 in your browser. Create an admin account and set up a project.
Step 3: Create a Chaos Experiment from a ChaosHub
ChaosHub is Litmus's experiment marketplace. Browse and select pre-built experiments:
# Using litmusctl CLI to list experiments
litmusctl get experiments
# Expected output:
# EXPERIMENT NAME CHAOSHUB
# pod-delete litmuschaos
# node-cpu-hog litmuschaos
# network-loss litmuschaos
# kubelet-service-kill litmuschaos
# pod-autoscaler litmuschaos
Step 4: Run a Pod Delete Experiment
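Execute a chaos experiment directly from the CLI. Subcommand and flag names differ between litmusctl releases, so confirm the exact syntax with litmusctl create --help before relying on the command below: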
# Create a pod-delete experiment
litmusctl create experiment pod-delete \
  --target-namespace default \
  --app-label app=nginx \
  --duration 30s
# Expected output:
# ✅ Experiment pod-delete scheduled successfully
# Run 'litmusctl get experiments' to check status
Monitor the experiment execution:
litmusctl get experiments --status running
# Expected output:
# NAME TYPE STATUS TARGET
# pod-delete-x7k3 pod-delete Running default/nginx
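The listing shows that a run is in progress, not how it ended. On the cluster where the chaos executes, Litmus records each outcome in a ChaosResult resource whose status carries a phase and a verdict. A minimal sketch for viewing them with kubectl follows; chaos_results is a hypothetical helper name, and the default namespace is an assumption taken from the target namespace used in this step:

```shell
# Sketch: tabulate phase and verdict for every ChaosResult in a namespace.
# chaos_results is a hypothetical helper name.
chaos_results() {
  # $1 = namespace that holds the ChaosResult resources
  kubectl get chaosresult --namespace="${1:-default}" \
    -o custom-columns='NAME:.metadata.name,PHASE:.status.experimentStatus.phase,VERDICT:.status.experimentStatus.verdict'
}
```

A verdict of Awaited means the run has not finished; Pass or Fail appears once it completes.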
Step 5: Integrate with CI/CD Pipeline
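Define a ChaosSchedule that your pipeline applies after each deployment:
# litmus-chaos-pipeline.yaml
# Field names follow the ChaosSchedule and ChaosEngine CRDs; verify them
# against the Litmus version installed on your cluster.
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: post-deployment-chaos
spec:
  schedule:
    now: true                          # run once, as soon as it is applied
  engineTemplateSpec:
    engineState: active
    appinfo:
      appns: default
      applabel: app=nginx
      appkind: deployment
    chaosServiceAccount: pod-delete-sa # assumed name; must exist with RBAC for the experiment
    experiments:
      - name: pod-delete
        spec:
          probe:
            - name: http-probe
              type: httpProbe
              mode: Continuous         # check steady state throughout the run
              httpProbe/inputs:
                url: http://nginx-service:80
                method:
                  get:
                    criteria: "=="
                    responseCode: "200"
              runProperties:           # strings with units in Litmus 3.x; older releases take integers
                probeTimeout: 5s
                interval: 2s
                retry: 1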
Expected output when applied:
kubectl apply -f litmus-chaos-pipeline.yaml
chaosschedule.litmuschaos.io/post-deployment-chaos created
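Applying the manifest only starts the chaos run; the pipeline still needs a step that waits for the outcome and fails the job when the verdict is not Pass. The following is a minimal sketch of such a gate, not an official Litmus script. It assumes the run produces a ChaosResult whose name you look up with kubectl get chaosresult and pass in; fetch_verdict and wait_for_verdict are hypothetical helper names, and the retry defaults are illustrative:

```shell
# Sketch: fail a pipeline step unless a chaos run ends with verdict "Pass".
# fetch_verdict and wait_for_verdict are hypothetical helper names.

# Read the verdict from a ChaosResult ($1 = name, $2 = namespace).
# Errors are silenced because the resource may not exist yet early in the run.
fetch_verdict() {
  kubectl get chaosresult "$1" --namespace="$2" \
    -o jsonpath='{.status.experimentStatus.verdict}' 2>/dev/null
}

# Poll until the verdict is final, then map it to an exit status.
# $3 = max attempts (default 30), $4 = seconds between attempts (default 10)
wait_for_verdict() {
  attempts="${3:-30}"
  interval="${4:-10}"
  while [ "$attempts" -gt 0 ]; do
    verdict=$(fetch_verdict "$1" "$2")
    case "$verdict" in
      Pass)         echo "chaos verdict: Pass"; return 0 ;;
      Fail|Stopped) echo "chaos verdict: $verdict"; return 1 ;;
    esac
    # Empty or "Awaited": the run has not finished yet.
    attempts=$((attempts - 1))
    sleep "$interval"
  done
  echo "chaos verdict: timed out"
  return 2
}
```

A pipeline step would call wait_for_verdict with the ChaosResult name and namespace right after kubectl apply. Distinct exit codes for failure (1) and timeout (2) let the pipeline tell a failed resilience test apart from a run that never finished.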
Learning Path
flowchart LR
    A[Chaos Mesh] --> B[LitmusChaos]
    B --> C[Gremlin Platform]
    C --> D[AWS Fault Injection]
    D --> E[Azure Chaos Studio]
    style B fill:#f90,color:#fff
Common Errors
- Not setting up a ChaosHub before running experiments: ChaosHub provides the experiment definitions. Without it, LitmusChaos has no experiments to run.
- Forgetting to configure the chaos infrastructure agent: The chaos agent runs on the target cluster. If it is not connected, experiments will fail.
- Running experiments against targets without the expected labels: Litmus selects targets by namespace, kind, and application label (for example app=nginx). If the label matches no running workload, the experiment has nothing to act on, so verify labels before starting.
- Overlooking probe configurations: Probes verify steady state during the experiment. Without probes, you cannot confirm the system remained healthy.
- Scheduling too many concurrent experiments: Running multiple experiments simultaneously makes it impossible to isolate the cause of any observed degradation.
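The labeling mistake in the list above can be caught before any chaos is injected. A minimal pre-flight sketch, assuming the target is selected by a pod label such as app=nginx in the default namespace as in Step 4, counts the pods that match the selector and refuses to continue when there are none. The names count_targets and preflight are hypothetical helpers:

```shell
# Sketch: pre-flight check that a label selector matches at least one pod.
# count_targets and preflight are hypothetical helper names.

# Print how many pods match ($1 = namespace, $2 = label selector).
count_targets() {
  kubectl get pods --namespace="$1" --selector="$2" -o name | wc -l | tr -d ' '
}

# Return non-zero when the selector matches nothing, so a script can stop early.
preflight() {
  n=$(count_targets "$1" "$2")
  if [ "$n" -eq 0 ]; then
    echo "no pods match $2 in namespace $1" >&2
    return 1
  fi
  echo "$n target pod(s) match $2 in namespace $1"
}
```

Running preflight default app=nginx before scheduling the experiment turns a silent no-op into an explicit failure.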
Practice Questions
- What is ChaosHub in LitmusChaos?
- How do you create a chaos experiment using the Litmus CLI?
- What is the role of the chaos infrastructure agent?
- How can LitmusChaos be integrated into a CI/CD pipeline?
- What are probes and why are they important in Litmus experiments?
Challenge
Set up a LitmusChaos workflow that runs three sequential experiments after a deployment: pod-delete, network-loss, and node-cpu-hog. Each experiment must proceed only if the previous one passes. Configure HTTP probes to verify service health throughout.
Built by the developers of DodaTech
Doda Browser, DodaZIP & Durga Antivirus Pro