
AWS Fault Injection Service — Testing AWS Workloads

DodaTech Updated 2026-06-21 4 min read

This tutorial walks through AWS Fault Injection Service with a worked EC2 example, from IAM setup to monitoring a running experiment.

AWS Fault Injection Service (FIS) is a managed chaos engineering service for running fault injection experiments on AWS workloads. It provides pre-built fault actions for EC2, ECS, EKS, RDS, and other AWS services.

What You Will Learn

This tutorial teaches you how to use AWS FIS to create experiments, define action sequences, set safety controls, and run chaos experiments against your AWS infrastructure.

Why It Matters

AWS FIS removes the need to install and maintain separate chaos engineering tools. It integrates natively with AWS IAM, CloudWatch, and Systems Manager. You can run experiments against EC2 instances, ECS tasks, EKS pods, and RDS databases without third-party agents, although actions that run commands inside an instance rely on the AWS Systems Manager Agent.

Real-World Use

DodaTech uses AWS FIS to test the resilience of Durga Antivirus Pro scanning clusters running on EC2 Spot Instances. FIS experiments verify that the cluster can absorb instance terminations without interrupting ongoing malware scans.

Prerequisites

Before starting you should understand:

  • AWS console navigation and IAM basics
  • Chaos engineering concepts (hypothesis, steady state, blast radius)
  • How EC2, ECS, or EKS workloads are structured in AWS

Step 1: Set Up IAM Permissions

Create an IAM role that allows FIS to perform actions on your resources:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:TerminateInstances",
        "ec2:StopInstances",
        "ec2:StartInstances",
        "ec2:RebootInstances"
      ],
      "Resource": "arn:aws:ec2:us-east-1:*:instance/*"
    }
  ]
}

Attach this policy to a role named FISExperimentRole and add FIS as a trusted entity.
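
Adding FIS as a trusted entity means giving the role a trust policy that names the FIS service principal, fis.amazonaws.com. The sketch below builds such a policy document in Python. The function name and the output file name are illustrative, and the aws:SourceAccount condition is an optional hardening step rather than something this tutorial requires.

```python
import json


def fis_trust_policy(account_id=None):
    """Return a trust policy that lets the FIS service assume the role."""
    statement = {
        "Effect": "Allow",
        "Principal": {"Service": "fis.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }
    if account_id is not None:
        # Optional guard: only experiments in this account may assume the role.
        statement["Condition"] = {
            "StringEquals": {"aws:SourceAccount": account_id}
        }
    return {"Version": "2012-10-17", "Statement": [statement]}


if __name__ == "__main__":
    # 123456789012 is the placeholder account ID used throughout this tutorial.
    with open("fis-trust-policy.json", "w") as handle:
        json.dump(fis_trust_policy("123456789012"), handle, indent=2)
```

The generated file can then be passed to aws iam create-role with --role-name FISExperimentRole and --assume-role-policy-document file://fis-trust-policy.json.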

Step 2: Create an Experiment Template

Navigate to the AWS FIS console and create an experiment template:

# Using AWS CLI to create an experiment template
aws fis create-experiment-template \
  --cli-input-json file://ec2-stop-template.json

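Contents of ec2-stop-template.json. The startInstancesAfterDuration parameter (an ISO 8601 duration) tells FIS to restart the instance after 60 seconds, which requires ec2:StartInstances on the experiment role:

{
  "description": "Stop a single EC2 instance for 60 seconds",
  "targets": {
    "instanceTarget": {
      "resourceType": "aws:ec2:instance",
      "resourceArns": ["arn:aws:ec2:us-east-1:123456789012:instance/i-0abc123def456"]
    }
  },
  "actions": {
    "stopInstance": {
      "actionId": "aws:ec2:stop-instances",
      "parameters": {
        "startInstancesAfterDuration": "PT1M"
      },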
      "targets": {
        "Instances": "instanceTarget"
      }
    }
  },
  "stopConditions": [
    {
      "source": "aws:cloudwatch:alarm",
      "value": "arn:aws:cloudwatch:us-east-1:123456789012:alarm:FISErrorRateAlarm"
    }
  ],
  "roleArn": "arn:aws:iam::123456789012:role/FISExperimentRole"
}

Expected output:

{
    "experimentTemplate": {
        "id": "ext-abc123def456",
        "description": "Stop a single EC2 instance for 60 seconds"
    }
}

Step 3: Set Stop Conditions

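A stop condition is a CloudWatch alarm that FIS watches while the experiment runs. If the alarm enters the ALARM state, FIS stops the experiment. The template in Step 2 references this alarm by ARN, so create it first: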

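# Create a CloudWatch alarm that will stop the experiment.
# The alarm must watch a metric that your workload actually emits:
# MyApp and ErrorRate are placeholders for your own namespace and metric.
aws cloudwatch put-metric-alarm \
  --alarm-name FISErrorRateAlarm \
  --alarm-description "Stop FIS experiment if error rate exceeds 5%" \
  --metric-name ErrorRate \
  --namespace MyApp \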
  --statistic Average \
  --period 60 \
  --threshold 5.0 \
  --comparison-operator GreaterThanThreshold \
  --evaluation-periods 1

# Expected output (no output on success)
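
Because a template without a working stop condition has no automatic brake, it can help to check the template file before submitting it. The following is a minimal sketch of such a check, assuming the template has been loaded into a Python dict with the same shape as ec2-stop-template.json. The function name is hypothetical, and the check only inspects the ARN's format, not whether the alarm exists.

```python
def stop_condition_problems(template):
    """Return a list of human-readable problems with the stop conditions."""
    problems = []
    conditions = template.get("stopConditions", [])
    alarms = [c for c in conditions if c.get("source") == "aws:cloudwatch:alarm"]
    if not alarms:
        problems.append("no CloudWatch alarm stop condition configured")
    for condition in alarms:
        arn = condition.get("value", "")
        # Expected shape: arn:aws:cloudwatch:<region>:<account>:alarm:<name>
        parts = arn.split(":")
        if len(parts) < 7 or parts[2] != "cloudwatch" or parts[5] != "alarm":
            problems.append(f"not a CloudWatch alarm ARN: {arn!r}")
    return problems


if __name__ == "__main__":
    import json

    with open("ec2-stop-template.json") as handle:
        for problem in stop_condition_problems(json.load(handle)):
            print("WARNING:", problem)
```

An empty list means the template names at least one well-formed alarm ARN. It does not prove that the alarm will fire when the workload degrades, so the alarm itself still needs testing.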

Step 4: Start the Experiment

Run the experiment from the console or CLI:

aws fis start-experiment \
  --experiment-template-id ext-abc123def456

# Expected output:
# {
#     "experiment": {
#         "id": "exp-xyz789ghi012",
#         "experimentTemplateId": "ext-abc123def456",
#         "state": {
#             "status": "running"
#         }
#     }
# }
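
The same start call can be made from a script. The sketch below wraps it in a small helper that takes the FIS client as a parameter, so it works with boto3.client("fis") but can also be exercised with a stand-in object. The helper name and the Purpose tag are illustrative choices, not part of the tutorial's setup.

```python
import uuid


def start_experiment(client, template_id, purpose):
    """Start an experiment from a template and return (experiment_id, status)."""
    response = client.start_experiment(
        # A unique token makes a retried request idempotent.
        clientToken=str(uuid.uuid4()),
        experimentTemplateId=template_id,
        # Tagging the run makes it easier to find later.
        tags={"Purpose": purpose},
    )
    experiment = response["experiment"]
    return experiment["id"], experiment["state"]["status"]


# Example, assuming boto3 is installed and credentials are configured:
#   import boto3
#   start_experiment(boto3.client("fis"), "ext-abc123def456", "spot-resilience-test")
```

Returning the experiment ID directly is convenient because the monitoring step needs it.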

Step 5: Monitor the Experiment

Check the experiment status, rerunning the command until the experiment reaches a final state:

aws fis get-experiment \
  --id exp-xyz789ghi012

# Expected output:
# {
#     "experiment": {
#         "id": "exp-xyz789ghi012",
#         "state": {
#             "status": "completed"
#         },
#         "actions": {
#             "stopInstance": {
#                 "actionId": "aws:ec2:stop-instances",
#                 "state": {
#                     "status": "completed"
#                 }
#             }
#         }
#     }
# }
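
Since get-experiment returns a single snapshot, monitoring in practice means polling. The sketch below shows one way to do that in Python. It assumes a client object exposing get_experiment(id=...) as the boto3 FIS client does, and it treats completed, stopped, failed, and cancelled as final states. The function name, polling interval, and timeout are illustrative.

```python
import time

# Statuses after which the experiment will not change any more.
TERMINAL_STATUSES = {"completed", "stopped", "failed", "cancelled"}


def wait_for_experiment(client, experiment_id, poll_seconds=10,
                        timeout_seconds=900, sleep=time.sleep):
    """Poll until the experiment reaches a final state and return that state."""
    waited = 0
    while True:
        response = client.get_experiment(id=experiment_id)
        state = response["experiment"]["state"]
        if state["status"] in TERMINAL_STATUSES:
            return state
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"experiment {experiment_id} still {state['status']} "
                f"after {waited} seconds"
            )
        sleep(poll_seconds)
        waited += poll_seconds


# Example, assuming boto3 is installed and credentials are configured:
#   import boto3
#   print(wait_for_experiment(boto3.client("fis"), "exp-xyz789ghi012"))
```

A stopped status usually means a stop condition fired or someone halted the run, so the caller should inspect the returned state rather than treat any final status as success.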

Learning Path

flowchart LR
  A[Gremlin Platform] --> B[AWS Fault Injection Service]
  B --> C[Azure Chaos Studio]
  C --> D[Latency Injection]
  D --> E[Fault Injection Proxy]
  style B fill:#f90,color:#fff

Common Errors

  1. Insufficient IAM permissions for the FIS role: The FIS role must have permissions for the actions it will perform. Check the IAM policy carefully.
  2. Missing or misconfigured stop conditions: Without stop conditions an experiment might run longer than intended. Always configure at least one CloudWatch alarm.
  3. Targeting production resources accidentally: Double-check resource ARNs before starting an experiment. Use tags to identify safe targets.
  4. Experiment fails because target resource is already stopped: FIS cannot stop an already stopped instance. Verify the target state before starting.
  5. Cross-region resource ARN mismatch: Ensure resource ARNs match the region where the experiment runs. ARNs are region-specific.
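
Errors 3 and 4 in the list above can both be reduced by selecting targets through tags and instance state instead of hard-coded ARNs. The sketch below builds such a target definition for the "targets" section of an experiment template. The function name and the ChaosReady tag are hypothetical, so use whatever tag your team applies to instances that are safe to disrupt.

```python
import json


def tagged_ec2_target(tag_key, tag_value, count=1):
    """Build an FIS target that picks running EC2 instances by tag."""
    return {
        "resourceType": "aws:ec2:instance",
        # Only instances carrying this tag are eligible.
        "resourceTags": {tag_key: tag_value},
        # Skip instances that are already stopped or terminating.
        "filters": [{"path": "State.Name", "values": ["running"]}],
        # Limit the blast radius to a fixed number of instances.
        "selectionMode": f"COUNT({count})",
    }


if __name__ == "__main__":
    targets = {"instanceTarget": tagged_ec2_target("ChaosReady", "true")}
    print(json.dumps(targets, indent=2))
```

With this target, an untagged production instance cannot be selected by accident, and the selection mode caps how many instances a single run can affect.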

Practice Questions

  1. What IAM role configuration is required for AWS FIS experiments?
  2. How do stop conditions work in AWS FIS?
  3. What AWS resources can you target with FIS experiments?
  4. How do you create an experiment template using the AWS CLI?
  5. What happens when a stop condition alarm is triggered?

Challenge

Create an AWS FIS experiment that terminates one EC2 instance in an Auto Scaling group. Configure a CloudWatch alarm on the group's healthy instance count as a stop condition. Start the experiment and verify that the Auto Scaling group launches a replacement instance within five minutes.

FAQ

What is AWS Fault Injection Service?

AWS FIS is a managed chaos engineering service that lets you run fault injection experiments on AWS workloads using pre-built actions and safety controls.

Which AWS services are supported by FIS?

FIS supports EC2, ECS, EKS, RDS, DynamoDB, and more. AWS adds services and actions over time, so check the FIS actions reference for the current list.

Do I need to install any agents for AWS FIS?

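Not for API-driven actions such as stopping or terminating EC2 instances, which FIS performs through AWS APIs. Actions that run inside the instance, such as the aws:ssm:send-command CPU stress action, need the AWS Systems Manager Agent running on the target.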

How does AWS FIS pricing work?

FIS bills by the action-minute: each minute that an action runs in an experiment is charged. See the AWS FIS pricing page for current rates and any free-tier terms.

Can AWS FIS roll back experiments automatically?

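Only for some actions. When an action ends or a stop condition fires, FIS stops injecting the fault, and some actions restore state themselves: aws:ec2:stop-instances restarts the instance if you set startInstancesAfterDuration. Without that parameter the instance stays stopped, and a terminated instance cannot be recovered at all. Plan recovery for those cases separately.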
