Fuzz Testing: Automating Input Validation and Security Checks
Fuzz testing is an automated testing technique that feeds invalid, unexpected, or random data to a program to discover crashes, memory leaks, assertion failures, and security vulnerabilities that traditional tests miss.
What You'll Learn
In this tutorial, you'll learn mutation-based and generation-based fuzzing techniques, how to use tools like AFL and libFuzzer, and how to integrate fuzz testing into your CI/CD pipeline for continuous security validation.
Why This Matters
Traditional unit tests verify expected behavior. Fuzz testing finds unexpected behavior: the edge cases no developer thought to test. Google's OSS-Fuzz project has found over 30,000 bugs in critical open-source software. Doda Browser uses fuzz testing on its HTML parser to catch malformed input that could crash the rendering engine, so that a corrupted web page is far less likely to crash a tab.
Learning Path
flowchart LR
    A[Security Testing] --> B["Fuzz Testing<br/>You are here"]
    B --> C[Mutation-Based Fuzzing]
    B --> D[Generation-Based Fuzzing]
    C --> E[AFL Setup]
    D --> F[libFuzzer]
    E --> G[CI/CD Integration]
    style B fill:#f90,color:#fff
Mutation-Based Fuzzing
Mutation fuzzing takes valid input and applies small random changes — bit flips, byte swaps, insertions, deletions — to create test cases. It works well when you have a corpus of valid inputs.
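To make the four mutation operators concrete, here is a minimal sketch in C. It is not how AFL implements its mutators; the function names (`mutate_bit_flip`, `mutate_once`, and so on) are made up for illustration, and `rand()` stands in for a proper random source.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Flip one randomly chosen bit in place. No-op on empty input. */
static void mutate_bit_flip(uint8_t *buf, size_t len) {
    if (len == 0) return;
    size_t pos = (size_t)rand() % len;
    buf[pos] ^= (uint8_t)(1u << (rand() % 8));
}

/* Swap two randomly chosen bytes in place. */
static void mutate_byte_swap(uint8_t *buf, size_t len) {
    if (len < 2) return;
    size_t a = (size_t)rand() % len;
    size_t b = (size_t)rand() % len;
    uint8_t tmp = buf[a];
    buf[a] = buf[b];
    buf[b] = tmp;
}

/* Insert one random byte; returns the new length (unchanged if buf is full). */
static size_t mutate_insert(uint8_t *buf, size_t len, size_t cap) {
    if (len >= cap) return len;
    size_t pos = (size_t)rand() % (len + 1);
    memmove(buf + pos + 1, buf + pos, len - pos);
    buf[pos] = (uint8_t)(rand() & 0xFF);
    return len + 1;
}

/* Delete one randomly chosen byte; returns the new length. */
static size_t mutate_delete(uint8_t *buf, size_t len) {
    if (len == 0) return 0;
    size_t pos = (size_t)rand() % len;
    memmove(buf + pos, buf + pos + 1, len - pos - 1);
    return len - 1;
}

/* Copy seed into out (capacity cap) and apply one random mutation.
 * Returns the length of the mutated test case. */
size_t mutate_once(const uint8_t *seed, size_t seed_len,
                   uint8_t *out, size_t cap) {
    size_t len = seed_len < cap ? seed_len : cap;
    memcpy(out, seed, len);
    switch (rand() % 4) {
    case 0:  mutate_bit_flip(out, len);  return len;
    case 1:  mutate_byte_swap(out, len); return len;
    case 2:  return mutate_insert(out, len, cap);
    default: return mutate_delete(out, len);
    }
}
```

Calling `mutate_once` repeatedly on the same seed yields test cases that differ from the valid input by a single small edit, which is why a good seed corpus matters: most mutants stay close enough to valid to get past the parser's first checks.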
Example: Fuzzing a JSON Parser with AFL
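// json_parser.c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Returns 0 if data starts with '{', ends with '}', and braces balance; -1 otherwise.
int parse_json(const char *data, size_t size) {
    if (size < 2) return -1;
    if (data[0] != '{') return -1;
    if (data[size - 1] != '}') return -1;
    // Count matching braces (simplified)
    int depth = 0;
    for (size_t i = 0; i < size; i++) {
        if (data[i] == '{') depth++;
        if (data[i] == '}') depth--;
        if (depth < 0) return -1;
    }
    if (depth != 0) return -1;
    return 0;
}

// Entry point for AFL: afl-fuzz replaces @@ with the path of each test case.
int main(int argc, char **argv) {
    if (argc < 2) return 1;
    FILE *f = fopen(argv[1], "rb");
    if (!f) return 1;
    static char buf[1 << 16];
    size_t n = fread(buf, 1, sizeof buf, f);
    fclose(f);
    parse_json(buf, n);
    return 0;
}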
To fuzz this with AFL, you compile with AFL's instrumentation:
afl-gcc json_parser.c -o json_parser_fuzz
afl-fuzz -i testcases/ -o findings/ ./json_parser_fuzz @@
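AFL mutates the seed files in testcases/ and monitors the target for crashes and hangs. The status screen below is illustrative of a run against a buggier target; the simplified parser above has no memory-safety bug, so a real run against it should not report crashes: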
american fuzzy lop ++2.58d (json_parser_fuzz) [fast]
┌─ process timing ─────────────────────────────────────┐
│ run time : 0 days, 0 hrs, 5 min, 12 sec │
│ last new path : 0 days, 0 hrs, 0 min, 3 sec │
│ last uniq crash : 0 days, 0 hrs, 2 min, 41 sec │
├─ cycle progress ──────────────────────────────────────┤
│ now processing : 42.53 (236/555) │
│ paths timed out : 0 (0.00) │
├─ stage progress ──────────────────────────────────────┤
│ now trying : splice 2 │
│ stage execs : 128/256 (50.00) │
├─ findings in depth ───────────────────────────────────┤
│ favored paths : 42 │
│ new edges on : 38 │
│ total crashes : 3 │
│ total tmouts : 0 │
└───────────────────────────────────────────────────────┘
The status screen reports three crashes after about five minutes of fuzzing. AFL saves each crashing input in a crashes/ directory under findings/ so it can be replayed against the target.
Generation-Based Fuzzing
Generation fuzzing creates input from scratch based on a specification of the valid format. It doesn't need a seed corpus.
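As a rough illustration of generating input from a specification, the sketch below builds documents from a tiny, made-up grammar: an object is `{`, zero to three `"kN":value` members separated by commas, then `}`; a value is either a nested object or a small integer. The grammar, the `OutBuf` type, and `generate_json` are assumptions for this example, not part of any fuzzing tool.

```c
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Append-only output buffer; text that does not fit is dropped and flagged. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
    int overflow;
} OutBuf;

static void put(OutBuf *o, const char *s) {
    for (; *s; s++) {
        if (o->len + 1 >= o->cap) { o->overflow = 1; return; }
        o->data[o->len++] = *s;
    }
    o->data[o->len] = '\0';
}

static void gen_object(OutBuf *o, int depth);

/* value := object | integer (objects only while depth remains) */
static void gen_value(OutBuf *o, int depth) {
    if (depth > 0 && rand() % 2 == 0) {
        gen_object(o, depth - 1);
    } else {
        char num[16];
        snprintf(num, sizeof num, "%d", rand() % 1000);
        put(o, num);
    }
}

/* object := '{' [ member { ',' member } ] '}' with 0..3 members */
static void gen_object(OutBuf *o, int depth) {
    int members = rand() % 4;
    put(o, "{");
    for (int i = 0; i < members; i++) {
        char key[16];
        if (i > 0) put(o, ",");
        snprintf(key, sizeof key, "\"k%d\":", i);
        put(o, key);
        gen_value(o, depth);
    }
    put(o, "}");
}

/* Generate one document into buf.
 * Returns its length, or 0 if it did not fit in cap bytes. */
size_t generate_json(char *buf, size_t cap, int max_depth) {
    if (cap == 0) return 0;
    OutBuf o = { buf, 0, cap, 0 };
    buf[0] = '\0';
    gen_object(&o, max_depth);
    return o.overflow ? 0 : o.len;
}
```

Every generated document is well-formed by construction, so it exercises code beyond the parser's early rejection checks. A practical generator would also emit deliberately invalid variants (for example by dropping a closing brace) to reach error-handling paths.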
Example: Fuzzing an HTTP Server with libFuzzer
// http_fuzz.cpp
#include <cstdint>
#include <cstddef>

extern "C" int parse_http_request(const uint8_t *data, size_t size);

extern "C" int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_http_request(data, size);
    return 0;
}
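libFuzzer needs neither a format specification nor a seed corpus to get started: with an empty corpus it begins with tiny inputs and mutates them. Strictly speaking it is a coverage-guided, mutation-based engine, so this harness feeds the parser raw byte sequences rather than requests derived from the HTTP specification. Compiler instrumentation tracks code coverage, and libFuzzer keeps any input that reaches new code and mutates it further. Truly generation-based fuzzing on top of libFuzzer requires a structure-aware custom mutator.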
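The harness only declares `parse_http_request`, so something has to define it before the binary will link. The stand-in below is a hypothetical, simplified parser written for this example, not the real function under test. It checks only the request line (`METHOD SP PATH SP HTTP/1.x CRLF`) and ignores everything after it.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the parser under test.
 * Returns 0 if the input starts with a well-formed request line, -1 otherwise. */
#ifdef __cplusplus
extern "C"
#endif
int parse_http_request(const uint8_t *data, size_t size) {
    static const char *const methods[] = { "GET", "POST", "HEAD" };
    if (size == 0) return -1;

    /* Method: the bytes before the first space must be a known method. */
    const uint8_t *sp1 = (const uint8_t *)memchr(data, ' ', size);
    if (sp1 == NULL) return -1;
    size_t method_len = (size_t)(sp1 - data);
    int known = 0;
    for (size_t i = 0; i < sizeof methods / sizeof methods[0]; i++) {
        if (strlen(methods[i]) == method_len &&
            memcmp(data, methods[i], method_len) == 0) {
            known = 1;
        }
    }
    if (!known) return -1;

    /* Path: non-empty, starts with '/', ends at the next space. */
    const uint8_t *path = sp1 + 1;
    size_t rest = size - method_len - 1;
    if (rest == 0 || path[0] != '/') return -1;
    const uint8_t *sp2 = (const uint8_t *)memchr(path, ' ', rest);
    if (sp2 == NULL) return -1;

    /* Version: "HTTP/1.0" or "HTTP/1.1" followed by CRLF. */
    const uint8_t *ver = sp2 + 1;
    size_t ver_len = (size_t)(data + size - ver);
    if (ver_len < 10) return -1;
    if (memcmp(ver, "HTTP/1.", 7) != 0) return -1;
    if (ver[7] != '0' && ver[7] != '1') return -1;
    if (ver[8] != '\r' || ver[9] != '\n') return -1;
    return 0;
}
```

Every read is preceded by a length check, which is the property the fuzzer is there to verify. Removing one of those checks is a quick way to see what a finding looks like.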
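# http_parser.cpp is a placeholder for whichever source file defines parse_http_request
clang++ -g -fsanitize=fuzzer http_fuzz.cpp http_parser.cpp -o http_fuzz
./http_fuzz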
Illustrative output after some time:
INFO: Seed: 12345678
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096
INFO: A corpus is not provided; starting from an empty corpus
#2 pulse cov: 4 ft: 5 corp: 1 exec/s: 1 rss: 28Mb
#13 pulse cov: 8 ft: 12 corp: 3 exec/s: 2 rss: 28Mb
#104 pulse cov: 15 ft: 28 corp: 8 exec/s: 17 rss: 29Mb
#512 NEW cov: 22 ft: 45 corp: 15 exec/s: 85 rss: 30Mb
#2048 NEW cov: 31 ft: 72 corp: 22 exec/s: 341 rss: 32Mb
#8192 pulse cov: 35 ft: 89 corp: 28 exec/s: 1365 rss: 35Mb
=================================================================
==12345== ERROR: libFuzzer: deadly signal
#0 0x4a1b2c in parse_http_request http_fuzz.cpp:12:3
#1 0x4a1b2c in LLVMFuzzerTestOneInput http_fuzz.cpp:17
SUMMARY: libFuzzer: deadly signal
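The number after # is how many inputs libFuzzer has executed so far, not a path ID. NEW marks an input that reached new coverage and was added to the corpus; pulse is a periodic status line. In this run the crash came after more than 8,192 executions. "deadly signal" means the target process received a fatal signal such as SIGSEGV or SIGABRT. libFuzzer writes the crashing input to a crash-<hash> file in the current directory (or under -artifact_prefix if set) so you can replay it.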
Integrating Fuzzing in CI/CD
# .github/workflows/fuzz.yml
name: Fuzz Testing
on:
  schedule:
    - cron: '0 6 * * *' # Daily at 06:00 UTC
  workflow_dispatch:
jobs:
  fuzz:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build with fuzzer
        run: |
          mkdir -p build
          clang++ -fsanitize=fuzzer,address \
            src/parser.cpp -o build/parser_fuzz
      - name: Run fuzzer for 10 minutes
        run: |
          # corpus/ may not be checked in yet; findings/ receives crash files
          mkdir -p corpus findings
          ./build/parser_fuzz \
            -max_total_time=600 \
            -artifact_prefix=findings/ \
            corpus/
      - name: Upload crashes
        # Run even if the fuzzer step failed, which is exactly when crash files exist
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: crash-artifacts
          path: findings/
      - name: Check for new crashes
        if: always()
        run: |
          if [ -n "$(ls -A findings/ 2>/dev/null)" ]; then
            echo "Crashes found!" && exit 1
          fi
A clean run looks roughly like this (a summary, not literal GitHub Actions log output):
Build with fuzzer ......................... (pass)
Run fuzzer for 10 minutes ................ 600s elapsed, 0 crashes
Upload crashes ........................... no artifacts
Check for new crashes .................... (pass)
Types of Fuzzing
| Type | Approach | Best For | Example Tool |
|---|---|---|---|
| Mutation | Modifies existing inputs | When you have a seed corpus | AFL, libFuzzer |
| Generation | Creates inputs from spec | When format is well-defined | Peach |
| Grammar-based | Uses a formal grammar | Complex protocols like HTTP | Grammarinator |
| Protocol | Understands protocol structure | Network services | Boofuzz |
Common Errors
1. Fuzzing Without Coverage Feedback
Blind fuzzing (random input without coverage tracking) rarely finds deep bugs. Always use coverage-guided fuzzing when possible.
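The toy loop below sketches why coverage feedback helps. The target reaches one extra "edge" for each leading byte that matches `FUZZ`, and the loop keeps any mutant that reaches an edge not seen before. Everything here (the hand-recorded `edges_hit` array, `fuzz_guided`, the four-byte magic value) is an assumption made for illustration; real fuzzers get coverage from compiler instrumentation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_EDGES  8
#define MAX_INPUT  8
#define MAX_CORPUS 16

typedef struct {
    uint8_t data[MAX_INPUT];
    size_t len;
} Input;

/* Edges reached by the most recent run of target(). */
static uint8_t edges_hit[MAX_EDGES];

/* Toy target: each additional matching byte of "FUZZ" reaches a new edge. */
static void target(const uint8_t *d, size_t n) {
    static const char magic[] = "FUZZ";
    edges_hit[0] = 1;
    for (size_t i = 0; i < 4 && i < n && d[i] == (uint8_t)magic[i]; i++) {
        edges_hit[i + 1] = 1;
    }
}

/* Run max_execs mutated inputs, keeping those that reach new edges.
 * Returns the number of distinct edges reached (at most 5 here). */
size_t fuzz_guided(unsigned seed, size_t max_execs) {
    uint8_t global[MAX_EDGES] = { 0 };
    Input corpus[MAX_CORPUS];
    size_t corpus_len = 1;
    size_t covered = 0;

    memset(&corpus[0], 0, sizeof corpus[0]);
    corpus[0].len = 4;                       /* seed: four zero bytes */
    srand(seed);

    for (size_t e = 0; e < max_execs; e++) {
        /* Pick a corpus entry and overwrite one random byte. */
        Input cand = corpus[(size_t)rand() % corpus_len];
        cand.data[(size_t)rand() % cand.len] = (uint8_t)(rand() & 0xFF);

        memset(edges_hit, 0, sizeof edges_hit);
        target(cand.data, cand.len);

        /* Feedback: did this input reach an edge nobody reached before? */
        int is_new = 0;
        for (size_t i = 0; i < MAX_EDGES; i++) {
            if (edges_hit[i] && !global[i]) {
                global[i] = 1;
                covered++;
                is_new = 1;
            }
        }
        if (is_new && corpus_len < MAX_CORPUS) {
            corpus[corpus_len++] = cand;     /* keep it for further mutation */
        }
    }
    return covered;
}
```

Because an input starting with `F` is kept once found, the loop only has to guess the remaining bytes one at a time. A blind fuzzer gets no credit for partial matches and has to guess all four bytes at once, one chance in 2^32 per uniformly random attempt.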
2. Ignoring Found Crashes
Treat every fuzzer crash as a real bug until triage proves otherwise; a few turn out to be bugs in the harness itself. Triage each one, file a bug, and add the crashing input as a regression test.
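One way to turn a saved crash into a regression test is to replay the file through the fuzz target in an ordinary test binary. The sketch below is a minimal version of that idea; `replay_file`, the `FuzzTarget` typedef, and `placeholder_target` are names invented for this example. A real test would pass `LLVMFuzzerTestOneInput` instead of the placeholder. A libFuzzer binary also replays any files passed as arguments, which is often enough for manual triage.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Same signature as LLVMFuzzerTestOneInput. */
typedef int (*FuzzTarget)(const uint8_t *data, size_t size);

/* Read one saved input and run it through the target once.
 * Returns 0 if the file was read and replayed, -1 on I/O or allocation error. */
int replay_file(const char *path, FuzzTarget target) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;

    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long end = ftell(f);
    if (end < 0) { fclose(f); return -1; }
    rewind(f);

    size_t size = (size_t)end;
    uint8_t *buf = (uint8_t *)malloc(size ? size : 1);
    if (buf == NULL) { fclose(f); return -1; }

    size_t got = fread(buf, 1, size, f);
    fclose(f);

    int rc = (got == size) ? 0 : -1;
    if (rc == 0) {
        /* If the bug is still present, the target crashes here and the test fails. */
        target(buf, size);
    }
    free(buf);
    return rc;
}

/* Placeholder so the sketch links on its own; it only records the input size. */
static size_t last_size_seen;

int placeholder_target(const uint8_t *data, size_t size) {
    (void)data;
    last_size_seen = size;
    return 0;
}
```

Checking the crash files into the repository next to the seed corpus and replaying all of them on every build keeps a fixed bug from quietly returning.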
3. Fuzzing Only Happy Path Code
The most valuable crashes come from error handling paths, boundary conditions, and resource management code — the parts developers typically under-test.
4. Running Fuzzer Without Address Sanitizer
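Many memory errors never crash on their own; they silently corrupt data instead. ASan turns out-of-bounds accesses and use-after-free into immediate, reportable failures, and UBSan flags undefined behavior such as signed integer overflow. Without sanitizers, many of the bugs the fuzzer triggers go unnoticed.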
5. Stopping After Finding One Crash
The first crash is rarely the only bug. Fix it and continue fuzzing — the next path might reveal five more.
Practice Questions
1. What is the difference between mutation and generation fuzzing? Mutation fuzzing modifies existing valid inputs to create test cases. Generation fuzzing creates inputs from scratch based on a format specification.
2. Why is coverage-guided fuzzing better than random fuzzing? Coverage-guided fuzzing tracks which code paths have been exercised and prioritizes mutations that explore new paths, reaching deeper bugs much faster.
3. What does address sanitizer (ASan) detect? ASan detects memory errors like buffer overflows, use-after-free, and memory leaks at runtime, making fuzzer crashes meaningful and actionable.
4. How do you add a fuzzer crash as a regression test? Save the crashing input file in your test corpus and add it to your unit test suite so the bug is re-tested on every build.
5. What is the OSS-Fuzz project? Google's OSS-Fuzz is a free fuzzing service for open-source projects that runs continuously and automatically files bugs when crashes are found.
Challenge: Write a simple fuzz harness for a CSV parser function that handles quoted fields, escaped characters, and malformed rows. Run it with libFuzzer for 10 minutes and triage any crashes found.
Real-World Task: Fuzz an Image Processing Library
Set up a mutation fuzzing pipeline for an image thumbnail generator. Start with a corpus of valid PNG files, run AFL for 30 minutes, and report any crashes. Then fix the found bugs and add regression tests.
Steps:
- Compile the thumbnail generator with AFL instrumentation
- Create a seed corpus with 5 valid PNG images
- Run AFL with -t 100 for timeout detection
- Triage any crashes found
- Add the crashing files as regression tests
What's Next
| Tutorial | What You'll Learn |
|---|---|
| Security Testing Guide | Broader security testing techniques |
| CI/CD | Automating all test types including fuzzing |
| Mutation Testing Guide | Testing your tests themselves |
Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.