hyperfine — CLI Benchmarking & Command Timing

DodaTech Updated 2026-06-24 7 min read

In this tutorial, you'll learn about hyperfine. We cover key concepts, practical examples, and best practices to help you understand and apply this topic effectively.

hyperfine is a command-line benchmarking tool that runs shell commands multiple times, supports warm-up runs and per-run preparation commands, and reports the mean, standard deviation, and range of the measured times along with a relative speed comparison between commands.

What You'll Learn

How to benchmark any CLI command with hyperfine, compare multiple tools and approaches, use parameterized benchmarks to test different inputs, export results to JSON and Markdown for reports, and integrate benchmarking into your development workflow.

Why hyperfine Matters

Optimizing code or choosing between tools requires data. The naive approach, running time command once, produces unreliable results due to caching, thermal throttling, and system noise. By default hyperfine performs at least 10 timing runs per command (more for fast commands), can run untimed warm-up runs first, and reports the mean and standard deviation so that a difference between two commands can be judged against run-to-run noise. DodaZIP's compression team uses hyperfine to benchmark every code change against the previous commit to catch performance regressions early.

Learning Path

flowchart LR
  A[mise/asdf] --> B["hyperfine<br/>You are here"]
  B --> C[Difftastic]
  B --> D[procs & bottom]
  style B fill:#f90,color:#fff

Installation

# macOS
brew install hyperfine

# Ubuntu/Debian
sudo apt install hyperfine

# Fedora
sudo dnf install hyperfine

# Arch
sudo pacman -S hyperfine

# Download binary
curl -LO https://github.com/sharkdp/hyperfine/releases/download/v1.18.0/hyperfine_1.18.0_amd64.deb
sudo dpkg -i hyperfine_1.18.0_amd64.deb

# Cargo
cargo install hyperfine

Basic Usage

# Benchmark a single command
hyperfine "sleep 1"

# With custom run count
hyperfine -r 50 "ls -la"

# Show output (not hidden)
hyperfine --show-output "echo hello"

Expected output:

Benchmark 1: sleep 1
  Time (mean ± σ):      1.002 s ±  0.003 s    [User: 0.001 s, System: 0.000 s]
  Range (min … max):    1.000 s …  1.010 s    10 runs
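
The mean and standard deviation on the Time line can be reproduced from raw timings. The sketch below is a minimal illustration, assuming the sample standard deviation (n - 1 in the denominator). The function name mean_stddev and the four timings are hypothetical and not part of hyperfine.

```shell
#!/usr/bin/env bash
# mean_stddev: print the mean and sample standard deviation of its arguments.
# Hypothetical helper for illustration; hyperfine computes these itself.
mean_stddev() {
    printf '%s\n' "$@" | awk '
        { sum += $1; values[NR] = $1 }
        END {
            mean = sum / NR
            for (i = 1; i <= NR; i++) sq += (values[i] - mean) ^ 2
            # Sample standard deviation; defined only for two or more values.
            sd = (NR > 1) ? sqrt(sq / (NR - 1)) : 0
            printf "%.4f %.4f\n", mean, sd
        }'
}

# Four made-up timings in seconds.
mean_stddev 1.000 1.002 1.004 1.010
```

A single slow run (1.010 s here) raises the standard deviation noticeably, which is why the spread matters as much as the mean when comparing commands.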

Comparing Multiple Commands

# Compare two tools
hyperfine "rg 'function' src/" "grep -r 'function' src/"

# Compare three or more
hyperfine "python3 script.py" "node script.js" "deno script.js"

# Compare with custom names
hyperfine --command-name "ripgrep" "rg 'function' src/" \
          --command-name "grep" "grep -r 'function' src/"

Expected output:

Benchmark 1: ripgrep
  Time (mean ± σ):      23.1 ms ±   1.2 ms    [User: 18.5 ms, System: 4.2 ms]
  Range (min … max):    21.5 ms …  27.8 ms    100 runs

Benchmark 2: grep
  Time (mean ± σ):     456.2 ms ±  12.3 ms    [User: 412.3 ms, System: 43.5 ms]
  Range (min … max):   441.0 ms … 478.5 ms    10 runs

Summary
  ripgrep ran
   19.75 ± 1.16 times faster than grep
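
The "times faster" figure in a summary is the ratio of the two means, and its uncertainty follows from the two standard deviations. The sketch below assumes the standard propagation-of-uncertainty formula for a quotient of independent quantities. speed_ratio is a hypothetical helper name, and the inputs are made-up values chosen to keep the arithmetic easy to follow.

```shell
#!/usr/bin/env bash
# speed_ratio: ratio of two mean times with first-order error propagation,
# treating the two measurements as independent.
# Arguments: fast_mean fast_sd slow_mean slow_sd (same unit for all four).
speed_ratio() {
    awk -v fm="$1" -v fs="$2" -v sm="$3" -v ss="$4" 'BEGIN {
        ratio = sm / fm
        err = ratio * sqrt((fs / fm) ^ 2 + (ss / sm) ^ 2)
        printf "%.2f %.2f\n", ratio, err
    }'
}

# Hypothetical inputs: 100 ms +/- 5 ms versus 200 ms +/- 10 ms.
speed_ratio 100 5 200 10
```

The two relative errors (5% each here) combine in quadrature, so a noisy measurement of either command widens the uncertainty of the ratio.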

Warm-Up Runs

Filesystem caching heavily affects benchmark results. Use warm-up runs to measure cache-hot performance, or a --prepare command that drops caches to measure cache-cold performance:

# 3 warm-up runs before measurements
hyperfine -w 3 "rg 'function' src/"

# Cold-cache alternative: --prepare runs before every timing run
# Here it drops the Linux page cache (requires root)
hyperfine --prepare 'sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' \
          "rg 'function' src/"

Parameterized Benchmarks

Test multiple parameter values in a single run:

# Test with different parameters
hyperfine --parameter-list size 10,100,1000 \
    "python3 -c \"import time; time.sleep({size}/1000)\""

# With multiple parameters
hyperfine --parameter-list tool "rg,grep -r,ag" \
          --parameter-list dir src,lib \
          "{tool} 'function' {dir}/"

# Export results with parameters
hyperfine --parameter-list num 1,2,4,8 \
          --export-json results.json \
          "python3 worker.py --threads {num}"

Expected parameterized output:

Benchmark 1: python3 -c "import time; time.sleep(10/1000)"
  Time (mean ± σ):      10.1 ms ±   0.3 ms

Benchmark 2: python3 -c "import time; time.sleep(100/1000)"
  Time (mean ± σ):     100.2 ms ±   0.5 ms

Benchmark 3: python3 -c "import time; time.sleep(1000/1000)"
  Time (mean ± σ):      1.001 s ±  0.002 s
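
When several --parameter-list options are given, hyperfine benchmarks every combination of their values. The sketch below shows that expansion in plain bash, assuming simple text substitution of {tool} and {dir}. expand_template is a hypothetical helper, and the order in which hyperfine itself iterates over the combinations may differ.

```shell
#!/usr/bin/env bash
# expand_template: print one command per combination of tool and dir values.
# Arguments: template, comma-separated tools, comma-separated dirs.
expand_template() {
    local template=$1 tools=$2 dirs=$3
    local tool dir cmd
    local -a tool_list dir_list
    # Split on commas only, so a value such as "grep -r" keeps its space.
    IFS=',' read -r -a tool_list <<< "$tools"
    IFS=',' read -r -a dir_list <<< "$dirs"
    for tool in "${tool_list[@]}"; do
        for dir in "${dir_list[@]}"; do
            cmd=${template//\{tool\}/$tool}
            cmd=${cmd//\{dir\}/$dir}
            printf '%s\n' "$cmd"
        done
    done
}

# Three tools and two directories give six commands.
expand_template "{tool} 'function' {dir}/" "rg,grep -r,ag" "src,lib"
```

Because the number of benchmarks is the product of the list lengths, total runtime grows quickly as parameters are added.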

Exporting Results

# Export to JSON
hyperfine --export-json benchmark.json \
    "rg 'TODO' src/" "grep -r 'TODO' src/"

# Export to Markdown
hyperfine --export-markdown benchmark.md \
    "rg 'TODO' src/" "grep -r 'TODO' src/"

# Export to CSV
hyperfine --export-csv benchmark.csv \
    "rg 'TODO' src/" "grep -r 'TODO' src/"

JSON Output Structure

{
  "results": [
    {
      "command": "rg 'TODO' src/",
      "mean": 0.0231,
      "stddev": 0.0012,
      "median": 0.0228,
      "user": 0.0185,
      "system": 0.0042,
      "min": 0.0215,
      "max": 0.0278,
      "times": [0.0215, 0.0220, ...],
      "exit_codes": [0, 0, ...]
    }
  ]
}
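
A short jq filter turns this structure into a table. The sketch below assumes jq is installed and that the file contains a top-level results array with command, mean, and stddev keys as in the structure above. summarize_results is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# summarize_results: print command, mean, and stddev (seconds) as
# tab-separated columns, one line per benchmark.
# Assumes jq is installed; the argument is a file from --export-json.
summarize_results() {
    jq -r '.results[] | [.command, .mean, .stddev] | @tsv' "$1"
}
```

Calling summarize_results benchmark.json prints one line per benchmarked command, which can be piped into sort or column for a quick report.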

Advanced Options

# Override shell
hyperfine --shell bash "echo hello"
hyperfine --shell fish "echo hello"

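# Cap each run with timeout from coreutils (hyperfine has no per-run time
# limit; a timed-out run exits non-zero, hence --ignore-failure)
hyperfine --ignore-failure "timeout 30 slow-command"

# Bound the number of runs for slow commands
hyperfine --min-runs 5 --max-runs 20 "slow-command"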

# Style output
hyperfine --style color "command"
hyperfine --style basic "command"
hyperfine --style nocolor "command"

# Ignore failures (non-zero exit codes)
hyperfine --ignore-failure "command-that-may-fail"

Real-World Examples

Comparing Compression Tools

hyperfine --prepare "rm -f test.tar.gz test.tar.zst test.tar.xz" \
          --parameter-list compressor gzip,zstd,xz \
          "{compressor} -k test.tar"

Testing Shell Startup Time

hyperfine --runs 30 \
          "zsh -i -c exit" \
          "bash -i -c exit" \
          "fish -i -c exit"

# Compare with and without startup files (-f skips .zshrc, and with it Oh My Zsh)
hyperfine "zsh -i -c exit" "zsh -f -i -c exit"

Benchmarking File Search

hyperfine --warmup 5 \
          "fd -e js -t f 'test' src/" \
          "find src/ -name '*test*.js' -type f"

# With a parameterized search directory
hyperfine --parameter-list path src,lib,tests \
          "fd -e js . {path}/"

Database Query Benchmark

hyperfine --prepare "psql -c 'SELECT 1'" \
          --runs 20 \
          "psql -c 'SELECT COUNT(*) FROM users'"

Build Time Comparison

hyperfine --prepare "cargo clean" \
          --runs 3 \
          "cargo build" \
          "cargo build --release"

Integration with CI/CD

#!/bin/bash
# ci-benchmark.sh — Run on every PR

# Benchmark the current build vs main
hyperfine --export-json bench.json \
    "./build/new-binary" \
    "./build/old-binary"

# Parse JSON and check for regression
MEAN_NEW=$(jq '.results[0].mean' bench.json)
MEAN_OLD=$(jq '.results[1].mean' bench.json)

if (( $(echo "$MEAN_NEW > $MEAN_OLD * 1.05" | bc -l) )); then
    echo "Performance regression detected!"
    echo "New: ${MEAN_NEW}s, Old: ${MEAN_OLD}s"
    exit 1
fi
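
jq may print very small means in exponent notation (for example 2.5e-05), which bc does not parse. The sketch below is one possible alternative for the comparison step, using awk, which accepts both notations. is_regression and its default factor of 1.05 are hypothetical and mirror the 5% threshold used in the script.

```shell
#!/usr/bin/env bash
# is_regression: succeed (exit 0) when new_mean exceeds old_mean by more
# than the allowed factor. awk parses plain and exponent notation alike.
# Arguments: new_mean old_mean [factor], factor defaults to 1.05.
is_regression() {
    awk -v new="$1" -v old="$2" -v factor="${3:-1.05}" \
        'BEGIN { exit !((new + 0) > (old + 0) * (factor + 0)) }'
}

# Made-up means in seconds: 0.0250 is more than 5% above 0.0231.
if is_regression 0.0250 0.0231; then
    echo "regression"
else
    echo "ok"
fi
```

In the CI script, the same function could be called as is_regression "$MEAN_NEW" "$MEAN_OLD" in place of the bc expression.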

Common Errors

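1. Warnings About Outliers or Inaccurate Results

hyperfine prints a warning when it detects statistical outliers among the timing runs, when the first run is much slower than the rest, or when the command finishes in less than about 5 ms. Re-run on an idle system, add --warmup or --prepare to stabilize the cache state, and increase the run count with -r. Closing background processes and avoiding thermal throttling also reduce variance.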

2. Command With Pipes or Redirection Fails

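Pass the whole pipeline as a single quoted argument so that the shell hyperfine starts, not your interactive shell, interprets the pipe or redirection. Single quotes around the outer command also keep your shell from expanding it first: hyperfine 'rg "pattern" src/ | head -5'.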

3. Out-of-Memory During Benchmark

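Some commands use more memory than expected, and hyperfine does not limit memory itself. Cap the command from inside the benchmarked shell, for example hyperfine 'ulimit -v 4000000; command' (the limit is in kilobytes in most shells), or test with a smaller input first.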

4. --prepare Conflicts With Benchmarked Command

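The --prepare command runs before every timing run and is not included in the measured time. Make sure it leaves the system in the state the benchmarked command expects. Use --setup for work that only needs to happen once per benchmarked command and --cleanup for teardown.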

5. Export File Overwritten

--export-json, --export-markdown, and --export-csv overwrite existing files without warning. Use unique filenames or timestamp them.
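
One way to avoid overwriting earlier exports is to derive the file name from the current time. A minimal sketch, assuming a second-resolution timestamp is unique enough; export_name and the bench prefix are hypothetical.

```shell
#!/usr/bin/env bash
# export_name: build a file name such as bench-20260624T101500.json
# from a prefix (default: bench) and the current local time.
export_name() {
    printf '%s-%s.json\n' "${1:-bench}" "$(date +%Y%m%dT%H%M%S)"
}

export_name bench
```

The result can be passed straight to hyperfine, as in hyperfine --export-json "$(export_name bench)" "command".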

6. Parameterized Results Hard to Parse

Use --export-json and process the file with jq. Each result entry includes a parameters object with the values used for that benchmark.

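7. Very Slow Commands Not Completing

hyperfine has no built-in per-run timeout and waits for every run to finish. If a command takes minutes, lower the run count with --runs 3 (or --max-runs), or wrap the command in timeout from coreutils and add --ignore-failure so that a timed-out run does not abort the benchmark.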
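
A per-run cap can be imposed from outside hyperfine with timeout from GNU coreutils. The sketch below shows how a timed-out run is reported, which is why such runs count as failures unless --ignore-failure is given. run_capped is a hypothetical helper, and GNU timeout is assumed to be installed.

```shell
#!/usr/bin/env bash
# run_capped: run a command under coreutils timeout and report the outcome.
# Exit status 124 is what timeout returns when the limit was hit.
# Arguments: limit in seconds, then the command and its arguments.
run_capped() {
    local limit=$1 rc=0
    shift
    timeout "$limit" "$@" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "timed out"
    else
        echo "exit $rc"
    fi
}

run_capped 1 sleep 5    # the limit is hit after one second
run_capped 5 true       # finishes well within the limit
```

With hyperfine the equivalent is hyperfine --ignore-failure "timeout 30 slow-command". Runs that hit the limit are still included in the statistics, so the reported mean is then a lower bound rather than a true measurement.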

Practice Questions

1. How is hyperfine different from the time command? time runs the command once. hyperfine runs it multiple times (at least 10 by default), provides statistical analysis (mean, stddev, min, max), supports warm-up runs, and supports parameterized comparisons.

2. What does the -w flag do? -w specifies warm-up runs — runs that are not counted in the results but warm up the filesystem cache.

3. How do you compare the performance of two commands? Pass both commands as arguments: hyperfine "rg pattern" "grep -r pattern". hyperfine runs both and shows the speed ratio.

4. How do you export benchmark results in JSON format? hyperfine --export-json results.json "command" — the JSON contains mean, stddev, min, max, and raw times for each command.

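5. When does hyperfine warn that results may be unreliable? When it detects statistical outliers among the runs, when the first run is much slower than the rest, or when the command finishes in less than about 5 ms. Re-run on an idle system, use --warmup or --prepare for a consistent cache state, and increase the run count.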

Challenge: Create a benchmark script that: (1) compares ripgrep vs grep searching for a pattern in a large codebase with and without warm caches, (2) compares Node.js vs Python vs Rust for a simple CPU-bound computation (prime number calculation), (3) compares gzip vs zstd vs xz compression ratio and speed on a 1GB tar file, (4) exports all results to a Markdown report with parameterized benchmarks showing how performance scales with input size.

Can hyperfine benchmark GUI applications?

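hyperfine is designed for CLI commands that exit on their own. GUI applications usually keep running until closed, so a benchmark of one would never finish. Measure them only through a wrapper that starts the application and terminates it, and treat the numbers with care.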

Does hyperfine work in Docker containers?

Yes — hyperfine works in any Linux, macOS, or WSL environment. Install it in your Docker image and use it as usual.

How many runs should I use?

Start with 10 (default). For low-variance commands (consistent runtime), 10 is enough. For noisy benchmarks, use 30-100 runs.

Can I abort a running benchmark?

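Press Ctrl+C. hyperfine exits without reporting statistics for the benchmark in progress. When several commands are benchmarked with an export option, recent versions write the export file after each completed benchmark, so results for commands that already finished should be preserved.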

What is the difference between `--prepare` and `--warmup`?

--prepare runs a separate command before every timing run (e.g., clearing caches), and that command is not timed. --warmup runs the benchmarked command itself a specified number of times before measurement starts.

What's Next

Difftastic — Structural Diff Tool

procs & bottom — Modern ps & top

ripgrep & fd — Modern File Search

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro. Updated 2026-06-24.
