Usability Testing — Methods, Metrics & Tools Guide

DodaTech, updated 2026-06-24

Usability testing evaluates how easily real users can accomplish tasks with your product by observing them in action and identifying friction points. In this guide, you will learn moderated and unmoderated testing methods, key metrics like the System Usability Scale (SUS) and task success rate, and how to run remote usability sessions that produce actionable insights. The Doda Browser team runs a biweekly usability study with five participants per cycle, consistently reducing task completion time by 30% across releases.

Learning Path

flowchart LR
  A[Exploratory Testing] --> B["Usability Testing<br/>You are here"]
  B --> C[Session Recording & Analysis]
  C --> D[A/B Testing]
  D --> E[Conversion Rate Optimization]
  style B fill:#f90,color:#fff

Usability Testing Methods

Method	Description	Best For
Moderated In-Person	Facilitator guides the session	Rich qualitative feedback
Moderated Remote	Video call with screen sharing	Distributed participants
Unmoderated Remote	Participants complete tasks alone	Large sample sizes
Guerrilla Testing	Quick tests in public spaces	Fast, cheap feedback
Hallway Testing	Random colleagues test your product	Early-stage validation

Key Usability Metrics

Task Success Rate

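The percentage of task attempts that end in success. The example below scores one session across five tasks; the same calculation applies to a single task attempted by several participants: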

const tasks = [
  { id: 1, name: "Create account", success: true, time: 45 },
  { id: 2, name: "Reset password", success: false, time: 120 },
  { id: 3, name: "Upload file", success: true, time: 30 },
  { id: 4, name: "Share document", success: true, time: 55 },
  { id: 5, name: "Change settings", success: false, time: 90 },
];

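// Share of attempts marked successful, as a percentage.
const successRate = (tasks) => {
  if (tasks.length === 0) return 0; // avoid dividing by zero on an empty run
  const completed = tasks.filter((t) => t.success).length;
  return (completed / tasks.length) * 100;
};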

console.log(`Task success rate: ${successRate(tasks)}%`);

Expected output:

Task success rate: 60%
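With only five attempts, a point estimate like 60% hides a lot of uncertainty. The following is a minimal sketch, assuming the adjusted Wald (Agresti-Coull) interval as the small-sample method. The function name `adjusted_wald_interval` and the default z-value of 1.96 (roughly 95% confidence) are choices made for this illustration, not part of any tool named in this guide.

```python
import math

def adjusted_wald_interval(successes, attempts, z=1.96):
    """Return (low, high) bounds for a success proportion.

    Uses the adjusted Wald (Agresti-Coull) interval, which behaves
    better than the plain Wald interval on small samples.
    z=1.96 corresponds to roughly 95% confidence (an assumption here).
    """
    if attempts <= 0 or not 0 <= successes <= attempts:
        raise ValueError("need attempts > 0 and 0 <= successes <= attempts")
    n_adj = attempts + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid probability range.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)

# 3 successes out of 5 attempts, matching the sample data above.
low, high = adjusted_wald_interval(successes=3, attempts=5)
print(f"Observed: 60%, plausible range: {low * 100:.0f}% to {high * 100:.0f}%")
```

The interval is wide at this sample size, which is a useful reminder to report success rates from small rounds as indicative rather than precise.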

System Usability Scale (SUS)

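SUS is a ten-item questionnaire, answered on a five-point agreement scale, that yields a single score from 0 to 100:

def calculate_sus(scores):
    # scores: ten responses on a 1-5 scale, in questionnaire order.
    # Odd-numbered items (indices 0, 2, ...) are positively worded: score - 1.
    odd_sum = sum(scores[i] - 1 for i in range(0, 10, 2))
    # Even-numbered items (indices 1, 3, ...) are negatively worded: 5 - score.
    even_sum = sum(5 - scores[i] for i in range(1, 10, 2))
    # Scale the 0-40 raw total to the 0-100 SUS range.
    return (odd_sum + even_sum) * 2.5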

responses = [4, 3, 5, 2, 4, 4, 5, 3, 4, 4]
print(f"SUS Score: {calculate_sus(responses):.1f}")

Expected output:

SUS Score: 65.0

A SUS score above 68 is above average, so this sample response (65.0) falls just short. Above 80 is excellent.
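A single questionnaire is rarely reported on its own; the study-level figure is usually the mean of the per-participant scores. The following is a minimal sketch, assuming hypothetical responses from three participants and the bands quoted in this guide (68 and 80). The names `sus_score` and `interpret` and the sample data are illustrative, not from a library.

```python
from statistics import mean

def sus_score(responses):
    """Score one ten-item SUS questionnaire (responses are 1-5)."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses between 1 and 5")
    # Items 1, 3, 5, ... contribute (response - 1);
    # items 2, 4, 6, ... contribute (5 - response).
    raw = sum(r - 1 if i % 2 == 0 else 5 - r for i, r in enumerate(responses))
    return raw * 2.5

def interpret(score):
    """Map a score to the bands quoted in this guide (68 and 80)."""
    if score > 80:
        return "excellent"
    if score > 68:
        return "above average"
    return "average or below"

# Hypothetical responses from three participants.
participants = [
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
scores = [sus_score(p) for p in participants]
study_score = mean(scores)
print(f"Per participant: {scores}")
print(f"Study SUS: {study_score:.1f} ({interpret(study_score)})")
```

Validating the input before scoring catches skipped or mistyped answers, which would otherwise shift the total without any visible error.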

Time on Task

Measure how long participants take to complete each task:

tasks_time = [45, 120, 30, 55, 90]
avg = sum(tasks_time) / len(tasks_time)
min_t = min(tasks_time)
max_t = max(tasks_time)
print(f"Avg: {avg:.0f}s, Min: {min_t}s, Max: {max_t}s")

Expected output:

Avg: 68s, Min: 30s, Max: 120s
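Task times tend to be right-skewed: one participant who gets stuck pulls the arithmetic mean upward. The following is a minimal sketch, assuming the median and the geometric mean as more robust summaries for small samples. It uses only Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later) and reuses the sample times above.

```python
from statistics import geometric_mean, mean, median

tasks_time = [45, 120, 30, 55, 90]  # seconds, same sample as above

avg = mean(tasks_time)            # sensitive to the 120s outlier
med = median(tasks_time)          # middle value of the sorted times
geo = geometric_mean(tasks_time)  # dampens the effect of long times

print(f"Mean: {avg:.0f}s, Median: {med}s, Geometric mean: {geo:.1f}s")
```

Both robust summaries land below the arithmetic mean for this sample, so they are less likely to exaggerate how long a typical participant needs.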

Running a Remote Usability Session

Preparation Checklist

□ Define 5–7 tasks covering critical user journeys
□ Recruit 5 participants matching your target audience
□ Prepare a consent form and task list
□ Set up recording (screen + audio + face)
□ Test your remote session tool (Zoom, Lookback, UserTesting)
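The five-participant figure in the checklist is commonly motivated by a simple problem-discovery model: if each participant independently uncovers a given issue with probability p, then n participants are expected to uncover a share 1 - (1 - p)^n of the issues. The following is a minimal sketch, assuming p = 0.31, a per-participant rate often quoted alongside this model. Your product's rate may differ, so treat the numbers as illustrative; `share_of_issues_found` is a hypothetical helper name.

```python
def share_of_issues_found(participants, p_single=0.31):
    """Expected share of issues found by `participants` testers.

    Assumes each tester independently finds any given issue with
    probability p_single (0.31 is an assumed, commonly quoted value).
    """
    if participants < 0 or not 0 <= p_single <= 1:
        raise ValueError("participants must be >= 0 and p_single in [0, 1]")
    return 1 - (1 - p_single) ** participants

for n in (1, 3, 5, 8):
    print(f"{n} participants: {share_of_issues_found(n) * 100:.0f}% of issues")
```

Under this assumption, each extra participant adds less than the one before, which is one argument for running several small rounds rather than a single large one.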

Session Script

## Moderator Script

**Introduction (2 min)**
"Thank you for joining. We are testing the app, not you.
There are no wrong answers. Please think aloud as you work."

**Warm-up Task (3 min)**
"Please navigate to the dashboard and tell me what you see."

**Core Tasks (30 min)**
1. "Create a new account using your email."
2. "Find the document titled 'Q3 Report' and share it with a colleague."
3. "Change your notification preferences to receive daily digests."
4. "Reset your password using the forgot password link."
5. "Export your data as a CSV file."

**Debrief (5 min)**
"What did you like most? What was frustrating?
Is there anything you would change?"

Analyzing Results

results = [
    {"task": "Create account", "success": True, "time": 45, "frustration": 2},
    {"task": "Find document", "success": True, "time": 30, "frustration": 1},
    {"task": "Share document", "success": False, "time": 120, "frustration": 5},
    {"task": "Change settings", "success": True, "time": 55, "frustration": 3},
    {"task": "Export data", "success": True, "time": 40, "frustration": 2},
]

high_frustration = [r for r in results if r["frustration"] >= 4]
print("Tasks needing redesign:")
for r in high_frustration:
    print(f"  - {r['task']} ({r['time']}s, frustration: {r['frustration']}/5)")

Expected output:

Tasks needing redesign:
  - Share document (120s, frustration: 5/5)
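A single frustration threshold can miss tasks that failed without much visible frustration. The following is a minimal sketch of one way to order the redesign backlog, assuming failed tasks come first, then higher frustration, then longer time. This ordering and the `priority_key` helper are illustrative choices, not a standard scoring scheme.

```python
results = [
    {"task": "Create account", "success": True, "time": 45, "frustration": 2},
    {"task": "Find document", "success": True, "time": 30, "frustration": 1},
    {"task": "Share document", "success": False, "time": 120, "frustration": 5},
    {"task": "Change settings", "success": True, "time": 55, "frustration": 3},
    {"task": "Export data", "success": True, "time": 40, "frustration": 2},
]

def priority_key(result):
    # False sorts before True, so failed tasks come first; the numeric
    # fields are negated so that larger values sort earlier.
    return (result["success"], -result["frustration"], -result["time"])

ranked = sorted(results, key=priority_key)
for rank, r in enumerate(ranked, start=1):
    status = "ok" if r["success"] else "FAILED"
    print(f"{rank}. {r['task']}: {status}, "
          f"frustration {r['frustration']}/5, {r['time']}s")
```

A ranked list keeps the lower-severity tasks visible, so they can be picked up once the top item is fixed.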

Common Usability Testing Mistakes

1. Testing with the Wrong Participants

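Friends and colleagues know too much about the product to stand in for real users, so keep them for quick hallway checks. For a full study, recruit participants from your target audience.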

2. Leading Questions

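A question such as "Don't you think this button is hard to find?" hands the participant the answer. Never lead; ask neutrally instead, for example "How would you go about doing this?"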

3. Too Many Tasks

Five tasks is the sweet spot. More than seven overwhelms participants.

4. Ignoring the Think-Aloud

If participants go silent, gently prompt: "What are you thinking right now?"

5. Defensive Reactions

When a participant struggles, do not defend the design. Say "Thank you, that is helpful feedback."

Practice Questions

1. What does SUS stand for and what does it measure?

System Usability Scale — a 10-item questionnaire that produces a single score from 0 to 100 representing perceived usability.

2. What is the difference between moderated and unmoderated testing?

Moderated testing has a live facilitator guiding the session. In unmoderated testing, participants complete the tasks on their own, with no facilitator present.

3. How many participants are recommended per usability test round?

Five participants per round is a widely used rule of thumb: a round of that size is estimated to surface approximately 85% of usability issues.

4. Why should you avoid leading questions during usability testing?

Leading questions bias the participant and produce unreliable feedback. Ask neutral, open-ended questions instead.

Challenge: Design and run a remote usability test for a file-sharing web app. Recruit five participants, prepare five tasks, record sessions, calculate SUS score and task success rate, and produce a report with three actionable recommendations.