Usability Testing — Methods, Metrics & Tools Guide

DodaTech Updated 2026-06-24 5 min read

Usability testing evaluates how easily real users can accomplish tasks with your product by observing them in action and identifying friction points. In this guide, you will learn moderated and unmoderated testing methods, key metrics like the System Usability Scale (SUS) and task success rate, and how to run remote usability sessions that produce actionable insights. The Doda Browser team runs a biweekly usability study with five participants per cycle, consistently reducing task completion time by 30% across releases.

Learning Path

flowchart LR
  A[Exploratory Testing] --> B[Usability Testing<br/>You are here]
  B --> C[Session Recording & Analysis]
  C --> D[A/B Testing]
  D --> E[Conversion Rate Optimization]
  style B fill:#f90,color:#fff

Usability Testing Methods

| Method | Description | Best For |
| --- | --- | --- |
| Moderated In-Person | Facilitator guides the session | Rich qualitative feedback |
| Moderated Remote | Video call with screen sharing | Distributed participants |
| Unmoderated Remote | Participants complete tasks alone | Large sample sizes |
| Guerrilla Testing | Quick tests in public spaces | Fast, cheap feedback |
| Hallway Testing | Random colleagues test your product | Early-stage validation |

Key Usability Metrics

Task Success Rate

The percentage of task attempts that participants complete successfully. The sample below covers one session of five tasks, with times in seconds:

const tasks = [
  { id: 1, name: "Create account", success: true, time: 45 },
  { id: 2, name: "Reset password", success: false, time: 120 },
  { id: 3, name: "Upload file", success: true, time: 30 },
  { id: 4, name: "Share document", success: true, time: 55 },
  { id: 5, name: "Change settings", success: false, time: 90 },
];

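// Share of task attempts that ended in success, as a percentage.
const successRate = (tasks) => {
  if (tasks.length === 0) return 0; // avoid dividing by zero on an empty session
  const completed = tasks.filter((t) => t.success).length;
  return (completed / tasks.length) * 100;
};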

console.log(`Task success rate: ${successRate(tasks)}%`);

Expected output:

Task success rate: 60%
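With only five observations, a raw percentage hides a lot of uncertainty. As a rough sketch in Python, the adjusted Wald (Agresti-Coull) interval puts a plausible range around the observed rate. The function name `adjusted_wald_interval` and the 95% confidence level are illustrative assumptions, not part of the example above.

```python
import math


def adjusted_wald_interval(successes, trials, z=1.96):
    """Approximate confidence interval for a success rate (Agresti-Coull).

    z=1.96 corresponds to roughly 95% confidence (an assumed default).
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    # Add z^2/2 pseudo-successes and z^2 pseudo-trials before computing the rate.
    n_adj = trials + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    margin = z * math.sqrt(p_adj * (1 - p_adj) / n_adj)
    # Clamp to the valid 0..1 range.
    return max(0.0, p_adj - margin), min(1.0, p_adj + margin)


# 3 successes out of 5 attempts, as in the sample data above.
low, high = adjusted_wald_interval(successes=3, trials=5)
print(f"Observed: 60%, plausible range: {low:.0%} to {high:.0%}")
```

For 3 of 5 the range runs from roughly 23% to 88%, which is a reminder that small-sample success rates point to problems rather than measure them precisely.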

System Usability Scale (SUS)

SUS is a ten-item questionnaire, with each item rated from 1 (strongly disagree) to 5 (strongly agree), that yields a single score from 0 to 100:

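def calculate_sus(scores):
    # scores holds the ten responses in questionnaire order, each from 1 to 5.
    # Odd-numbered items (indices 0, 2, ...) are positively worded: score - 1.
    odd_sum = sum(scores[i] - 1 for i in range(0, 10, 2))
    # Even-numbered items (indices 1, 3, ...) are negatively worded: 5 - score.
    even_sum = sum(5 - scores[i] for i in range(1, 10, 2))
    # The raw total ranges from 0 to 40; multiplying by 2.5 rescales it to 0-100.
    return (odd_sum + even_sum) * 2.5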

responses = [4, 3, 5, 2, 4, 4, 5, 3, 4, 4]
print(f"SUS Score: {calculate_sus(responses):.1f}")

Expected output:

SUS Score: 65.0

A SUS score above 68 is above average. Above 80 is excellent.
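A single questionnaire scores one participant; a study usually reports the mean across everyone in the round. The sketch below assumes three participants whose IDs and responses are hypothetical, and adds basic input validation. The names `sus_score` and `BENCHMARK` are illustrative.

```python
from statistics import mean

BENCHMARK = 68  # the "above average" threshold quoted in this guide


def sus_score(responses):
    """Score one completed SUS questionnaire (ten answers, each 1 to 5)."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly ten responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    odd_items = sum(responses[i] - 1 for i in range(0, 10, 2))
    even_items = sum(5 - responses[i] for i in range(1, 10, 2))
    return (odd_items + even_items) * 2.5


# Hypothetical answers for three participants (illustration only).
study = {
    "P1": [4, 3, 5, 2, 4, 4, 5, 3, 4, 4],
    "P2": [5, 2, 4, 1, 5, 2, 4, 2, 5, 1],
    "P3": [3, 3, 4, 2, 3, 3, 4, 3, 3, 2],
}

scores = {pid: sus_score(answers) for pid, answers in study.items()}
study_mean = mean(list(scores.values()))

for pid, score in scores.items():
    print(f"{pid}: {score:.1f}")
verdict = "above" if study_mean > BENCHMARK else "at or below"
print(f"Mean SUS: {study_mean:.1f} ({verdict} the {BENCHMARK} benchmark)")
```

Reporting individual scores next to the mean makes it easier to spot one outlier pulling a small round up or down.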

Time on Task

Measure how long participants take to complete each task:

tasks_time = [45, 120, 30, 55, 90]
avg = sum(tasks_time) / len(tasks_time)
min_t = min(tasks_time)
max_t = max(tasks_time)
print(f"Avg: {avg:.0f}s, Min: {min_t}s, Max: {max_t}s")

Expected output:

Avg: 68s, Min: 30s, Max: 120s
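Task times tend to be skewed by one or two slow attempts, so the arithmetic mean can overstate the typical experience. As a hedged alternative, the sketch below reports the median and geometric mean of the same sample with Python's standard `statistics` module (`geometric_mean` requires Python 3.8 or later). The variable names are illustrative.

```python
from statistics import geometric_mean, median

# Same sample as above, in seconds.
tasks_time = [45, 120, 30, 55, 90]

median_t = median(tasks_time)
geo_mean_t = geometric_mean(tasks_time)

print(f"Median: {median_t}s, Geometric mean: {geo_mean_t:.1f}s")
```

Both values land below the 68-second average (the median is 55 seconds and the geometric mean is about 60 seconds) because the 120-second attempt carries less weight.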

Running a Remote Usability Session

Preparation Checklist

□ Define 5–7 tasks covering critical user journeys
□ Recruit 5 participants matching your target audience
□ Prepare a consent form and task list
□ Set up recording (screen + audio + face)
□ Test your remote session tool (Zoom, Lookback, UserTesting)

Session Script

## Moderator Script

**Introduction (2 min)**
"Thank you for joining. We are testing the app, not you.
There are no wrong answers. Please think aloud as you work."

**Warm-up Task (3 min)**
"Please navigate to the dashboard and tell me what you see."

**Core Tasks (30 min)**
1. "Create a new account using your email."
2. "Find the document titled 'Q3 Report' and share it with a colleague."
3. "Change your notification preferences to receive daily digests."
4. "Reset your password using the forgot password link."
5. "Export your data as a CSV file."

**Debrief (5 min)**
"What did you like most? What was frustrating?
Is there anything you would change?"

Analyzing Results

results = [
    {"task": "Create account", "success": True, "time": 45, "frustration": 2},
    {"task": "Find document", "success": True, "time": 30, "frustration": 1},
    {"task": "Share document", "success": False, "time": 120, "frustration": 5},
    {"task": "Change settings", "success": True, "time": 55, "frustration": 3},
    {"task": "Export data", "success": True, "time": 40, "frustration": 2},
]

high_frustration = [r for r in results if r["frustration"] >= 4]
print("Tasks needing redesign:")
for r in high_frustration:
    print(f"  - {r['task']} ({r['time']}s, frustration: {r['frustration']}/5)")

Expected output:

Tasks needing redesign:
  - Share document (120s, frustration: 5/5)
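The frustration filter can be complemented by a short summary of the whole session. The sketch below reuses the same sample data and orders tasks so that failures come first, then higher frustration. That ordering rule is an assumption made for illustration, not an established severity scale.

```python
results = [
    {"task": "Create account", "success": True, "time": 45, "frustration": 2},
    {"task": "Find document", "success": True, "time": 30, "frustration": 1},
    {"task": "Share document", "success": False, "time": 120, "frustration": 5},
    {"task": "Change settings", "success": True, "time": 55, "frustration": 3},
    {"task": "Export data", "success": True, "time": 40, "frustration": 2},
]

# Overall figures for the session.
success_rate = 100 * sum(r["success"] for r in results) / len(results)
mean_time = sum(r["time"] for r in results) / len(results)

# Failed tasks sort first (False < True), then the most frustrating ones.
ranked = sorted(results, key=lambda r: (r["success"], -r["frustration"]))

print(f"Success rate: {success_rate:.0f}%, mean time: {mean_time:.0f}s")
print("Review order:")
for position, r in enumerate(ranked, start=1):
    status = "ok" if r["success"] else "FAILED"
    print(f"  {position}. {r['task']} ({status}, frustration {r['frustration']}/5)")
```

A ranked list like this gives the report a natural order for its recommendations, starting with the task that failed outright.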

Common Usability Testing Mistakes

1. Testing with the Wrong Participants

Friends and colleagues know too much about the product to behave like first-time users. Recruit people from your real target audience.

2. Leading Questions

"Don't you think this button is hard to find?" — Never lead the participant.

3. Too Many Tasks

Five tasks is the sweet spot. More than seven overwhelms participants.

4. Ignoring the Think-Aloud

If participants go silent, gently prompt: "What are you thinking right now?"

5. Defensive Reactions

When a participant struggles, do not defend the design. Say "Thank you, that is helpful feedback."

Practice Questions

1. What does SUS stand for and what does it measure?

System Usability Scale — a 10-item questionnaire that produces a single score from 0 to 100 representing perceived usability.

2. What is the difference between moderated and unmoderated testing?

Moderated has a live facilitator guiding the session. Unmoderated participants complete tasks independently without a facilitator.

3. How many participants are recommended per usability test round?

Five participants per round is the common rule of thumb. Jakob Nielsen's widely cited model estimates that five users uncover approximately 85% of usability issues.

4. Why should you avoid leading questions during usability testing?

Leading questions bias the participant and produce unreliable feedback. Ask neutral, open-ended questions instead.
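The 85% figure in question 3 comes from a simple discovery model: if each participant reveals a share L of the problems, then n participants reveal 1 - (1 - L)^n of them. The sketch below assumes L = 0.31, the average reported in Nielsen and Landauer's data. Your own product may have a very different rate, and the function name is illustrative.

```python
def share_of_problems_found(participants, detection_rate=0.31):
    """Expected share of usability problems seen by at least one participant.

    detection_rate is the assumed chance that a single participant
    encounters any given problem (0.31 is an often-quoted average).
    """
    if participants < 0:
        raise ValueError("participants must be non-negative")
    if not 0 <= detection_rate <= 1:
        raise ValueError("detection_rate must be between 0 and 1")
    return 1 - (1 - detection_rate) ** participants


for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} participants: {share_of_problems_found(n):.0%}")
```

Under this assumption five participants reach about 84%, and each additional participant adds less than the one before, which is the argument for running several small rounds instead of one large one.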

Challenge: Design and run a remote usability test for a file-sharing web app. Recruit five participants, prepare five tasks, record sessions, calculate SUS score and task success rate, and produce a report with three actionable recommendations.

FAQ

What is the difference between usability testing and user acceptance testing?

Usability testing evaluates ease of use. User acceptance testing (UAT) verifies the application meets business requirements.

How many participants do I need for statistically significant results?

Five participants per test round catch most issues, which is enough for qualitative findings. For statistically significant quantitative results, aim for 20+ participants per segment.

What tools can I use for remote usability testing?

Lookback, UserTesting, Maze, Hotjar, and simple Zoom recordings all work well.

How often should usability testing be conducted?

Ideally every sprint or biweekly. At minimum, test before every major release.

What's Next

A/B Testing — Statistical Significance Guide
Exploratory Testing — Session-Based Management
QA Metrics — Measuring Effectiveness

Built by the developers of Doda Browser, DodaZIP, and Durga Antivirus Pro.
