Research & Discovery · Foundational

Usability Testing

/ˌjuːzəˈbɪlɪti ˈtɛstɪŋ/ · noun

A research method where real users attempt tasks on a product to reveal usability issues.

Usability testing is the practice of observing real people as they attempt to complete specific tasks using your product, then analysing where they succeed, where they struggle, and why. It is the most direct method for answering the question every designer should be asking: “Can people actually use this thing?” The method comes in many forms — moderated or unmoderated, in-person or remote, on a finished product or a paper prototype — but the core mechanic is always the same: give a participant a realistic task, watch what happens, and learn from the gap between what you expected and what occurred.

The classic moderated session involves a facilitator, a participant, and an observer (or recording). The facilitator presents a scenario — “You need to change your delivery address before your order ships” — and then steps back, resisting the urge to guide or hint. The participant thinks aloud as they navigate the interface, narrating their expectations, confusions, and decisions. This think-aloud protocol is where the richest insights live. Analytics can tell you that 40% of users drop off at step three; usability testing tells you they drop off because the “Continue” button looks like a disabled label and they don’t recognise it as a signifier for the next action.

A persistent myth is that usability testing requires large sample sizes to be valid. Research by Nielsen and Landauer demonstrated that five participants typically uncover around 85% of usability problems. This makes usability testing remarkably accessible — you don’t need a dedicated lab, a large budget, or weeks of planning. What you need are clear task scenarios, representative participants, and the discipline to watch without intervening. Even a quick, informal test with three people will surface issues that no amount of internal review or heuristic evaluation would catch.
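The five-user figure falls out of the problem-discovery model Nielsen and Landauer fitted to their data: the proportion of problems found by n participants is 1 - (1 - λ)^n, where λ is the probability that a single participant encounters a given problem (roughly 0.31 across their studies). A minimal sketch of that curve, assuming the commonly cited λ:

```python
# Nielsen-Landauer problem-discovery model: the share of usability
# problems found by n participants is 1 - (1 - lam)^n, where lam is
# the chance that one participant hits a given problem. The 0.31
# default is the value commonly cited from their studies; it varies
# by product and task.

def problems_found(n: int, lam: float = 0.31) -> float:
    """Expected proportion of usability problems uncovered by n participants."""
    return 1 - (1 - lam) ** n

for n in range(1, 11):
    print(f"{n:2d} participants -> {problems_found(n):.0%} of problems")

# With lam = 0.31, five participants uncover about 84%, the basis of
# the "five users find ~85% of problems" rule of thumb. Returns
# diminish quickly: doubling to ten participants adds only ~13%.
```

The diminishing returns are also the usual argument for running several small rounds with fixes in between, rather than spending the same budget on one large study.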

The output of usability testing is not a statistical report. It’s a prioritised list of observed problems, each grounded in specific user behaviour, and each accompanied by a severity rating and a design recommendation. The best usability test reports include video clips of the critical moments — nothing convinces a sceptical stakeholder faster than watching a real user fail at a task the team assumed was straightforward.
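Report formats vary, but each finding generally pairs an observed behaviour with a severity rating, a count of affected participants, and a recommendation. A hypothetical sketch of one such structure — the field names and the 1-to-4 severity scale below are illustrative, not a standard:

```python
# One hypothetical way to structure usability-test findings so they
# can be sorted and triaged. The severity scale (1 = cosmetic,
# 4 = task-blocking) and all field names are illustrative.
from dataclasses import dataclass

@dataclass
class Finding:
    observation: str        # what participants actually did
    severity: int           # 1 = cosmetic ... 4 = task-blocking
    affected: int           # how many participants hit the problem
    recommendation: str     # proposed design change
    clip: str = ""          # pointer to the supporting video moment

findings = [
    Finding("Missed 'Continue'; read it as a disabled label", 4, 4,
            "Restyle as a primary button with a clear affordance", "clip_03.mp4"),
    Finding("Hesitated over 'Portfolio' vs 'Projects' labels", 2, 2,
            "Merge the two sections or rename for clarity"),
]

# Triage: most severe and most widespread problems first.
for f in sorted(findings, key=lambda f: (-f.severity, -f.affected)):
    print(f"[S{f.severity}] {f.observation} -> {f.recommendation}")
```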

Why it matters

Usability testing is the antidote to assumption-driven design. Every designer, no matter how experienced, carries biases: the curse of knowledge (you can’t unsee what you know about your own product), the false consensus effect (assuming others think the way you do), and the sunk-cost trap (defending decisions you’ve already invested in). Usability testing cuts through all of these by grounding design decisions in observed behaviour rather than opinion.

It also builds organisational empathy. When product managers, engineers, and executives watch usability sessions — either live or via highlight reels — their understanding of the user shifts from abstract persona to concrete person. This shared exposure to user struggle is one of the most powerful cultural tools a design team has. Teams that regularly witness users interacting with their product make different decisions than teams that rely on secondhand reports. They prioritise differently, debate differently, and ship differently.

In practice

  • Testing a wireframe before investing in visual design. An ed-tech team created a clickable wireframe prototype of a new course enrolment flow and tested it with seven students. Three participants couldn’t find the “Enrol” action because it was positioned below a long course description that required scrolling. The team restructured the layout to place the enrolment action in a persistent sidebar — a change that took minutes in the wireframe but would have required significant rework if caught after visual design and development.

  • Comparative testing of two navigation models. A content platform was debating between a sidebar navigation and a top-bar navigation for its new information architecture. Rather than arguing in a meeting, the team built both versions as prototypes and ran a between-subjects usability test. Users completed content-finding tasks measurably faster with the sidebar, and think-aloud data revealed that the top-bar approach conflicted with users’ mental models from similar platforms; a sketch of how such a comparison can be checked statistically follows this list. The data settled the debate in a day.

  • Iterative testing across a user flow. A banking team tested their loan application flow three times across six weeks. The first round revealed that users were confused by jargon in the eligibility step. The second round, after rewriting the copy, showed that users now understood the content but struggled with the document upload interaction. The third round confirmed the upload fix worked and identified one final issue with the confirmation page. Each round cost half a day; together, they transformed a flow with a 34% completion rate into one that hit 78%.
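Claims like “measurably faster” or a jump from 34% to 78% completion carry more weight with a quick significance check. A sketch with invented data, assuming scipy is available: a Mann-Whitney U test for the navigation times (which avoids assuming normally distributed times) and a two-proportion z-test for the completion rates.

```python
# Illustrative significance checks for the two examples above.
# All data here is made up; requires scipy (pip install scipy).
from math import sqrt
from scipy.stats import mannwhitneyu, norm

# Between-subjects navigation test: task-completion times in seconds.
sidebar = [38, 42, 35, 51, 40, 44, 37, 48]
topbar  = [55, 61, 49, 72, 58, 66, 53, 60]

u_stat, p_time = mannwhitneyu(sidebar, topbar, alternative="two-sided")
print(f"Navigation times: U = {u_stat:.1f}, p = {p_time:.4f}")

# Loan-flow completion: two-proportion z-test on before/after rates
# (assuming 100 applications observed in each period).
done_before, n_before = 34, 100   # 34% completion
done_after,  n_after  = 78, 100   # 78% completion
p1, p2 = done_before / n_before, done_after / n_after
pool = (done_before + done_after) / (n_before + n_after)
se = sqrt(pool * (1 - pool) * (1 / n_before + 1 / n_after))
z = (p2 - p1) / se
p_rate = 2 * norm.sf(abs(z))
print(f"Completion rates: z = {z:.2f}, p = {p_rate:.4g}")
```

With small usability samples these tests are coarse instruments; the qualitative think-aloud data usually explains the difference long before the p-value confirms it.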