"Thinking aloud may be the single most valuable usability engineering method." I wrote this in my 1993 book, Usability Engineering, and I stand by this assessment today. The fact that the same method has remained #1 for 19 years is a good indication of the longevity of usability methods.
Usability guidelines live for a long time; usability methods live even longer. Human behavior changes much more slowly than the technology we all find so fascinating, and the best approaches to studying this behavior hardly change at all.
Defining Thinking Aloud Testing
To define thinking aloud, I'll paraphrase what I said 19 years ago:
Definition: In a thinking aloud test, you ask test participants to use the system while continuously thinking out loud — that is, simply verbalizing their thoughts as they move through the user interface.
("Simply" ought to be in quotes, because it's not that simple for most people to keep up a running monologue. The test facilitator typically has to prompt users to keep them talking.)
To run a basic thinking aloud usability study, you need to do only 3 things:
Recruit representative users.
Give them representative tasks to perform.
Shut up and let the users do the talking.
The method has a host of advantages. Most important, it serves as a window on the soul, letting you discover what users really think about your design. In particular, you hear their misconceptions, which usually turn into actionable redesign recommendations: when users misinterpret design elements, you need to change them. Even better, you usually learn why users guess wrong about some parts of the UI and why they find others easy to use.
The thinking aloud method also offers the benefits of being:
Cheap. No special equipment is needed; you simply sit next to a user and take notes as he or she talks. It takes about a day to collect data from a handful of users, which is all that's needed for the most important insights.
Robust. Most people are poor facilitators and don't run the study exactly according to the proper methodology. But, unless you blatantly bias users by putting words into their mouths, you'll still get reasonably good findings, even from a poorly run study. In contrast, quantitative (statistical) usability studies are rife with methodology problems, and the smallest mistake can doom a study and make the findings directly misleading. Quant studies are also much more expensive.
Flexible. You can use the method at any stage in the development lifecycle, from early paper prototypes to fully implemented, running systems. Thinking aloud is particularly suited for Agile projects. You can use this method to evaluate any type of user interface with any form of technology (although it's a bit tricky to use thinking aloud with speech interfaces; see the report on How to Conduct Usability Evaluations for Accessibility for advice on testing with blind or low-vision users who rely on screen readers such as JAWS). Websites, software applications, intranets, consumer products, enterprise software, mobile design: it doesn't matter. Thinking aloud addresses them all, because we rely on the users doing the thinking.
Convincing. The most hard-boiled developers, arrogant designers, and tight-fisted executives usually soften up when they get direct exposure to how customers think about their work. Getting the rest of your team (and management) to sit in on a few thinking-aloud sessions doesn't take a lot of their time and is the best way to motivate them to pay attention to usability. (For more on how to motivate teams to deliver superior user experiences, see the UX Basic Training course.)
Easy to learn. We teach the basics in a day and provide thorough training in a 3-day course. Of course, this doesn't cover all the twists and advanced modifications needed to hang out your shingle as a usability consultant, but the point is that you don't need these extras to run basic tests for your own design team.
Being cheap and robust are huge upsides of qualitative methods such as thinking aloud. But the flip side is that the method doesn't lend itself to detailed statistics, unless you run a huge, expensive study. You can certainly do this; I simply don't recommend it for the vast majority of projects. Better to conserve your budget and invest in more design iterations. The method has other downsides as well:
Unnatural situation. Unless they're a bit weird, most people don't sit and talk to themselves all day. This makes it hard for test participants to keep up the required monologue. Luckily, users are typically quite willing to try their best, and they quickly become so engaged in the test tasks that they all but forget that they're in a study.
Filtered statements (vs. brain dump). Users are supposed to say things as soon as they come to mind rather than reflect on their experience and provide an edited commentary after the fact. However, most people want to appear smart, and thus there's a risk that they won't speak until they've thought through the situation in detail. Don't fall into this trap: it's essential to get the user's raw stream of thought. Typically, you have to prompt users to keep them talking.
Biasing user behavior. Prompts and clarifying questions are usually necessary, but from an untrained facilitator, such interruptions can very easily change user behavior. In such cases, the resulting behavior doesn't represent real use, so you can't base design decisions on the outcome. At the very least, try to identify those cases where you've biased the user so you can discard that small part of the study. (It's worse when you don't know that you've done something wrong, because then you risk giving the design team bad advice.)
No panacea. Thinking aloud isn't the only usability tool you'll ever need, but that's not a true downside as long as you're willing to use other methods from time to time. Thinking aloud serves many purposes, but not all purposes. Once you have a few years' experience with usability, you'll want to use a wider range of user research methods.
Don't let the downsides get you down. If you haven't tried it before, go run a quick thinking aloud study on your current design project right now. Because these simplified studies are so cheap, weekly user testing is completely feasible — so if you make a few mistakes the first time, you can always correct them next week.