Summary: It's easy to bias study participants, whether in user testing or in card sorting, if they focus on matching stimulus words instead of working on the underlying problem.
A classic way to ruin usability tests is to give users problems that include the actual command names or navigation labels they're supposed to use.
For example, if you want to test whether people can find and use Excel's "Remove Duplicates" feature, you should not tell them: "You have a list of companies that have previously purchased your product, but some company names appear multiple times. Remove these duplicates." Given this task wording, users will often scan the UI for a label containing the words "remove" and/or "duplicates." Thus, you're not testing whether the label effectively communicates the command's functionality, nor are you testing the communicative benefits of combining the command name with its corresponding icon and tooltip. You're simply testing whether users can match up the terms.
(I've actually seen an even worse task description, created by third-tier usability folks: "Use the Remove Duplicates command to remove the extra copies of each name." When you tell people what features to use, you'll never get them to approach the software naturally. They'll just do as they're told, not what they would normally do.)
A good "Remove Duplicates" task description is: "You have a list of companies that have previously purchased your product, but some company names appear multiple times. Change the spreadsheet so that each company name appears only once." Now your test participants know their goal, and it's presented in a scenario that makes sense — but they can't solve the problem by simply scanning for keywords. Instead, they must search for commands that might help them accomplish the task. This is a much better test of whether the application's user interface supports user goals.
(You should also test broader tasks that can't be solved by a single command. See our full-day course on User Testing for more information on how to write good test tasks and our 2-day course on Application Usability for guidelines on making features discoverable and understandable.)
Keyword Matching in Card Sorting
The keyword-matching problem can also mess up other user research methods, such as card sorting.
A good example comes from our recent project to improve usability on a client's health information site. The site's goal is to offer information about various related diseases and how to deal with them. To further complicate matters, the site has information for both the general public and professionals. So, a key challenge was to determine which organizing principle would work best as the top-level structure of the information architecture (IA).
Card sorting is often a good way to get initial insights into users' mental model of an information space, and in our project it did indeed generate a good starting point for the IA. After the card-sorting study, we conducted several rounds of user testing of wireframes, further refining the structure and how the site presented it. All of this effort would have been wasted if we'd gotten data on users' keyword-matching skills rather than on how they approach the site's target healthcare issues.
To preserve client confidentiality, I'll show examples of the problems and solutions transposed to another domain — say, agriculture.
In the (fictional) case of our agriculture site, we cover different crops, such as strawberries, raspberries, corn, and wheat. We also cover different activities, such as planting, growing, and harvesting. Finally, we target our content to both professional farmers and people who grow a few plants in their backyard.
So, one option would be to organize the site primarily by crop, and secondarily by activity. Let's call this IA #1:
- Strawberries: planting, growing, harvesting
- Raspberries: planting, growing, harvesting
- Corn: planting, growing, harvesting
- Wheat: planting, growing, harvesting
Alternatively, we could organize the site primarily by activity, and secondarily by crop. Let's call this IA #2:
- Planting: strawberries, raspberries, corn, wheat
- Growing: strawberries, raspberries, corn, wheat
- Harvesting: strawberries, raspberries, corn, wheat
Let's say that we did a card-sorting study to help us determine the initial IA for our first wireframe. Here are two possible sets of cards:
| Card Set A | Card Set B |
|------------|------------|
| Strawberry Planting | Planting Strawberries |
| Strawberry Growing | Growing Strawberries |
| Strawberry Harvesting | Harvesting Strawberries |
| Wheat Planting | Planting Wheat |
| Wheat Growing | Growing Wheat |
| Wheat Harvesting | Harvesting Wheat |
Clearly, giving either card set to users would severely bias the sorting exercise. Given Set A, most users would generate IA #1, sorting all "Strawberry" cards together, all "Wheat" cards together, and so on. Likewise, Set B would typically result in IA #2, as most users would sort cards by activity ("planting," "growing," etc.).
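The bias can be sketched as a toy simulation (a hypothetical illustration, not part of the actual study): a participant who groups cards purely by their leading keyword, without thinking about the underlying concepts, reproduces IA #1 from Set A and IA #2 from Set B.

```python
# Toy simulation of keyword-matching behavior in a card sort.
# Assumption (for illustration only): a biased participant groups
# cards by their first word instead of by the underlying concept.

CARD_SET_A = [
    "Strawberry Planting", "Strawberry Growing", "Strawberry Harvesting",
    "Wheat Planting", "Wheat Growing", "Wheat Harvesting",
]
CARD_SET_B = [
    "Planting Strawberries", "Growing Strawberries", "Harvesting Strawberries",
    "Planting Wheat", "Growing Wheat", "Harvesting Wheat",
]

def keyword_sort(cards):
    """Group cards by their first word -- pure keyword matching."""
    groups = {}
    for card in cards:
        groups.setdefault(card.split()[0], []).append(card)
    return groups

# Set A collapses into crop-based piles (IA #1) ...
print(sorted(keyword_sort(CARD_SET_A)))   # ['Strawberry', 'Wheat']
# ... while Set B collapses into activity-based piles (IA #2).
print(sorted(keyword_sort(CARD_SET_B)))   # ['Growing', 'Harvesting', 'Planting']
```

Either way, the "result" is dictated by the card wording, not by the participants' mental models.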
Solution: Synonyms and Non-Parallel Structures
To get a better outcome, our card-sorting study has to make users work harder and really think about how they'd approach the concepts written on the cards.
Obviously, this is the opposite of our goals in usability, where we typically want to make tasks easier and reduce the users' cognitive load. But remember: card sorting isn't user interface design; it's a knowledge-elicitation exercise to discover users' mental models. So it's okay to reduce the usability of the cards, because people don't actually use them in the real UI.
(Of course, you can't reduce the cards' usability so much that test participants don't understand them at all, because then their sorting will be misleading.)
One way to avoid having participants simply match up keywords is to use different words for a single concept — that is, introduce synonyms. For example, instead of saying "harvesting strawberries," we could say "picking strawberries."
To further mix it up, we could have a "picking Fragaria" card, using the scientific name for the strawberry genus. I wouldn't actually do this unless the site were targeted at professional botanists, but in the healthcare example (our actual client project), we could use both common names and medical names for the same condition on different cards, and most users would still understand them.
Another way to ruin users' keyword-matching ability is to employ non-parallel exposition structures. For example, one card could read "Planting Corn" whereas another could read "Wheat Planting."
Employing such tactics violates one of the traditional Web writing guidelines, which says that parallel structures are faster to scan and easier to compare and contrast, and are thus better for presenting lists of items. Again, "bad design" is okay, because we're not targeting optimal usability of the cards. We want users to stop in their tracks and think, rather than simply finish the task quickly.
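In practical terms, the two tactics amount to randomizing both the vocabulary and the word order when preparing card labels. Here's a minimal sketch, assuming hypothetical synonym lists and function names (none of this comes from the actual study):

```python
import random

# Hypothetical card-label generator applying the two de-biasing tactics:
# synonyms and non-parallel structures. The synonym lists below are
# illustrative assumptions, not data from the actual study.

SYNONYMS = {
    "planting": ["planting", "sowing"],
    "growing": ["growing", "cultivating"],
    "harvesting": ["harvesting", "picking"],
}

# (singular -> plural) forms so either word order reads naturally
CROPS = {"strawberry": "strawberries", "wheat": "wheat"}

def card_label(crop, activity, rng):
    verb = rng.choice(SYNONYMS[activity])   # tactic 1: synonyms
    if rng.random() < 0.5:                  # tactic 2: non-parallel structures
        # verb-first structure, e.g. "Picking Strawberries"
        return f"{verb.title()} {CROPS[crop].title()}"
    # noun-first structure, e.g. "Strawberry Picking"
    return f"{crop.title()} {verb.title()}"

def make_cards(seed=42):
    rng = random.Random(seed)  # fixed seed makes a pilot deck reproducible
    return [card_label(c, a, rng) for c in CROPS for a in SYNONYMS]

for label in make_cards():
    print(label)
```

The point isn't the code itself but the principle it encodes: no two cards share a predictable surface pattern, so participants must sort by meaning.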
Avoiding Study Bias
I usually say user testing is easy: basically, you get some real customers and watch them use your site or app. But this article touches on one of the difficulties of running great studies: minimizing bias. To achieve this, you have to see how people behave on their own rather than impose your own thinking on them; otherwise, they simply echo it back, and you don't learn how to improve your design for real-life use.
From many years of teaching usability methods, I've learned that two of usability's biggest challenges are to write good test tasks and to facilitate test sessions in a neutral manner. As our (sanitized) case study here shows, it can also be hard to avoid biasing users in card-sorting research.
The beauty of usability is that the methods are so robust that they generate useful findings even if you use them wrong. This is particularly true for user testing: any time you watch customers, you'll learn something to increase your website's profitability. But, if you do usability right, you'll learn even more. And once you're aware of the potential for biasing users, you can reduce the bias and thus increase your research's value.