Jakob Nielsen's
Alertbox, December 12, 1999:
Voodoo Usability
The good news is that usability has been recognized as an important element
of Internet success: the average speaker at industry conferences now
promotes good user experience in preference to "cool sites."
The bad news is that most sites employ horribly misguided methodologies
that do not assess real usability. Sometimes the methods are simply
worthless; other times they are directly misleading.
Studying Opinion Instead of Use
Traditional market research methods don't work for the Web. The basic
problem is that one cannot ask users what they want and expect the answer
to have any relation to their actual behavior when they go online.
Focus groups can often be directly misleading. When people
sit around a table and discuss what they might like to see on a site, they
will often focus on superficial aspects and praise fancy features like
animation and Flash effects. But if these same users were ever asked to
actually use the site to accomplish a task, they would usually ignore the
animations and would find that the Flash effects hurt them more than they
helped them.
Self-reported data is extremely weak and three levels removed from the
truth:
- Users tell you what they think you want to hear or what they think is a
socially preferred answer (especially when they are part of a group)
- Users tell you what they remember believing that they did (but memory
is highly fallible, especially regarding the specifics of interaction
behavior)
- Users can only report what they believe they did, not what they
actually did: people always rationalize their behavior when thinking
about it after the fact, and they don't even notice many of the things
they do
Surveys are just opinion polls: a weak method, even though
survey results are reported all too frequently in the trade press.
Just as with focus groups, you get results
three levels removed from the truth. For example, an often-quoted
survey by Zona Research found that 28% of respondents reported finding it
somewhat or extremely difficult to locate products on the Web. I have even
quoted this survey myself,
even though I should know better than to quote the outcome of an
opinion poll. The truth is that observations of actual, real-life user
behavior show that people find the product they are looking for less than
half of the time. Several usability studies have independently confirmed the same
result: on average, users cannot find what they are looking for on
today's Web.
Why this paradox? More than 50% of the time, people can't
find what they are looking for, and yet only 28% of
respondents report problems. In all likelihood, close to 100% of the people
who were polled had encountered a case where they could not find a product
on a website that did sell it. But they may have assumed that the site
didn't carry the product or they may have blamed themselves for not
searching well enough or thoroughly enough. Or they may have found the
product on another site (causing the first site to lose the sale), after
which they thought of their Web experience as having been successful,
even though they actually failed the first time they tried to find the
product. But all they remember is that they did find it in the end.
In a particularly misleading type of survey, a panel of
users is asked to check out a website and fill in a questionnaire with
their opinions of the site. The three methodological hazards in this
method are:
- the users are members of a panel of people who have signed up to be
professional opinion-givers in return for money; they are not
representative of your customers who almost certainly would not have time
to spend on such activities (unless your site is targeting students or the
unemployed)
- being asked to check something out is completely different than having
to use it to accomplish a real task; this is equally true whether the task
is work-related (book an airline ticket for my boss to attend a meeting in
London) or leisure-related (buy a cheap vacation in a city I like in
Europe)
- self-reported behavior and opinions have very little relation to real
behavior and real usability problems
This said, short surveys are still good for simple questions like "why are
you visiting our site" that relate to users' opinions instead of assessing
the design.
Automated Methods Cannot Work
Another category of voodoo services sics a computer program on a site and
produces an automated report that is claimed to measure the site's usability.
Having a computer follow links and count the number of clicks is a very poor
substitute for observing whether users can actually find what they are looking for.
Real usability depends on which link you click on and how fast you discover
the error of your ways if you clicked on the wrong one. This cannot be
assessed by computer. A program can count the time needed to follow the optimal
path to the solution, but that's not how the average user behaves.
One wrong word in a menu, and the user is lost for five minutes - or forever.
Simple things like counting clicks to solutions are misleading. For
example, I recently advised on an ecommerce site where people had to find
certain products. The original design provided product pages in 3 clicks
from the home page, and the revised design required one more click. Yet,
shopping success was 7 times higher in the revised design because each of
the new steps was completely intuitive. Even with one more click, the
revised design was faster because users didn't have to spend as much time
thinking about where to click. More importantly, it made people find
the right product much more frequently, whereas the original design was
very error prone. Despite this usability finding, automated assessment would
have given a higher rating to the original design. Whether or not the choices
make sense is the one thing a program can't check.
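To make the contrast concrete, here is a minimal sketch in Python of how a click-counting score and an observed success measure would rank the two designs. The click counts (3 and 4) and the 7-to-1 success ratio come from the example above. The `Design` class, its field names, and the use of a relative success value (the actual success rates are not stated) are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Design:
    name: str
    clicks_to_product: int   # clicks on the optimal path from the home page
    relative_success: float  # observed shopping success, relative to the original design


# Numbers from the example in the text; only the ratio of success is known.
original = Design("original", clicks_to_product=3, relative_success=1.0)
revised = Design("revised", clicks_to_product=4, relative_success=7.0)
designs = [original, revised]

# What a click-counting program would prefer: fewest clicks wins.
best_by_clicks = min(designs, key=lambda d: d.clicks_to_product)

# What observing real users shows: highest task success wins.
best_by_success = max(designs, key=lambda d: d.relative_success)

print("Preferred by click count:", best_by_clicks.name)
print("Preferred by observed success:", best_by_success.name)
```

The two rankings disagree, and only the second one rests on data that a program cannot collect by following links.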
Another measure typically computed by automated "usability" services is
"freshness" as defined by the percentage of pages that are new.
But you simply cannot tell whether a website is up-to-date by
looking at the time stamps on the files. A site can be extremely fresh even
if 90% of its content is more than a year old. That just means that it
keeps good archives to supplement the current content. By now, probably
fewer than 1% of the pages on nytimes.com are "current" even
though it is a daily newspaper and one of the freshest sites in the world.
Conversely, a site can be stale even if most of the pages have
been edited recently (if the changes are not the appropriate ones to bring
the content up to date in ways that matter to users).
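As a minimal sketch of why the time-stamp measure misleads, the following Python computes "freshness" the way such a service might: as the share of pages modified within some recent window. The function name `freshness_score`, the 30-day window, and both sample sites are assumptions made up for illustration, not data from any real site.

```python
from datetime import datetime, timedelta


def freshness_score(modified_times, now, window_days=30):
    """Fraction of pages whose time stamp falls within the last `window_days` days."""
    if not modified_times:
        return 0.0
    cutoff = now - timedelta(days=window_days)
    recent = sum(1 for t in modified_times if t >= cutoff)
    return recent / len(modified_times)


now = datetime(1999, 12, 12)

# Hypothetical site A: a deep archive of 99 old pages plus one page updated today.
news_site = [now - timedelta(days=400 + i) for i in range(99)] + [now]

# Hypothetical site B: every page touched yesterday (say, a footer change),
# while the content itself was left out of date.
touched_site = [now - timedelta(days=1)] * 100

print(freshness_score(news_site, now))     # low score, yet the site is current
print(freshness_score(touched_site, now))  # high score, yet the content is stale
```

The score tracks file dates only, so it rewards cosmetic edits and penalizes good archives.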
How do you distinguish between these two types of old files?
- good content that should be archived because it is still of value
- outdated content that should be removed or updated
The answer is that you can't tell without understanding the content and the
way it will be used. Even full natural language understanding would not be
sufficient to allow a computer to make this judgment.
Automated usability is downright dangerous because it will cause site managers to
- make the wrong choices since it often gives the wrong advice or causes
them to pursue pseudo-important directions
- think that they are covered and don't need to spend resources on real usability activities
What Can Be Automated?
A few aspects of usability can be assessed automatically by a computer
program:
- Response times: it is not necessary to see or
understand a page in order to measure how long it takes to download
it. So a computer can provide a perfect estimate of response times. At the
same time, most sites are so incredibly slow these days that it is not
really necessary to track their download times to the millisecond. Instead
of spending big bucks on a response time measurement service, simply ask
the CEO to download the home page from his or her hotel room while logged
in on a laptop on the next business trip. Anybody trying this simple
exercise will know that the site is too slow and approximately how much the
design needs to slim down.
- HTML validation: a computer can easily flag illegal
HTML code and identify all deviations from the official Web Consortium
standard. Unfortunately, many sites resort to illegal code in an attempt to
code around bugs in the browsers, so it is still necessary to have a human decide whether a
given instance of illegal HTML was included deliberately or whether it was
a mistake.
- Linkrot can be measured to a first approximation: can
the computer follow the link and get a page returned from the remote site?
Unfortunately, the computer cannot measure whether the page that is
returned is the one the author intended to link to. Some sites give
articles a different URL when they move into archives and reuse the old URL
for new articles. Big mistake since this makes it harder for other
sites to link (and incoming links are the most powerful Web marketing
method). If a site does change URLs around like this, then the linkrot
program may report that the link does work, even if it now links to
something completely irrelevant. Until we get natural language
understanding (in 50 years?), there is no way that a computer can find out
whether a destination page complies with the linking author's intentions.
- Accessibility for users with disabilities can only
be partly measured. Sure, it's possible to have a computer check for things like
use of ALT text for all images, but without natural language comprehension,
the computer cannot determine whether the ALT text will be meaningful to a
blind user and help him or her understand the site. Also, sometimes a page
gets to be more usable by avoiding ALT text on certain images. Thus,
automated measures of accessibility should only be used as a checklist and
not as a final judgment.
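The ALT-text check mentioned above can be sketched with Python's standard `html.parser` module. This is a checklist aid under stated assumptions: the class name `AltTextChecker` and the sample markup are hypothetical, and the code only reports whether an `alt` attribute is missing or empty. It cannot tell whether the text is meaningful to a blind user, or whether an empty ALT was a deliberate choice.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Collects <img> tags whose alt attribute is missing or empty."""

    def __init__(self):
        super().__init__()
        self.missing_alt = []  # src of images with no alt attribute at all
        self.empty_alt = []    # src of images with an empty alt (possibly deliberate)

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src", "(no src)")
        if "alt" not in attributes:
            self.missing_alt.append(src)
        elif not (attributes["alt"] or "").strip():
            self.empty_alt.append(src)


# Hypothetical markup used only to exercise the checker.
sample_html = """
<img src="logo.gif" alt="Company logo">
<img src="spacer.gif" alt="">
<img src="chart.gif">
"""

checker = AltTextChecker()
checker.feed(sample_html)
print("No ALT attribute:", checker.missing_alt)
print("Empty ALT (needs human review):", checker.empty_alt)
```

Empty ALT text is reported separately rather than as an error, because a human still has to decide whether leaving it blank makes the page more usable.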
How to Gather Usability Data
There is only one valid way to gather usability data: observe real users as
they use your site to accomplish real tasks. This is actually the simplest
of all the methods: just see what happens!
Maybe it's because the method is so simple that it is not used more often.
Anyway, it really is easy to get real usability insights. It's also very
cheap since you only need to test a small number of users to find the main usability problems.