Jakob Nielsen's Alertbox, January 27, 2003:
Summary:
Visual interfaces are inherently superior to auditory interfaces for many tasks. The Star Trek fantasy of speaking to your computer is not the most fruitful path to usable systems.
Voice interfaces will not replace screens as the medium of choice for most user interfaces.
Voice interfaces do have a way of capturing the imagination, however. In 1986, I asked a group of 57 computer professionals to predict the biggest change in user interfaces by the year 2000. The top answer was speech I/O, which got twice as many votes as graphical user interfaces.
It may be hard to remember, but in 1986, there was no guarantee that the graphical user interface would win the day. It was mainly used by the "toy-like" Macintosh machines -- not by the "serious" systems used by IT professionals. Now, three years after the prediction target, GUIs are clearly the interface of choice.
I've always thought that Captain Picard would have been much better off with a design that informed him immediately when a shuttle was stolen, without first waiting to be asked.
In any case, what to say is the key issue in interaction design, and the main usability determinant. Whether you say something by speaking or by typing is less important to most users. Thus, voice interfaces will not free us from the most substantial problems of user interface design:
In the future, we may even move up to three-dimensional interfaces, even though 3D is rarely superior to 2D. Animation and other multimedia effects also add to the richness of visual interfaces, though animation is frequently used poorly in today's designs. The bottom line, though, is that visual interfaces can communicate much more information than auditory interfaces whenever users have a monitor and are capable of looking at it.
In the future, we will have many small devices available that are perfectly portable and allow wireless Internet access. The first information appliances are already on the market. And, on occasion, it will be preferable to interact with an information appliance by voice -- such as when your inbound flight was late and you're forced to run through the airport to catch your connecting flight. No time to look at anything, but it would be very useful to have a voice-operated assistant that tells you to "turn left here" or that "the outbound flight has been delayed 10 minutes, so you have time to stop at the Starbucks that's around the next corner."
My new Danger PDA nicely says "new message" when email arrives, but phone calls are announced by a selection of annoyingly funky ring tones that don't remind me of anybody I would actually want to talk to. It would be better to be able to record custom announcements such as "Luice calling" or "it's your Mother."
A voice system's usability increases dramatically according to how much it knows about the surrounding environment. Because voice is less rich than visual displays, voice designers cannot rely on users to pick out important information or create connections between separate data items. Doing so will be the system's responsibility. Contextual design will become important, as will tight management of the user's time -- the computer shouldn't drone on and on about things that are of minimal importance.
I believe that voice interfaces hold their greatest promise as an additional component to a multi-modal dialogue, rather than as the only interface channel. For example, if you have a visual display and mouse available, it would be faster to point to something on the screen and say "red" or "bigger" than to first select the object, then move the mouse to a different screen area to pull down a menu or click a function button that conveys the same information.
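To make the point-and-speak idea concrete, here is a minimal sketch of how a drawing program might combine the two channels: the pointer supplies the object, and the spoken word supplies the operation. Everything in it is an assumption made for illustration, including the `Shape` class, the `VOICE_COMMANDS` vocabulary, and the `apply_spoken_command` function. It also assumes a speech recognizer has already turned the utterance into a plain word.

```python
from dataclasses import dataclass

# Hypothetical vocabulary: recognized word -> (property, new value or scale factor).
VOICE_COMMANDS = {
    "red": ("color", "red"),
    "blue": ("color", "blue"),
    "bigger": ("scale", 1.25),
    "smaller": ("scale", 0.8),
}


@dataclass
class Shape:
    """A rectangle on the screen (illustrative only)."""
    name: str
    x: float
    y: float
    width: float
    height: float
    color: str = "black"

    def contains(self, px, py):
        """True if the point (px, py) lies inside this rectangle."""
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


def shape_under_pointer(shapes, px, py):
    """Return the topmost shape under the pointer (later shapes are drawn on top)."""
    for shape in reversed(shapes):
        if shape.contains(px, py):
            return shape
    return None


def apply_spoken_command(shapes, pointer, word):
    """Apply a recognized word to whatever the pointer is over.

    Returns the modified shape, or None if there is nothing under the
    pointer or the word is not in the vocabulary.
    """
    target = shape_under_pointer(shapes, *pointer)
    command = VOICE_COMMANDS.get(word.strip().lower())
    if target is None or command is None:
        return None
    prop, value = command
    if prop == "color":
        target.color = value
    elif prop == "scale":
        target.width *= value
        target.height *= value
    return target


shapes = [Shape("box", 0, 0, 10, 10)]
apply_spoken_command(shapes, (5, 5), "red")     # pointing at the box, saying "red"
apply_spoken_command(shapes, (5, 5), "bigger")  # still pointing, saying "bigger"
```

The user never selects the object or travels to a menu: pointing supplies the "what" and speech supplies the "how", which is the saving described above.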
Similarly, voice could be used to direct the user's attention to important events or elements on the screen in a richer way than the obnoxious beep that currently constitutes most computers' audio vocabulary. Grow up, computer. You’re not a baby any more, and you can do better than inarticulate beeps.