Optimism is growing that the near future will witness rapid growth in human-computer interaction using voice. System prototypes have recently been built that demonstrate speaker-independent real-time speech recognition, and understanding of naturally spoken utterances with vocabularies of 1000 to 2000 words, and larger. Already, computer manufacturers are building speech recognition subsystems into their new product lines. However, before this technology can be broadly useful, a substantial knowledge base is needed about human spoken language and performance during computer-based spoken interaction. This paper reviews application areas in which spoken interaction can play a significant role, assesses potential benefits of spoken interaction with machines, and compares voice with other modalities of human-computer interaction. It also discusses information that will be needed to build a firm empirical foundation for the design of future spoken and multimodal interfaces. Finally, it argues for a more systematic and scientific approach to investigating spoken input and performance with future language technology.

From the beginning of the computer era, futurists have dreamed of the conversational computer, a machine that we could engage in natural spoken conversation. For instance, Turing's famous test of computational intelligence imagined a computer that could conduct such a fluent English conversation that people could not distinguish it from a human. Despite prolonged research and many notable scientific and technological achievements, there have been few real human-computer dialogues until recently, and those existing have been keyboard exchanges rather than spoken. This situation has begun to change, however. Steady progress in speech recognition and natural language processing technologies, supported by dramatic advances in computer hardware, has enabled laboratory prototype systems with which one can conduct simple question-answering dialogues.
Although far from human-level conversation, this initial capability is generating considerable optimism for the future of human-computer interaction using voice. This paper aims to identify applications for which spoken interaction is advantageous, to clarify the role of voice with respect to other modalities of human-computer interaction, and to consider obstacles to the successful development and commercialization of spoken language systems.
Hand/Eyes-Busy Tasks
The classic situation favoring spoken interaction with machines is one in which the user's hands and/or eyes are busy performing some other task. In such circumstances, by using voice to communicate with the machine, people are free to pay attention to their task, rather than breaking away to use a keyboard. For instance, wire installers who spoke a wire's serial number and were then guided verbally by the computer to install that wire achieved a 20-30% speedup in productivity, with improved accuracy and lower training time, over their prior manual method of wire identification and installation (1). Altho...