This paper presents Dialogos, a real-time system for human-machine spoken dialogue over the telephone in task-oriented domains. The system has been tested in a large trial with inexperienced users and has proved robust enough to allow spontaneous interactions both for users who obtain good recognition performance and for those who obtain lower scores. This robustness has been achieved by combining specific language models during the recognition phase, tolerance of spontaneous speech phenomena, a robust parser, and pragmatics-based dialogue knowledge. Integrating these modules makes it possible to handle partial or total breakdowns at the different levels of analysis. We report the field trial data and the evaluation results for the overall system and for its submodules.
Interactions with spoken language systems may break down because of errors in the acoustic decoding of user utterances. Some of these errors significantly reduce the naturalness of human-machine dialogues. In this paper we identify some types of recognition errors that cannot be recovered during syntactico-semantic analysis but that can be handled effectively at the dialogue level. We describe how nonunderstanding and the effects of misrecognition are dealt with by Dialogos, a real-time spoken dialogue system that allows users to access a database of railway information by telephone. We discuss the importance of supporting confirmation turns and clarification and correction subdialogues. We show the positive effects of robust dialogue management and dialogue-state-dependent language modeling, taking into account recognition and understanding performance as well as the success rate of dialogue transactions.
This paper describes the approach followed in developing the linguistic processor of the continuous speech dialog system implemented at our labs. The application scenario (a voice-based information retrieval service over the telephone) imposes severe requirements on the system: it has to be speaker-independent, to deal with noisy and corrupted speech, and to work in real time. Coping with this type of application requires improving both efficiency and accuracy. At present, the system accepts telephone-quality speech (utterances referring to electronic mailbox access, recorded through a PABX) and, in the speaker-independent configuration, it correctly understands 72% of the utterances in about twice real time. Experimental results are discussed, as obtained from an implementation of the system in C on a Sun SparcStation 1.