Proceedings of the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL '09), 2009
DOI: 10.3115/1609067.1609150

Incremental dialogue processing in a micro-domain

Abstract: This paper describes a fully incremental dialogue system that can engage in dialogues in a simple domain, number dictation. Because it uses incremental speech recognition and prosodic analysis, the system can give rapid feedback as the user is speaking, with a very short latency of around 200ms. Because it uses incremental speech synthesis and self-monitoring, the system can react to feedback from the user as the system is speaking. A comparative evaluation shows that naïve users preferred this system over a n…
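
The behaviour summarised in the abstract, giving verbal feedback on partial recognition results instead of waiting for the end of the utterance, can be illustrated with a minimal sketch. The sketch below is hypothetical: the PartialHypothesis structure, the boundary flag standing in for the prosodic analysis, and the simulated 200 ms reaction delay are illustrative assumptions, not the paper's actual incremental-unit architecture.

```python
import time
from dataclasses import dataclass

@dataclass
class PartialHypothesis:
    """One incremental ASR update (hypothetical structure, not the paper's format)."""
    words: list           # best hypothesis so far
    end_of_segment: bool  # stand-in for a prosodic boundary cue (e.g. pitch fall + pause)

DIGITS = {"zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"}

def incremental_feedback_loop(hypotheses, latency_s=0.2):
    """Consume partial hypotheses as they arrive and give rapid feedback ('okay')
    at detected segment boundaries instead of waiting for the full utterance."""
    confirmed = []
    for hyp in hypotheses:
        new_digits = [w for w in hyp.words[len(confirmed):] if w in DIGITS]
        if hyp.end_of_segment and new_digits:
            time.sleep(latency_s)  # simulated reaction latency (~200 ms in the paper)
            confirmed.extend(new_digits)
            print(f"SYSTEM> okay  (heard so far: {' '.join(confirmed)})")
    return confirmed

# Simulated stream of partial recognition results for "two five ... four seven"
stream = [
    PartialHypothesis(["two"], False),
    PartialHypothesis(["two", "five"], True),              # boundary -> feedback
    PartialHypothesis(["two", "five", "four"], False),
    PartialHypothesis(["two", "five", "four", "seven"], True),
]

print(incremental_feedback_loop(stream))
```

In this toy loop, feedback is triggered only when a boundary cue coincides with newly recognised content, which mirrors the abstract's combination of incremental speech recognition and prosodic analysis at a very coarse level.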

Cited by 66 publications (66 citation statements) · References 24 publications

Citation statements (ordered by relevance):
“…The second implementation is a vocal system called DictaNum, inspired by the NUMBERS dialogue system (Skantze and Schlangen, 2009). It asks the user to dictate a number and then gives feedback to confirm that it has been understood correctly.…”
Section: DictaNum (mentioning)
confidence: 99%
“…Second, the AIC is responsible for reporting what has actually been said by the system back to the Discourse Modeller for continuous self-monitoring (there is a direct feedback loop, as can be seen in Figure 1). This way, the Discourse Modeller may relate what the system says to what the user says on a high-resolution time scale (which is necessary for handling phenomena such as backchannels, as discussed in Skantze & Schlangen, 2009). An animated talking head is shown on a display, synchronised with the synthesised speech (Beskow, 2003). The head is making small continuous movements (recorded from real human head movements), giving it a more life-like appearance.…”
Section: Incremental Multimodal Speech Synthesis (mentioning)
confidence: 99%
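
The self-monitoring loop described in the excerpt above, where the audio component reports each realised word back to the Discourse Modeller, can be sketched roughly as follows. This is a minimal, hypothetical illustration assuming a threaded word-by-word output component and simple callback methods (on_word_spoken, on_user_feedback); the class names echo the quoted description, but the code is not the authors' implementation.

```python
import threading
import time

class DiscourseModeller:
    """Tracks what the system has *actually* said so far, so that user feedback
    (e.g. a backchannel or 'no, wait') can be related to the right point in the
    system's own output. Hypothetical sketch, not the cited implementation."""
    def __init__(self):
        self.realised_words = []

    def on_word_spoken(self, word):
        self.realised_words.append(word)

    def on_user_feedback(self, signal):
        # Relate the user's reaction to the most recently realised word.
        context = self.realised_words[-1] if self.realised_words else None
        print(f"[DM] user said {signal!r} right after system word {context!r}")

class AudioOutput:
    """Stand-in for the synthesis/audio component: speaks word by word and reports
    each realised word back to the discourse modeller (the 'direct feedback loop')."""
    def __init__(self, dm, word_duration=0.05):
        self.dm = dm
        self.word_duration = word_duration
        self.abort = threading.Event()

    def speak(self, words):
        for w in words:
            if self.abort.is_set():         # user feedback may cut the utterance short
                print("[AudioOutput] utterance aborted")
                return
            time.sleep(self.word_duration)  # pretend to play audio for this word
            self.dm.on_word_spoken(w)

dm = DiscourseModeller()
out = AudioOutput(dm)

t = threading.Thread(target=out.speak,
                     args=(["so", "the", "number", "is", "two", "five", "four", "seven"],))
t.start()
time.sleep(0.18)                 # user reacts while the system is still speaking
dm.on_user_feedback("no wait")
out.abort.set()                  # system reacts to the feedback mid-utterance
t.join()
```

Because the discourse model only ever sees words that have actually been realised, a reaction arriving mid-utterance can be aligned with the system's own partial output, which is the point of the feedback loop in the quoted description.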
“…But, until recently, parsing and generation systems have been defined relative to a grammar whose remit is syntactic and semantic analysis of complete sentence-strings. And, even now, though parsing and generation systems are increasingly reflecting incrementality (Atterer and Schlangen 2009; Skantze and Schlangen 2009; Stoness et al. 2004), such incrementality must generally come from the processing model, with the grammar defined statically and independently. Yet, to deal with split utterances, parsing/generation systems have to be defined with a flexibility allowing either one to take up from where there has been a switch, despite the fact that both the string preceding or following the switch may fall outside the set of strings licensed as well-formed by the grammar.…”
Section: Introduction (mentioning)
confidence: 99%
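
The point made in the last excerpt, that handling split utterances requires the parse state itself (rather than either speaker's string in isolation) to be incrementally extendable, can be illustrated with a toy sketch. The finite-state "grammar", the lexicon, and the IncrementalParser class below are assumptions chosen for brevity; they stand in for the grammar formalisms discussed in the cited work and are not any of the cited systems.

```python
# Toy finite-state sketch: the parse state lives outside either speaker's string,
# so a second speaker can continue from exactly where the first one stopped.
LEXICON = {
    "could": "AUX", "you": "PRO", "pass": "V", "the": "DET",
    "salt": "N", "please": "PLEASE",
}
# Transitions of a toy request pattern: AUX PRO V DET N (PLEASE)
TRANSITIONS = {
    ("q0", "AUX"): "q1", ("q1", "PRO"): "q2", ("q2", "V"): "q3",
    ("q3", "DET"): "q4", ("q4", "N"): "q5", ("q5", "PLEASE"): "q5",
}
ACCEPTING = {"q5"}

class IncrementalParser:
    def __init__(self):
        self.state = "q0"
        self.words = []

    def feed(self, word):
        """Extend the current parse state by one word, whoever uttered it."""
        self.state = TRANSITIONS[(self.state, LEXICON[word])]
        self.words.append(word)

    @property
    def complete(self):
        return self.state in ACCEPTING

p = IncrementalParser()
for w in "could you pass the".split():   # speaker A breaks off here
    p.feed(w)
print(p.complete)                        # False: A's fragment alone is not well formed
for w in "salt please".split():          # speaker B takes up the same parse state
    p.feed(w)
print(p.complete)                        # True: the joint utterance is licensed
```

Neither fragment is licensed on its own, but the shared incremental state carries the analysis across the speaker switch, which is the flexibility the excerpt argues for.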