Interspeech 2022 2022
DOI: 10.21437/interspeech.2022-566
|View full text |Cite
|
Sign up to set email alerts
|

Turn-Taking Prediction for Natural Conversational Speech

Abstract: While a streaming voice assistant system has been used in many applications, this system typically focuses on unnatural, one-shot interactions assuming input from a single voice query without hesitation or disfluency. However, a common conversational utterance often involves multiple queries with turn-taking, in addition to disfluencies. These disfluencies include pausing to think, hesitations, word lengthening, filled pauses and repeated phrases. This makes doing speech recognition with conversational speech,… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
5

Relationship

0
5

Authors

Journals

citations
Cited by 8 publications
references
References 19 publications
0
0
0
Order By: Relevance