Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, 2023
DOI: 10.18653/v1/2023.emnlp-main.449

End-to-End Single-Channel Speaker-Turn Aware Conversational Speech Translation

Juan Pablo Zuluaga-Gomez, Zhaocheng Huang, Xing Niu, et al.

Abstract: Conventional speech-to-text translation (ST) systems are trained on single-speaker utterances, and they may not generalize to real-life scenarios where the audio contains conversations by multiple speakers. In this paper, we tackle single-channel multi-speaker conversational ST with an end-to-end and multi-task training model, named Speaker-Turn Aware Conversational Speech Translation, that combines automatic speech recognition, speech translation and speaker turn detection using special tokens in a serialized…
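
The abstract is cut off before it specifies the serialization format, so the following is only an illustrative sketch of the general idea of marking speaker turns with special tokens in a single serialized target string. The token strings `[TURN]` and `[ASR]`, the `Segment` type, and the `serialize` helper are all hypothetical names chosen for this example; they are not taken from the text above and should not be read as the authors' implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical marker for a speaker change; the actual token inventory
# is not given in the truncated abstract.
TURN = "[TURN]"


@dataclass
class Segment:
    speaker: str   # speaker label, used only to detect changes
    start: float   # start time in seconds
    text: str      # transcript or translation of the segment


def serialize(segments: List[Segment], task_token: str) -> str:
    """Join segments in time order into one target string.

    A task token (assumed here to select the task, e.g. recognition
    or translation) comes first, and TURN is inserted wherever the
    speaker differs from the previous segment's speaker.
    """
    ordered = sorted(segments, key=lambda s: s.start)
    parts = [task_token]
    previous_speaker = None
    for seg in ordered:
        if previous_speaker is not None and seg.speaker != previous_speaker:
            parts.append(TURN)
        parts.append(seg.text)
        previous_speaker = seg.speaker
    return " ".join(parts)


if __name__ == "__main__":
    conversation = [
        Segment("A", 0.0, "hello there"),
        Segment("B", 1.2, "hi"),
        Segment("B", 2.0, "how are you"),
        Segment("A", 3.5, "fine"),
    ]
    print(serialize(conversation, "[ASR]"))
```

Under these assumptions, one model could be trained on targets of this shape for several tasks by changing only the leading task token, while the turn marker carries the speaker-change information without needing separate audio channels.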

Cited by 0 publications
References 42 publications