The Volctrans Neural Speech Translation System for IWSLT 2021

Zhao, Chengqi; Liu, Zhicheng; Jian, Tong; Wang, Tao; Wang, Mingxuan; Ye, Rong; Dong, Qianqian; Cao, Jun; Li, Lei

doi:10.48550/arxiv.2105.07319

Cited by 1 publication

(1 citation statement)

References 24 publications

“…It has led to end-to-end (E2E) systems that have delivered state-of-the-art performance on several benchmarks [2]. In certain instances, cascaded systems that use automatic speech recognition (ASR) followed by a machine translation (MT) model, have delivered a better performance when optimized by techniques such as back translation [3]. Speech translation also benefits from multilingual models which enable knowledge transfer across various languages and alleviate the issue of data scarcity.…”

Section: Introduction (mentioning)

confidence: 99%

Multilingual Simultaneous Speech Translation

Subramanya, Niehues

2022

Preprint

Applications designed for simultaneous speech translation during events such as conferences or meetings need to balance quality and lag while displaying translated text to deliver a good user experience. One common approach to building online spoken language translation systems is by leveraging models built for offline speech translation. Based on a technique to adapt end-to-end monolingual models, we investigate multilingual models and different architectures (end-to-end and cascade) on the ability to perform online speech translation. On the multilingual TEDx corpus, we show that the approach generalizes to different architectures. We see similar gains in latency reduction (40% relative) across languages and architectures. However, the end-to-end architecture leads to smaller translation quality losses after adapting to the online model. Furthermore, the approach even scales to zero-shot directions.