Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation

Fukuda, Ryo; Sudoh, Katsuhito; Nakamura, Satoshi

doi:10.48550/arxiv.2203.15479

Cited by 1 publication

(2 citation statements)

References 16 publications

“…To this end, researchers tried considering not only the presence of speech but also its length (Potapczyk and Przybysz, 2020; Inaguma et al, 2021;. Later studies tried to avoid VAD and focused on more linguistically-motivated approaches, e.g., ASR CTC to predict voiced regions Gállego et al (2021) or directly modeling the sentence segmentation (Tsiamas et al, 2022b; Fukuda et al, 2022).…”

Section: Long-form Offline St (mentioning)

confidence: 99%

CUNI-KIT System for Simultaneous Speech Translation Task at IWSLT 2022

Polák, Pham, Nguyen et al. 2022

Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)

In this paper, we describe our submission to the Simultaneous Speech Translation at IWSLT 2022. We explore strategies to utilize an offline model in a simultaneous setting without the need to modify the original model. In our experiments, we show that our onlinization algorithm is almost on par with the offline setting while being 3× faster than offline in terms of latency on the test set. We also show that the onlinized offline model outperforms the best IWSLT2021 simultaneous system in medium and high latency regimes and is almost on par in the low latency regime. We make our system publicly available.

“…Drawing inspiration from offline long-form ST, which primarily emphasizes segmentation, we consider direct segmentation modeling the most promising approach (Tsiamas et al, 2022a; Fukuda et al, 2022). The limitation of these approaches is that they do not allow out-of-the-box simultaneous inference.…”

Section: Towards the Long-form Sst Via (mentioning)

confidence: 99%