Cascade or Direct Speech Translation? A Case Study

Etchegoyhen, Thierry; Arzelus, Haritz; Gete, Harritxu; Álvarez, Aitor; Torre, Iván G.; Martín-Doñas, Juan M.; González-Docasal, Ander; Fernandez, Edson Benites

doi:10.3390/app12031097

Cited by 4 publications

(1 citation statement)

References 48 publications


“…By systematically comparing the performance of cascade and direct systems across multiple language directions, the study provided valuable insights into the strengths and weaknesses of each approach, contributing to a deeper understanding of speech translation techniques. A comparison was made between state-of-the-art cascade and direct approaches for speech translation in [23]. The focus of the study was the under-resourced language pair of Basque-Spanish, which presents challenging linguistic phenomena, including significant differences in morphology and word order.…”

Section: Related Work (mentioning)

confidence: 99%

Cascade Speech Translation for the Kazakh Language

Kozhirbayev, Islamgozhayev (2023). Applied Sciences.


Speech translation systems have become indispensable in facilitating seamless communication across language barriers. This paper presents a cascade speech translation system tailored specifically for translating speech from the Kazakh language to Russian. The system aims to enable effective cross-lingual communication between Kazakh and Russian speakers, addressing the unique challenges posed by these languages. To develop the cascade speech translation system, we first created a dedicated speech translation dataset ST-kk-ru based on the ISSAI Corpus. The ST-kk-ru dataset comprises a large collection of Kazakh speech recordings along with their corresponding Russian translations. The automatic speech recognition (ASR) module of the system utilizes deep learning techniques to convert spoken Kazakh input into text. The machine translation (MT) module employs state-of-the-art neural machine translation methods, leveraging the parallel Kazakh-Russian translations available in the dataset to generate accurate translations. By conducting extensive experiments and evaluations, we have thoroughly assessed the performance of the cascade speech translation system on the ST-kk-ru dataset. The outcomes of our evaluation highlight the effectiveness of incorporating additional datasets for both the ASR and MT modules. This augmentation leads to a significant improvement in the performance of the cascade speech translation system, increasing the BLEU score by approximately 2 points when translating from Kazakh to Russian. These findings underscore the importance of leveraging supplementary data to enhance the capabilities of speech translation systems.

