Proceedings of the 17th International Conference on Spoken Language Translation 2020
DOI: 10.18653/v1/2020.iwslt-1.4

KIT’s IWSLT 2020 SLT Translation System

Abstract: This paper describes KIT's submissions to the IWSLT2020 Speech Translation evaluation campaign. We first participate in the simultaneous translation task, in which our simultaneous models are Transformer-based and can be efficiently trained to obtain low latency with minimized compromise in quality. On the offline speech translation task, we applied our new Speech Transformer architecture to end-to-end speech translation. The obtained model can provide translation quality which is competitive to a complicated c…

Cited by 18 publications (23 citation statements) · References 13 publications
“…KIT (Pham et al., 2020) participated in the text track only. The authors used a novel read-write strategy called Adaptive Computation Time (ACT) (Graves, 2016).…”
Section: Submissions
confidence: 99%
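The ACT-based read-write strategy mentioned in this snippet can be pictured as a gate that accumulates a halting probability over the source states read so far and only emits (WRITEs) a target token once that mass crosses a threshold. The following is a minimal PyTorch sketch of such a gate under that reading; the module name, threshold, and dimensions are illustrative assumptions, not KIT's actual implementation.

# Minimal sketch of an ACT-style read/write gate for simultaneous decoding,
# loosely following Graves (2016). Illustrative only; not KIT's implementation.
import torch
import torch.nn as nn

class ACTReadWriteGate(nn.Module):
    """Accumulates a halting probability over the source states read so far;
    once it crosses a threshold the agent WRITEs a target token,
    otherwise it READs more source."""

    def __init__(self, d_model: int, threshold: float = 1.0 - 1e-2):
        super().__init__()
        self.halt = nn.Linear(d_model, 1)   # per-state halting probability
        self.threshold = threshold

    def forward(self, source_states: torch.Tensor):
        # source_states: (src_len, d_model) -- encoder states read so far
        p = torch.sigmoid(self.halt(source_states)).squeeze(-1)  # (src_len,)
        cum = torch.cumsum(p, dim=0)
        # WRITE as soon as the accumulated halting mass exceeds the threshold,
        # i.e. the model believes it has read enough source context.
        fired = (cum >= self.threshold).nonzero(as_tuple=True)[0]
        action = "WRITE" if len(fired) > 0 else "READ"
        return action, cum

# Usage: feed the states read so far; keep READing source until the gate fires.
gate = ACTReadWriteGate(d_model=512)
states = torch.randn(3, 512)          # three source states read so far
action, cum_halt = gate(states)
print(action, cum_halt[-1].item())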
“…To this aim, they incorporated the inner-attention based architecture proposed by … within Transformer-based encoders (inspired by Tu et al., 2019; Di Gangi et al., 2019c) and decoders. For the cascade approach, they used a pipeline of three stages: (1) … KIT (Pham et al., 2020) participated with both end-to-end and cascade systems. For the end-to-end system they applied a deep Transformer with stochastic layers (Pham et al., 2019b).…”
Section: Submissions
confidence: 99%
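The "deep Transformer with stochastic layers" cited here (Pham et al., 2019b) relies on randomly skipping layers during training so that very deep encoders remain trainable. Below is a minimal PyTorch sketch of one common way such stochastic layers are realised (skipping each encoder layer with a fixed probability at training time); the depth, drop probability, and class name are illustrative assumptions, and the details of the cited work may differ.

# Minimal sketch of a deep encoder with stochastic layers (stochastic depth).
# Illustrative assumption of how "stochastic layers" are commonly realised.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_layers=24, p_drop_layer=0.2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(num_layers)
        )
        self.p_drop_layer = p_drop_layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            # Skip the whole residual layer with probability p_drop_layer
            # during training; at inference every layer is applied.
            if self.training and torch.rand(1).item() < self.p_drop_layer:
                continue
            x = layer(x)
        return x

encoder = StochasticEncoder()
out = encoder(torch.randn(2, 100, 512))   # (batch, frames, d_model)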
“…DiDiLabs (Arkhangorodsky et al., 2020) participated with an end-to-end system based on the S-Transformer architecture proposed in (Di Gangi et al., 2019b,c). The base model trained on MuST-C was extended in several directions by: (1) encoder pre-training on English ASR data, (2) decoder pre-training on German ASR data, (3) using wav2vec (Schneider et al., 2019) features as inputs (instead of Mel-Filterbank features), and (4) pre-training on English-to-German text translation with an MT system sharing the decoder with S-Transformer, so as to improve the decoder's translation ability.…”
Section: Submissions
confidence: 99%
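Point (3) of the DiDiLabs description contrasts two input representations: log-Mel filterbank features and features taken from a pre-trained wav2vec model. The snippet below sketches how both could be extracted with torchaudio; the wav2vec 2.0 bundle and the file name "utt.wav" are stand-in assumptions (the cited work used wav2vec, Schneider et al., 2019, and its own feature pipeline).

# Minimal torchaudio sketch contrasting the two input representations.
# The pre-trained bundle and file path are illustrative assumptions.
import torch
import torchaudio

waveform, sr = torchaudio.load("utt.wav")          # hypothetical 16 kHz mono file

# (1) Classic 80-dim log-Mel filterbank features: (num_frames, 80)
fbank = torchaudio.compliance.kaldi.fbank(
    waveform, num_mel_bins=80, sample_frequency=sr
)

# (2) Self-supervised wav2vec-style features from the last encoder layer
bundle = torchaudio.pipelines.WAV2VEC2_BASE
model = bundle.get_model().eval()
with torch.inference_mode():
    features, _ = model.extract_features(waveform)
w2v = features[-1]                                  # (1, frames, dim)

print(fbank.shape, w2v.shape)   # either representation can feed the ST encoder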
“…The traditional cascaded approach brings together an automatic speech recognition (ASR) module and a machine translation (MT) module. Past work explored connections through 1-best words [3, 4], n-best lists [5, 6] and lattices [7, 8]. Error propagation can be mitigated to some extent with an increasing level of complexity involved in the connection point.…”
Section: Introduction
confidence: 99%
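The connection points listed in this snippet (1-best words, n-best lists, lattices) differ in how much of the ASR hypothesis space reaches the MT module. The sketch below illustrates the first two in plain Python; asr, asr_nbest, mt, and mt_scored are hypothetical callables standing in for real recognition and translation systems.

# Minimal sketch of two cascade connection points: 1-best and n-best.
# All callables are hypothetical placeholders, not a specific toolkit's API.
from typing import Callable, List, Tuple

def cascade_1best(audio, asr: Callable, mt: Callable) -> str:
    """ASR 1-best transcript -> MT. Simple, but recognition errors propagate."""
    transcript = asr(audio)              # single best hypothesis (str)
    return mt(transcript)

def cascade_nbest(audio,
                  asr_nbest: Callable[..., List[Tuple[str, float]]],
                  mt_scored: Callable[[str], Tuple[str, float]],
                  n: int = 5) -> str:
    """ASR n-best list -> MT; rerank by combined ASR + MT score to soften
    error propagation at the cost of n translation passes."""
    hypotheses = asr_nbest(audio, n=n)   # [(transcript, asr_score), ...]
    candidates = []
    for transcript, asr_score in hypotheses:
        translation, mt_score = mt_scored(transcript)
        candidates.append((asr_score + mt_score, translation))
    return max(candidates)[1]            # translation with the best joint score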