Proceedings of the 17th International Conference on Spoken Language Translation 2020
DOI: 10.18653/v1/2020.iwslt-1.4

KIT’s IWSLT 2020 SLT Translation System

Abstract: This paper describes KIT's submissions to the IWSLT2020 Speech Translation evaluation campaign. We first participate in the simultaneous translation task, in which our simultaneous models are Transformer-based and can be efficiently trained to obtain low latency with minimized compromise in quality. On the offline speech translation task, we applied our new Speech Transformer architecture to end-to-end speech translation. The obtained model can provide translation quality which is competitive to a complicated c…

Cited by 18 publications (23 citation statements) · References 13 publications
“…KIT (Pham et al., 2020) participated in the text track only. The authors used a novel read-write strategy called Adaptive Computation Time (ACT) (Graves, 2016).…”
Section: Submissions
confidence: 99%
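The ACT-based read-write strategy mentioned in this snippet can be pictured as a gate that accumulates a halting probability over the source states read so far and only emits (WRITEs) a target token once that mass crosses a threshold. The following is a minimal PyTorch sketch of such a gate under that reading; the module name, threshold, and dimensions are illustrative assumptions, not KIT's actual implementation.

# Minimal sketch of an ACT-style read/write gate for simultaneous decoding,
# loosely following Graves (2016). Illustrative only; not KIT's implementation.
import torch
import torch.nn as nn

class ACTReadWriteGate(nn.Module):
    """Accumulates a halting probability over the source states read so far;
    once it crosses a threshold the agent WRITEs a target token,
    otherwise it READs more source."""

    def __init__(self, d_model: int, threshold: float = 1.0 - 1e-2):
        super().__init__()
        self.halt = nn.Linear(d_model, 1)   # per-state halting probability
        self.threshold = threshold

    def forward(self, source_states: torch.Tensor):
        # source_states: (src_len, d_model) -- encoder states read so far
        p = torch.sigmoid(self.halt(source_states)).squeeze(-1)  # (src_len,)
        cum = torch.cumsum(p, dim=0)
        # WRITE as soon as the accumulated halting mass exceeds the threshold,
        # i.e. the model believes it has read enough source context.
        fired = (cum >= self.threshold).nonzero(as_tuple=True)[0]
        action = "WRITE" if len(fired) > 0 else "READ"
        return action, cum

# Usage: feed the states read so far; keep READing source until the gate fires.
gate = ACTReadWriteGate(d_model=512)
states = torch.randn(3, 512)          # three source states read so far
action, cum_halt = gate(states)
print(action, cum_halt[-1].item())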
“…To this aim, they incorporated the inner-attention based architecture proposed by … within Transformer-based encoders (inspired by Tu et al., 2019; Di Gangi et al., 2019c) and decoders. For the cascade approach, they used a pipeline of three stages: (1) … KIT (Pham et al., 2020) participated with both end-to-end and cascade systems. For the end-to-end system they applied a deep Transformer with stochastic layers (Pham et al., 2019b).…”
Section: Submissions
confidence: 99%
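The "deep Transformer with stochastic layers" cited here (Pham et al., 2019b) relies on randomly skipping layers during training so that very deep encoders remain trainable. Below is a minimal PyTorch sketch of one common way such stochastic layers are realised (skipping each encoder layer with a fixed probability at training time); the depth, drop probability, and class name are illustrative assumptions, and the details of the cited work may differ.

# Minimal sketch of a deep encoder with stochastic layers (stochastic depth).
# Illustrative assumption of how "stochastic layers" are commonly realised.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, d_model=512, nhead=8, num_layers=24, p_drop_layer=0.2):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
            for _ in range(num_layers)
        )
        self.p_drop_layer = p_drop_layer

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            # Skip the whole residual layer with probability p_drop_layer
            # during training; at inference every layer is applied.
            if self.training and torch.rand(1).item() < self.p_drop_layer:
                continue
            x = layer(x)
        return x

encoder = StochasticEncoder()
out = encoder(torch.randn(2, 100, 512))   # (batch, frames, d_model)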
“…DiDiLabs (Arkhangorodsky et al., 2020) participated with an end-to-end system based on the S-Transformer architecture proposed in (Di Gangi et al., 2019b,c). The base model trained on MuST-C was extended in several directions by: (1) encoder pre-training on English ASR data, (2) decoder pre-training on German ASR data, (3) using wav2vec (Schneider et al., 2019) features as inputs (instead of Mel-Filterbank features), and (4) pre-training on English-to-German text translation with an MT system sharing the decoder with S-Transformer, so as to improve the decoder's translation ability.…”
Section: Submissions
confidence: 99%
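Point (3) of the DiDiLabs description contrasts two input representations: log-Mel filterbank features and features taken from a pre-trained wav2vec model. The snippet below sketches how both could be extracted with torchaudio; the wav2vec 2.0 bundle and the file name "utt.wav" are stand-in assumptions (the cited work used wav2vec, Schneider et al., 2019, and its own feature pipeline).

# Minimal torchaudio sketch contrasting the two input representations.
# The pre-trained bundle and file path are illustrative assumptions.
import torch
import torchaudio

waveform, sr = torchaudio.load("utt.wav")          # hypothetical 16 kHz mono file

# (1) Classic 80-dim log-Mel filterbank features: (num_frames, 80)
fbank = torchaudio.compliance.kaldi.fbank(
    waveform, num_mel_bins=80, sample_frequency=sr
)

# (2) Self-supervised wav2vec-style features from the last encoder layer
bundle = torchaudio.pipelines.WAV2VEC2_BASE
model = bundle.get_model().eval()
with torch.inference_mode():
    features, _ = model.extract_features(waveform)
w2v = features[-1]                                  # (1, frames, dim)

print(fbank.shape, w2v.shape)   # either representation can feed the ST encoder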
“…The traditional cascaded approach brings together an automatic speech recognition (ASR) module and a machine translation (MT) module. Past work explored connections through 1-best words [3, 4], n-best lists [5, 6] and lattices [7, 8]. Error propagation can be mitigated to some extent with an increasing level of complexity involved in the connection point.…”
Section: Introduction
confidence: 99%
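The connection points listed in this snippet (1-best words, n-best lists, lattices) differ in how much of the ASR hypothesis space reaches the MT module. The sketch below illustrates the first two in plain Python; asr, asr_nbest, mt, and mt_scored are hypothetical callables standing in for real recognition and translation systems.

# Minimal sketch of two cascade connection points: 1-best and n-best.
# All callables are hypothetical placeholders, not a specific toolkit's API.
from typing import Callable, List, Tuple

def cascade_1best(audio, asr: Callable, mt: Callable) -> str:
    """ASR 1-best transcript -> MT. Simple, but recognition errors propagate."""
    transcript = asr(audio)              # single best hypothesis (str)
    return mt(transcript)

def cascade_nbest(audio,
                  asr_nbest: Callable[..., List[Tuple[str, float]]],
                  mt_scored: Callable[[str], Tuple[str, float]],
                  n: int = 5) -> str:
    """ASR n-best list -> MT; rerank by combined ASR + MT score to soften
    error propagation at the cost of n translation passes."""
    hypotheses = asr_nbest(audio, n=n)   # [(transcript, asr_score), ...]
    candidates = []
    for transcript, asr_score in hypotheses:
        translation, mt_score = mt_scored(transcript)
        candidates.append((asr_score + mt_score, translation))
    return max(candidates)[1]            # translation with the best joint score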