ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2021
DOI: 10.1109/icassp39728.2021.9414411

Machine Translation Verbosity Control for Automatic Dubbing

Abstract: Automatic dubbing aims at seamlessly replacing the speech in a video document with synthetic speech in a different language. The task implies many challenges, one of which is generating translations that not only convey the original content, but also match the duration of the corresponding utterances. In this paper, we focus on the problem of controlling the verbosity of machine translation output, so that subsequent steps of our automatic dubbing pipeline can generate dubs of better quality. We propose new me…

Citations: cited by 7 publications (4 citation statements)
References: 20 publications
“…In particular, attention has been paid to tasks such as multilingual NMT (Johnson et al, 2017), by specifying the target language in the input; formality or politeness transfer (e.g. Sennrich et al (2016)); controlling the gender of the speaker and/or interlocutor (Elaraby et al, 2018; Vanmassenhove et al, 2018; Moryossef et al, 2019); length and verbosity (Lakew et al, 2019; Lakew et al, 2021); or constraining the vocabulary (Ailem et al, 2021).…”
Section: Related Work (mentioning)
confidence: 99%
“…We use the Transformer architecture (Vaswani et al, 2017) implemented in PyTorch (Paszke et al, 2019). Similarly to Lakew et al (2021), we test a range of model alterations.…”
Section: Model Settings (mentioning)
confidence: 99%
“…Existing video dubbing works (Öktem, Farrús, and Bonafonte 2019; Federico et al 2020b; Lakew et al 2021; Virkar et al 2021; Sharma et al 2021; Effendi et al 2022; Lakew et al 2022; Virkar et al 2022; Tam et al 2022) are usually based on a cascaded speech-to-speech translation system (Federico et al 2020a) with ad-hoc designs, mainly concentrating on the Neural Machine Translation (NMT) and Text-To-Speech (TTS) stages. In the NMT stage, related works achieve the length control by assuming that similar number of words/characters should have similar speech length, and therefore encourage a model to generate target sequence with similar number of words/characters to the source sequence (Federico et al 2020a).…”
Section: Introduction (mentioning)
confidence: 99%
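
The last citation statement describes length control as encouraging a target sequence with a similar number of words or characters to the source. As a rough illustration of that proxy only, the sketch below computes a target-to-source length ratio and checks whether it falls inside a tolerance band. The function names, the `unit` parameter, and the 10% tolerance are hypothetical choices made for this example; they are not taken from the cited papers.

```python
def length_ratio(source: str, target: str, unit: str = "char") -> float:
    """Return target length divided by source length.

    `unit` is a hypothetical switch: "char" counts characters,
    "word" counts whitespace-separated tokens.
    """
    if unit == "char":
        src_len, tgt_len = len(source), len(target)
    elif unit == "word":
        src_len, tgt_len = len(source.split()), len(target.split())
    else:
        raise ValueError("unit must be 'char' or 'word'")
    if src_len == 0:
        raise ValueError("source must not be empty")
    return tgt_len / src_len


def within_tolerance(source: str, target: str,
                     tolerance: float = 0.1, unit: str = "char") -> bool:
    """Check whether the length ratio lies within 1 +/- tolerance.

    The 0.1 default is an assumed value for illustration only.
    """
    return abs(length_ratio(source, target, unit) - 1.0) <= tolerance
```

A check like this could be used to filter or rank candidate translations by length before synthesis; how the cited systems actually enforce the constraint is not specified in the excerpt above.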