2017
DOI: 10.1515/pralin-2017-0020

Unraveling the Contribution of Image Captioning and Neural Machine Translation for Multimodal Machine Translation

Abstract: Recent work on multimodal machine translation has attempted to address the problem of producing target language image descriptions based on both the source language description and the corresponding image. However, existing work has not been conclusive on the contribution of visual information. This paper presents an in-depth study of the problem by examining the differences and complementarities of two related but distinct approaches to this task: text-only neural machine translation and image captioning. We a…

Citations: cited by 6 publications (7 citation statements)
References: 10 publications
“…Similarly to Lala et al (2017), our oracle experiments on the validation data showed that rescoring of the decoded beam of width 100 has the potential of improvement of up to 3 METEOR points. In the oracle experiment, we always chose a sentence with the highest sentence-level BLEU score.…”
Section: Beam Rescoring (mentioning)
confidence: 70%
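The oracle selection described in the statement above (picking, from the decoded beam, the candidate with the highest sentence-level BLEU against the reference) can be sketched as follows. This is a minimal illustration, not the citing authors' code: the function names `sentence_bleu` and `oracle_pick`, the whitespace tokenization, the add-one smoothing, and the toy beam are all assumptions, since the quoted text does not specify the BLEU variant used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(hyp, ref, max_n=4):
    """Add-one smoothed sentence-level BLEU for one hypothesis and one reference.

    The smoothing scheme is an assumption; the source does not name one.
    """
    hyp_tokens, ref_tokens = hyp.split(), ref.split()
    if not hyp_tokens:
        return 0.0
    log_prec = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hyp_tokens, n)
        ref_counts = ngrams(ref_tokens, n)
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it appears in the reference.
        matches = sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
        total = sum(hyp_counts.values())
        log_prec += math.log((matches + 1) / (total + 1))
    # Brevity penalty for hypotheses shorter than the reference.
    bp = min(1.0, math.exp(1 - len(ref_tokens) / len(hyp_tokens)))
    return bp * math.exp(log_prec / max_n)


def oracle_pick(beam, reference):
    """Return the beam candidate with the highest sentence-level BLEU."""
    return max(beam, key=lambda h: sentence_bleu(h, reference))


# Hypothetical three-candidate beam; the quoted experiment used width 100.
beam = [
    "a dog runs on the grass",
    "a dog is running on the grass",
    "a cat sits",
]
reference = "a dog is running on the grass"
best = oracle_pick(beam, reference)
```

Because the oracle consults the reference, it gives an upper bound on what a rescorer could gain from the beam rather than a usable system.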
“…Focusing on MMT, Lala et al (2017) show that, given reliable image information in the form of captions, an ideal MMT system would be able to significantly benefit and obtain better translations. Vinyals et al (2016) and Karpathy et al (2016) present an analysis of lexical and syntactic properties of the generated captions.…”
Section: Studying Visual Representations (mentioning)
confidence: 99%
“…Focusing on MMT, Lala et al (2017) show that, given reliable image information in the form of captions, an ideal MMT system would be able to significantly benefit and obtain better translations.…”
Section: Background and Related Work (mentioning)
confidence: 99%
“…Multimodal content is gaining popularity in machine translation (MT) community due to its appealing chances to improve translation quality and its usage in commercial applications such as image caption translation for online news articles or machine translation for e-commerce product listings [1,2,3,4]. Although the general performance of neural machine translation (NMT) models is very good given large amounts of parallel texts, some inputs can remain genuinely ambiguous, especially if the input context is limited.…”
Section: Introduction (mentioning)
confidence: 99%