Proceedings of the Second Conference on Machine Translation 2017
DOI: 10.18653/v1/w17-4749
CUNI System for the WMT17 Multimodal Translation Task

Abstract: In this paper, we describe our submissions to the WMT17 Multimodal Translation Task. For Task 1 (multimodal translation), our best scoring system is a purely textual neural translation of the source image caption to the target language. The main feature of the system is the use of additional data that was acquired by selecting similar sentences from parallel corpora and by data synthesis with back-translation. For Task 2 (cross-lingual image captioning), our best submitted system generates an English caption w…
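The abstract's key technique for Task 1 is augmenting the training data with synthetic parallel sentences produced by back-translation. As a rough illustration only, here is a minimal sketch of that step in Python; the ReverseModel class and back_translate function are hypothetical placeholders, not code from the CUNI system.

# Hedged sketch of back-translation data synthesis: monolingual
# target-language sentences are translated back into the source language
# by a reverse (target->source) model, yielding synthetic sentence pairs.
# ReverseModel is a hypothetical stand-in for any trained NMT system.

class ReverseModel:
    """Placeholder for a trained target->source translation model."""

    def translate(self, sentence: str) -> str:
        raise NotImplementedError("plug in a real NMT system here")


def back_translate(target_monolingual: list[str],
                   reverse_model: ReverseModel) -> list[tuple[str, str]]:
    """Pair each real target sentence with a machine-translated source
    side, producing synthetic (source, target) training pairs."""
    return [(reverse_model.translate(tgt), tgt) for tgt in target_monolingual]

The synthetic pairs are then mixed with the genuine parallel data before training the forward source-to-target model.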

Cited by 16 publications (18 citation statements); references 16 publications.
“…Our proposed model, even though it is textual, produced competitive results with other multimodal models. The mixture-of-experts model outperformed several multimodal models, including another WMT submission [29]-[32]. Even in the out-of-domain dataset of COCO 2017, the mixture-of-experts model also performed reasonably well with a 28.0 BLEU score.…”
Section: Model Specification and Implementation Details
confidence: 90%
“…The results concentrate around constrained systems, which only allow the use of the parallel Multi30k corpus during training. A few studies experiment with using external resources (Calixto et al. 2017; Helcl and Libovický 2017; Elliott and Kádár 2017; Grönroos et al. 2018) for pretraining the MT system and then fine-tuning it on Multi30k, or directly training the system on the combination of Multi30k and the external resource. Two such unconstrained systems are also reported.…”
Section: Reranking and Retrieval Based Approaches
confidence: 99%
“…Lastly, if we take a look at the human evaluation rankings conducted throughout the WMT shared tasks, we see that the top three ranks for English→German and English→French are occupied by two unconstrained ensembles (Grönroos et al. 2018; Helcl et al. 2018b), the MLT Reranking and the DeepGRU (Delbrouck and Dupont 2018) systems in 2018. In 2017, the multiplicative interaction (Caglayan et al. 2017a), unimodal NMT reranking (Zhang et al. 2017), unconstrained Imagination (Elliott and Kádár 2017), encoder enrichment (Calixto and Liu 2017a) and hierarchical attention (Helcl and Libovický 2017) were ranked as top three, again for both language pairs.…”
Section: Comparison of Approaches
confidence: 99%
“…We compare our approach against two state-of-the-art multimodal translation systems: Caglayan et al. (2017) modulate the target language word embeddings by an element-wise multiplication with a learned transformation of the visual data; Helcl and Libovický (2017) use a double attention model that learns to selectively attend to a combination of the source language and the visual data. Table 4 shows the results of the translation experiment.…”
Section: Models
confidence: 99%
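The multiplicative interaction cited above (Caglayan et al. 2017) modulates target word embeddings element-wise with a learned transformation of the image features. The following is a minimal sketch, assuming PyTorch; the module name, dimensions, and the tanh nonlinearity are illustrative assumptions rather than details taken from the paper.

import torch
import torch.nn as nn


class MultiplicativeModulation(nn.Module):
    """Hypothetical sketch: modulate word embeddings with visual features."""

    def __init__(self, embed_dim: int, visual_dim: int):
        super().__init__()
        # Learned transformation of the visual data into the embedding space.
        self.visual_proj = nn.Linear(visual_dim, embed_dim)

    def forward(self, word_embeddings: torch.Tensor,
                visual_features: torch.Tensor) -> torch.Tensor:
        # word_embeddings: (batch, seq_len, embed_dim)
        # visual_features: (batch, visual_dim), e.g. pooled CNN features
        gate = torch.tanh(self.visual_proj(visual_features))  # (batch, embed_dim)
        # Broadcast over the sequence dimension; element-wise multiplication.
        return word_embeddings * gate.unsqueeze(1)

The double attention model of Helcl and Libovický (2017), by contrast, keeps the two modalities separate and lets the decoder attend to the source-text and image representations with two distinct attention mechanisms.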