Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop (NAACL 2019)
DOI: 10.18653/v1/n19-3012

Multimodal Machine Translation with Embedding Prediction

Abstract: Multimodal machine translation is an attractive application of neural machine translation (NMT). It helps computers to deeply understand visual objects and their relations with natural languages. However, multimodal NMT systems suffer from a shortage of available training data, resulting in poor performance for translating rare words. In NMT, pretrained word embeddings have been shown to improve NMT of low-resource domains, and a search-based approach has been proposed to address the rare word problem. In this study…
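As a rough illustration of the embedding-prediction idea, the sketch below replaces the usual softmax output layer with a head that predicts a continuous word vector and decodes by nearest-neighbour search over a frozen pretrained embedding table. This is a minimal PyTorch sketch under assumed shapes and names (EmbeddingPredictionHead and its methods are illustrative), not the paper's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingPredictionHead(nn.Module):
    """Decoder head that predicts a continuous word embedding instead of
    softmax logits. At inference, the predicted vector is matched against
    a frozen pretrained embedding table by cosine similarity."""

    def __init__(self, hidden_dim: int, pretrained: torch.Tensor):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, pretrained.size(1))
        # Frozen, unit-normalized pretrained target embeddings (vocab x emb_dim).
        self.register_buffer("emb", F.normalize(pretrained, dim=-1))

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden)  # (batch, emb_dim)

    def loss(self, hidden: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
        # Train by maximizing cosine similarity to the gold word's embedding.
        pred = F.normalize(self.forward(hidden), dim=-1)
        gold = self.emb[target_ids]
        return (1.0 - (pred * gold).sum(-1)).mean()

    @torch.no_grad()
    def decode(self, hidden: torch.Tensor) -> torch.Tensor:
        # Nearest-neighbour search over the pretrained table.
        pred = F.normalize(self.forward(hidden), dim=-1)
        return (pred @ self.emb.t()).argmax(-1)  # predicted word ids
```

Because the output space is the pretrained table rather than a trained softmax, rare words that appear in the pretrained embeddings remain reachable at decoding time even if they are scarce in the parallel data.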

Cited by 8 publications (8 citation statements)
References 12 publications
“…They also indicate that the improvement achieved using pretrained word embeddings decreases as the training data increases. Hirasawa et al. [15] introduced an MMT model with embedding prediction that provided a substantial performance improvement. However, many studies have shown that the vectors in a pretrained word embedding are distributed unevenly, in a narrow conical subspace, rather than uniformly across the entire space.…”
Section: A. Word Embedding
confidence: 99%
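The "narrow conical subspace" observation (embedding anisotropy) is easy to check empirically: the mean cosine similarity between randomly paired word vectors in a pretrained space is typically far above zero, whereas isotropic vectors would average near zero. A small NumPy sketch, assuming emb is a vocabulary-by-dimension embedding matrix; the function name is illustrative.

```python
import numpy as np

def mean_pairwise_cosine(emb: np.ndarray, n_pairs: int = 10_000, seed: int = 0) -> float:
    """Estimate anisotropy as the mean cosine similarity between
    randomly sampled pairs of word vectors."""
    rng = np.random.default_rng(seed)
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    i = rng.integers(0, len(unit), n_pairs)
    j = rng.integers(0, len(unit), n_pairs)
    return float((unit[i] * unit[j]).sum(axis=1).mean())
```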
“…Their approach achieves not only faster training convergence without decreasing translation performance, but also more accurate translation of rare words. This technique has been extended to an MMT model by [15], with a resulting improvement in translation quality.…”
Section: B. Monolingual Corpora-Augmented NMT
confidence: 99%
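In practice, the simplest way to bring pretrained vectors into an NMT model is to initialize its embedding layers from them, so that rare words covered by the pretrained table start from informative positions. A hypothetical PyTorch sketch, assuming a vocabulary list and a word-to-vector dictionary; it is not the cited systems' code.

```python
import torch
import torch.nn as nn

def init_embedding_from_pretrained(vocab: list[str],
                                   pretrained: dict[str, torch.Tensor],
                                   emb_dim: int,
                                   freeze: bool = False) -> nn.Embedding:
    """Build an nn.Embedding initialized from pretrained word vectors.
    Words missing from the pretrained table keep a random init, so
    covered rare words benefit while the rest train as usual."""
    weight = torch.randn(len(vocab), emb_dim) * 0.1
    for idx, word in enumerate(vocab):
        if word in pretrained:
            weight[idx] = pretrained[word]
    return nn.Embedding.from_pretrained(weight, freeze=freeze)
```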
“…Two categories of MMT have also been proposed from the perspective of cross-modal learning: approaches that explicitly transform visual features and textual embeddings from one modality to the other at both training and inference (Caglayan et al. 2017; Yin et al. 2020), and approaches that implicitly align the visual and textual modalities to generate vision-aware textual features during training. Unlike the explicit approaches, the implicit cross-modal learning methods do not require images as input at inference, treating the image features as latent variables shared across languages (Elliott and Kádár 2017; Calixto, Rios, and Aziz 2019; Hirasawa et al. 2019), which also serves as a latent scheme for unsupervised MMT (Lee et al. 2018). Despite the success of many models on Multi30K, an interesting finding is that the visual modality is not fully exploited and is only marginally beneficial to machine translation (Caglayan et al. 2017; Ive, Madhyastha, and Specia 2019).…”
Section: MMT Without Text Masking
confidence: 99%
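As a sketch of the implicit route, an "imagination"-style auxiliary objective trains the text encoder's sentence vector to predict the paired image feature with a max-margin ranking loss over in-batch negatives, so images are needed only at training time. This is an assumed PyTorch formulation in the spirit of Elliott and Kádár (2017), with illustrative names and dimensions, not any cited system's exact loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImaginationAuxLoss(nn.Module):
    """Implicit cross-modal learning: the text encoder's sentence vector
    is trained to predict the paired image feature. Images are used only
    during training; inference remains text-only."""

    def __init__(self, text_dim: int, image_dim: int):
        super().__init__()
        self.to_image = nn.Linear(text_dim, image_dim)

    def forward(self, sent_vec: torch.Tensor, image_feat: torch.Tensor,
                margin: float = 0.1) -> torch.Tensor:
        pred = F.normalize(self.to_image(sent_vec), dim=-1)
        img = F.normalize(image_feat, dim=-1)
        scores = pred @ img.t()            # (batch, batch) similarity grid
        pos = scores.diag().unsqueeze(1)   # matched text-image pairs
        # Max-margin ranking loss against in-batch negatives.
        cost = (margin - pos + scores).clamp(min=0)
        mask = ~torch.eye(scores.size(0), dtype=torch.bool, device=scores.device)
        return cost[mask].mean()
```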
“…A monolingual corpus can be collected relatively easily and is known to contribute to improved statistical machine translation [2]. Various attempts to exploit a monolingual corpus include pretraining of a translation model [12], initialization of distributed word representations [4, 11], and construction of a pseudo-parallel corpus by back-translation [14].…”
Section: Introduction
confidence: 99%
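The back-translation route fits in a few lines: a target-to-source model turns monolingual target-side text into synthetic source sentences, which are then paired with the real targets to train the forward model. A hypothetical Python sketch; backward_model.translate is an assumed interface, not a specific library's API.

```python
def build_pseudo_parallel(mono_target: list[str], backward_model) -> list[tuple[str, str]]:
    """Back-translation: synthesize source sentences from monolingual
    target text, then pair each synthetic source with its real target
    to augment the forward model's training data."""
    pseudo = []
    for tgt in mono_target:
        src = backward_model.translate(tgt)  # assumed target-to-source API
        pseudo.append((src, tgt))
    return pseudo
```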