Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
DOI: 10.18653/v1/2021.naacl-main.457
Generative Imagination Elevates Machine Translation

Abstract: There are common semantics shared across text and images. Given a sentence in a source language, does depicting the visual scene help translation into a target language? Existing multimodal neural machine translation (MNMT) methods require bilingual sentence-image triplets for training and source sentence-image tuples for inference. In this paper, we propose ImagiT, a novel machine translation method via visual imagination. ImagiT first learns to generate visual representation from the source sente…
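The pipeline the abstract describes can be sketched in a few lines: encode the source sentence, "imagine" a visual representation from the text features alone (so no paired image is needed at inference), and fuse both for the decoder. This is a minimal numpy sketch; all dimensions, weight matrices, and the mean-pooling fusion are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions -- illustrative only, not from the paper.
d_text, d_vis, src_len = 8, 6, 5

# 1. Encode the source sentence into token features
#    (stand-in for a Transformer encoder).
W_enc = rng.standard_normal((d_text, d_text))
src_tokens = rng.standard_normal((src_len, d_text))
text_feats = np.tanh(src_tokens @ W_enc)

# 2. "Imagine" a visual representation from the text features alone
#    (stand-in for the learned generator; no ground-truth image required).
W_gen = rng.standard_normal((d_text, d_vis))
imagined_visual = np.tanh(text_feats.mean(axis=0) @ W_gen)

# 3. Fuse pooled text features with the imagined visual vector
#    before decoding into the target language.
fused = np.concatenate([text_feats.mean(axis=0), imagined_visual])
print(fused.shape)  # (14,)
```

The key property this illustrates is the one the abstract emphasizes: step 2 consumes only text, which is what removes the requirement for sentence-image tuples at inference time.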

Cited by 19 publications (9 citation statements)
References 29 publications
“…Calixto et al. (2019) introduce a latent variable and estimate a joint distribution over translations and images. Long et al. (2020) predict the translation with visual representation generated by a generative adversarial network (GAN) (Goodfellow et al., 2014). The most closely related work to our method is UVR-NMT, which breaks the reliance on bilingual sentence-image pairs.…”
Section: Related Work
Confidence: 99%
“…While most of these studies acquire visual information through retrieval from the web or large-scale image sets, a recent line of studies attempts to generate visual supervision from scratch. The visual information can be provided either as representations (Collell et al., 2017; Long et al., 2021) or as concrete images (Gu et al., 2018; Zhu et al., 2021). Though previous studies generate machine imagination, they only tackle specific tasks, such as machine translation (Long et al., 2021) or information retrieval (Gu et al., 2018). To the best of our knowledge, we are the first to utilize machine abstract imagination from large pretrained vision and language models to improve general NLU tasks.…”
Section: Related Work
Confidence: 99%