Generative Imagination Elevates Machine Translation

Long, Quanyu; Wang, Mingxuan; Li, Lei

doi:10.18653/v1/2021.naacl-main.457

Cited by 19 publications

(9 citation statements)

References 29 publications


“…Calixto et al (2019) introduce a latent variable and estimate a joint distribution over translations and images. Long et al (2020) predict the translation with visual representation generated by a generative adversarial network (GAN) (Goodfellow et al, 2014). The most closely related work to our method is UVR-NMT, which breaks the reliance on bilingual sentence-image pairs.…”

Section: Related Work (mentioning)

confidence: 99%

Neural Machine Translation with Phrase-Level Universal Visual Representations

Qingkai, Feng

2022

Preprint


Multimodal machine translation (MMT) aims to improve neural machine translation (NMT) with additional visual information, but most existing MMT methods require paired input of source sentence and image, which makes them suffer from shortage of sentence-image pairs. In this paper, we propose a phrase-level retrieval-based method for MMT to get visual information for the source input from existing sentence-image data sets so that MMT can break the limitation of paired sentence-image input. Our method performs retrieval at the phrase level and hence learns visual information from pairs of source phrase and grounded region, which can mitigate data sparsity. Furthermore, our method employs the conditional variational auto-encoder to learn visual representations which can filter redundant visual information and only retain visual information related to the phrase. Experiments show that the proposed method significantly outperforms strong baselines on multiple MMT datasets, especially when the textual context is limited.


Section: Related Work (mentioning)

confidence: 99%

Imagination-Augmented Natural Language Understanding

Lu, Zhu, Wang, et al.

2022

Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Langua


Human brains integrate linguistic and perceptual information simultaneously to understand natural language, and hold the critical ability to render imaginations. Such abilities enable us to construct new abstract concepts or concrete objects, and are essential in involving practical knowledge to solve problems in low-resource scenarios. However, most existing methods for Natural Language Understanding (NLU) are mainly focused on textual signals. They do not simulate human visual imagination ability, which hinders models from inferring and learning efficiently from limited data samples. Therefore, we introduce an Imagination-Augmented Cross-modal Encoder (iACE) to solve natural language understanding tasks from a novel learning perspective: imagination-augmented cross-modal understanding. iACE enables visual imagination with external knowledge transferred from the powerful generative and pre-trained vision-and-language models. Extensive experiments on GLUE and SWAG (Zellers et al., 2018) show that iACE achieves consistent improvement over visually-supervised pre-trained models. More importantly, results in extreme and normal few-shot settings validate the effectiveness of iACE in low-resource natural language understanding circumstances.


“…While most of these studies acquire visual information through retrieval from the web or large-scale image sets, a recent line of studies attempt to generate visual supervision from scratch. The visual information can either be provided in the form of representation (Collell et al, 2017; Long et al, 2021) or concrete images (Gu et al, 2018; Zhu et al, 2021). Though previous studies generate machine imagination, they only tackle specific tasks, such as machine translation (Long et al, 2021) or information retrieval (Gu et al, 2018).…”

Section: Related Work (mentioning)

confidence: 99%

“…The visual information can either be provided in the form of representation (Collell et al, 2017; Long et al, 2021) or concrete images (Gu et al, 2018; Zhu et al, 2021). Though previous studies generate machine imagination, they only tackle specific tasks, such as machine translation (Long et al, 2021) or information retrieval (Gu et al, 2018). To the best of our knowledge, we are the first to utilize machine abstract imagination from large pretrained vision and language models to improve general NLU tasks.…”

Section: Related Work (mentioning)

confidence: 99%

Imagination-Augmented Natural Language Understanding

Lu, Zhu, Wang, et al.

2022

Preprint

