2021
DOI: 10.1016/j.ins.2020.11.024

Multi-modal neural machine translation with deep semantic interactions

Cited by 28 publications (13 citation statements)
References 17 publications
“…However, they mainly focus on textual tasks and cannot effectively handle multi-modal tasks such as image-text retrieval, image captioning, multimodal machine translation (Lin et al., 2020a; Su et al., 2021) and visual dialog (Murahari et al., 2020).…”
Section: Text Enhance Vision
confidence: 99%
“…Both types of features have been used in various vision-and-language tasks such as multimodal dialogue sentiment analysis (Firdaus et al., 2020), image captioning (Xu et al., 2015; Shi et al., 2021), and multimodal machine translation (Ive et al., 2019; Lin et al., 2020; Su et al., 2021).…”
Section: Image Features
confidence: 99%
“…This is done to learn bidirectional multi-modal translation simultaneously. Moreover, Su et al. (2021) showed that jointly learning text-image interactions, rather than modeling the two modalities separately with attentional networks, is more useful. This result is in line with several state-of-the-art visual Transformer models, such as VisualBERT (Li et al., 2019) and UNITER (Chen et al., 2019).…”
Section: Related Work
confidence: 99%