2019
DOI: 10.1109/tmm.2018.2869276

Multitask Learning for Cross-Domain Image Captioning

Citations: Cited by 120 publications (41 citation statements)
References: 34 publications
“…Recently, the use of pretrained models has been explored, with the advantage of reducing time and computational cost while preserving efficiency. These extracted features are passed along to other models, such as the language decoder in visual-space-based methods or to a shared model in the multimodal space, for image captioning training [120].…”
Section: Image Captioning (mentioning)
confidence: 99%
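The excerpt above describes the common setup in which a pretrained visual model supplies features that are reused to train a language decoder. The following is a minimal PyTorch sketch of that idea only; the CNNEncoder/LSTMDecoder names, the dimensions, and the choice of a frozen ResNet-50 are illustrative assumptions, not details taken from the cited work.

```python
# Minimal sketch: a frozen pretrained CNN supplies image features to an LSTM
# language decoder, illustrating "extract once, reuse for caption training".
import torch
import torch.nn as nn
import torchvision.models as models

class CNNEncoder(nn.Module):
    def __init__(self, feature_dim=256):
        super().__init__()
        resnet = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Drop the classification head; keep the pooled convolutional features.
        self.backbone = nn.Sequential(*list(resnet.children())[:-1])
        for p in self.backbone.parameters():
            p.requires_grad = False            # pretrained weights stay frozen
        self.project = nn.Linear(resnet.fc.in_features, feature_dim)

    def forward(self, images):                 # images: (B, 3, 224, 224)
        feats = self.backbone(images).flatten(1)
        return self.project(feats)             # (B, feature_dim)

class LSTMDecoder(nn.Module):
    def __init__(self, vocab_size, feature_dim=256, hidden_dim=512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, feature_dim)
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, image_feats, captions):  # captions: (B, T) token ids
        # Prepend the image feature as the first "token" of the sequence.
        tokens = self.embed(captions)                          # (B, T, D)
        inputs = torch.cat([image_feats.unsqueeze(1), tokens], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.out(hidden)                                # (B, T+1, vocab)
```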
“…Following the "where and what" analysis of what the model should concentrate on, adaptive attention used a hierarchical structure to fuse both high-level semantic information and visual information from an image to form an intuitive representation [120]. The top-down and bottom-up approaches are fused using semantic attention, which first defines attribute detectors that dynamically enable it to switch between concepts.…”
Section: (L) (mentioning)
confidence: 99%
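The adaptive-attention idea mentioned in this excerpt, deciding "where and what" to attend by mixing visual regions with the decoder's own language state, can be sketched as attention over image regions plus a language-only "sentinel" vector. The module below is a hedged illustration under those assumptions; the AdaptiveAttention name, dimensions, and sentinel formulation are for exposition and are not the cited papers' exact design.

```python
# Hedged sketch of one adaptive-attention step: the weight placed on the
# sentinel controls how much the decoder relies on its language state
# instead of the attended visual regions.
import torch
import torch.nn as nn

class AdaptiveAttention(nn.Module):
    def __init__(self, dim, hidden_dim, attn_dim=256):
        super().__init__()
        self.w_v = nn.Linear(dim, attn_dim)        # scores regions and sentinel
        self.w_h = nn.Linear(hidden_dim, attn_dim) # conditions on decoder state
        self.score = nn.Linear(attn_dim, 1)

    def forward(self, regions, sentinel, hidden):
        # regions: (B, R, dim), sentinel: (B, dim), hidden: (B, hidden_dim)
        cand = torch.cat([regions, sentinel.unsqueeze(1)], dim=1)   # (B, R+1, dim)
        logits = self.score(torch.tanh(self.w_v(cand) + self.w_h(hidden).unsqueeze(1)))
        alpha = torch.softmax(logits, dim=1)                        # (B, R+1, 1)
        # Weighted sum over visual regions plus the language-only sentinel.
        return (alpha * cand).sum(dim=1)                            # (B, dim)
```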
“…Min Yang et al. [5] proposed an algorithm known as "MLADIC" for cross-domain image captioning. They introduced a method to reduce the discrepancy between the two domains, i.e., the source and the target.…”
Section: Literature Survey (mentioning)
confidence: 99%
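This excerpt describes narrowing the gap between the source and target domains while learning to caption. Below is a generic multitask-style sketch that pairs a source-domain captioning loss with a crude feature-alignment penalty; the function names, the mean-feature gap term, and the weighting are illustrative assumptions, not the actual MLADIC objective of Yang et al.

```python
# Generic sketch of a "captioning + domain alignment" objective: a standard
# cross-entropy captioning loss on the labelled source domain, plus a simple
# penalty that pulls the mean source and target image features together.
import torch
import torch.nn.functional as F

def caption_loss(logits, targets, pad_id=0):
    # logits: (B, T, vocab), targets: (B, T) token ids
    return F.cross_entropy(logits.reshape(-1, logits.size(-1)),
                           targets.reshape(-1), ignore_index=pad_id)

def mean_feature_gap(src_feats, tgt_feats):
    # Crude distribution-alignment term: squared distance between domain means.
    return (src_feats.mean(dim=0) - tgt_feats.mean(dim=0)).pow(2).sum()

def multitask_loss(logits, targets, src_feats, tgt_feats, lam=0.1):
    # lam trades off caption quality on the source against domain alignment.
    return caption_loss(logits, targets) + lam * mean_feature_gap(src_feats, tgt_feats)
```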
“…For example, Karpathy and Fei-Fei (2015) learned the inter-modal correspondences between language and image data by using the training image-caption pairs. Attention mechanisms have been shown to significantly improve the performance of the underlying encoder-decoder based methods (Mun, Cho, and Han 2017; Gu et al. 2018; Yang et al. 2018; Zhao et al. 2018). Mun, Cho, and Han (2017) used associated captions retrieved from the training data to learn visual attention for image captioning.…”
Section: Related Work (mentioning)
confidence: 99%