2019
DOI: 10.1007/978-3-030-15719-7_12

Can Image Captioning Help Passage Retrieval in Multimodal Question Answering?

Cited by 5 publications
(3 citation statements)
References 12 publications
“…The application of artificial intelligence techniques to the cultural heritage field has attracted increasing attention in recent years [8,21,22,28,29,30,39]. Most of these work focus on automatic metadata annotation such as predicting the author, material, and date of an artwork.…”
Section: Introduction (mentioning)
confidence: 99%
“…While the extractor obtains salient features in the images, the decoder model which is similar in pattern to the language model utilizes a recurrent neural network to learn sequential information [121]. Most captioning tasks are undertaken in a supervised manner whereby the image features act as the input which are learned and mapped to a textual label [122]. The label captions are first transformed into a word vector and are combined with the feature vector to generate a new textual description.…”
Section: Image Captioning (mentioning)
confidence: 99%
“…Other approaches have been used to answer questions relying on captions, yet only regarding visual content [29]. The most similar approach to ours is instead [39], which used GPT-3 for VQA.…”
Section: Related Work (mentioning)
confidence: 99%