2022 6th International Conference on Intelligent Computing and Control Systems (ICICCS)
DOI: 10.1109/iciccs53718.2022.9788253
BERT based Multiple Parallel Co-attention Model for Visual Question Answering

Cited by 2 publications (4 citation statements, 2022-2023) · References 7 publications
“…VGGNet: LSTM+Q+I [1], AVWAN [2], SAN [6], Facts-VQA [70], DPP [75], QAM [76], Region-Sel [77], NMN [78]. ResNet: FDA [5], Bayesian [7], Dense-Sym [8], Code-Mix VQA [9], Hei-Co-atten [10], Rich-img-Region [27], MCB [29], MRN [30], FVTA [33], MUTAN [36], Meta-VQA [77], Rich-VQA [79], QTA [80], DCN [81]. GoogleNet: Neural Image QA [80], Multi-Modal QA [82], i-Bowing [83], Smem [84]. F-RCNN: Code-Mixed VQA [9], CAQT [11], QLOB [12], BAN [28], MFB [32], [85], explicit-know-Based [86], Know-Base Graph [87]. BERT: VilBERT [13], LXMERT [14], UNITER [15], Oscar [16], MPC [25], Semantic VLBERT [88]. Source: Own elaboration. The next step in the VQA model is to extract question features.…”
Section: Methods (mentioning, confidence: 99%)
“…Most of the recent work uses transformers for question featurization. [25,26] utilized the transformer BERT for language feature extraction. In [87], the authors concatenated the outputs of four consecutive BERT layers to generate hierarchical features from the question.…”
Section: Methods (mentioning, confidence: 99%)
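
The layer-concatenation idea in [87] is straightforward to sketch with the HuggingFace transformers library. The following is a minimal illustration, assuming bert-base-uncased and the last four encoder layers; the snippet does not say which four consecutive layers the authors actually chose.

```python
# Sketch: hierarchical question features by concatenating the hidden
# states of four consecutive BERT layers. Model name and layer choice
# are illustrative assumptions, not taken from [87].
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased", output_hidden_states=True)
model.eval()

question = "What color is the cat on the sofa?"
inputs = tokenizer(question, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# hidden_states is a tuple of 13 tensors (embedding layer + 12 encoder
# layers), each of shape (batch, seq_len, 768).
hidden_states = outputs.hidden_states

# Concatenate the last four layers along the feature dimension,
# giving (batch, seq_len, 4 * 768) hierarchical question features.
question_features = torch.cat(hidden_states[-4:], dim=-1)
print(question_features.shape)  # torch.Size([1, seq_len, 3072])
```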
“…In [26], Fukui et al. presented a novel joint representation technique called MCB for VQA, where the image and text representations are randomly projected to a higher-dimensional space and then combined efficiently using an element-wise product in Fast Fourier Transform (FFT) space. In [27], the authors used a transformer as the language model and proposed a VQA model with Multiple Parallel Co-attention. They showed that using a transformer as the language model improves question feature extraction.…”
Section: Related Work (mentioning, confidence: 99%)
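
MCB's projection-then-FFT trick can be sketched compactly. Below is a minimal, self-contained PyTorch sketch of compact bilinear pooling via count-sketch projections and an element-wise product in FFT space; the function names, feature dimensions, and output size are illustrative assumptions, not the authors' code.

```python
# Sketch of Multimodal Compact Bilinear pooling (MCB): count-sketch
# projection of each modality to a higher dimension, then element-wise
# product in FFT space (circular convolution in the original space).
import torch

def make_count_sketch(in_dim: int, out_dim: int, seed: int):
    """Fixed random hash and sign vectors for a count-sketch projection."""
    g = torch.Generator().manual_seed(seed)
    h = torch.randint(0, out_dim, (in_dim,), generator=g)            # bucket per input index
    s = torch.randint(0, 2, (in_dim,), generator=g).float() * 2 - 1  # random signs in {-1, +1}
    return h, s

def count_sketch(x, h, s, out_dim: int):
    """Project x of shape (batch, in_dim) to (batch, out_dim)."""
    sketch = torch.zeros(x.size(0), out_dim)
    sketch.index_add_(1, h, x * s)  # scatter-add signed inputs into hash buckets
    return sketch

def mcb(img, txt, out_dim: int = 16000):
    """Approximate the bilinear (outer-product) feature of two modalities."""
    # Fixed seeds keep the random projections consistent across calls.
    h1, s1 = make_count_sketch(img.size(1), out_dim, seed=1)
    h2, s2 = make_count_sketch(txt.size(1), out_dim, seed=2)
    fft_img = torch.fft.rfft(count_sketch(img, h1, s1, out_dim))
    fft_txt = torch.fft.rfft(count_sketch(txt, h2, s2, out_dim))
    # Element-wise product in frequency space sketches the outer
    # product of the two feature vectors without materializing it.
    return torch.fft.irfft(fft_img * fft_txt, n=out_dim)

# Toy usage with random features standing in for CNN / BERT outputs.
image_feat = torch.randn(1, 2048)
question_feat = torch.randn(1, 768)
joint = mcb(image_feat, question_feat)
print(joint.shape)  # torch.Size([1, 16000])
```

The output dimension of 16,000 follows the scale reported in the original MCB work; in practice the joint vector is typically followed by signed square-root and L2 normalization before the answer classifier.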