2022 IEEE 11th Global Conference on Consumer Electronics (GCCE) 2022
DOI: 10.1109/gcce56475.2022.10014385

A Multimodal Interpretable Visual Question Answering Model Introducing Image Caption Processor

Cited by 2 publications (3 citation statements)
References 5 publications

“…Although the previous natural language explanation generation model can provide an explanation, it lags behind humans in terms of details and rationality. We used our previous work [21] for the original explanation generation model in the figure.…”
Section: Figure
Citation type: mentioning
confidence: 99%
“…In our previous works [21,22], we introduced image captions and caption-based outside knowledge as novel modalities to improve model performance. However, the generated caption in Figure 1 highlights the limitations of our previous works.…”
Section: Figure
Citation type: mentioning
confidence: 99%