2019 IEEE Winter Conference on Applications of Computer Vision (WACV) 2019
DOI: 10.1109/wacv.2019.00030

Spatial Knowledge Distillation to Aid Visual Reasoning

Abstract: For tasks involving language and vision, the current state-of-the-art methods tend not to leverage any additional information that might be present to gather relevant (commonsense) knowledge. A representative task is Visual Question Answering where large diagnostic datasets have been proposed to test a system's capability of answering questions about images. The training data is often accompanied by annotations of individual object properties and spatial locations. In this work, we take a step towards integrat…

Cited by 14 publications (10 citation statements)
References 20 publications
“…Appropriateness. Authors in [1,30] used relational reasoning for image question answering. They model relationships as functions between objects and use these functions together to answer questions about an image.…”
Section: Discussion: Knowledge Integration In the Deep Learning Era (mentioning)
confidence: 99%
“…Several researchers employed commonsense knowledge to enrich high-level understanding tasks such as visual question answering, zero-shot object detection, relationship detection.…”
Figure 2: (a) Example of questions that require explicit external knowledge [35], (b) Example where knowledge helps [37]. (c) Ways to integrate background knowledge: i) Pre-process knowledge and augment input [1]; ii) Incorporate knowledge as embeddings [36]; iii) Post-processing using explicit reasoning mechanism [2]; iv) Using knowledge graph to influence NN architecture [24].
Section: High-level Common-sense Knowledge (mentioning)
confidence: 99%