2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
DOI: 10.1109/cvpr.2019.00681
Cycle-Consistency for Robust Visual Question Answering

Cited by 149 publications (159 citation statements)
References 30 publications
“…Keeping the visual input unchanged can allow natural language semantic understanding to be better studied. Recent works have done this by rephrasing queries (Shah et al., 2019). To some extent, this can be done automatically by merging/negating existing queries, replacing words with synonyms, and introducing distractors.…”
Section: Addressing Shortcomings
Confidence: 99%
“…While useful, these do not take the relationship between predictions into account, and thus do not capture problems like the ones in Figure 1. Exceptions exist when trying to gauge robustness: Ribeiro et al. (2018) consider the robustness of QA models to automatically generated input rephrasings, while Shah et al. (2019) evaluate VQA models on crowdsourced rephrasings for robustness. While important for evaluation, these efforts are orthogonal to our focus on consistency.…”
Section: Related Work
Confidence: 99%
“…Second, we collect human-annotated QA pairs based on common sense in addition to the logic-based QAs. The most relevant work to ours is Shah et al. (2019). However, they focus strictly on question paraphrases that maintain the same answers as the source question.…”
Section: Related Work
Confidence: 99%
“…Checking for consistency can be considered as an interrogative Turing Test (Radziwill and Benton, 2017) for linguistic robustness (Stede, 1992). Works such as Xu et al. (2018) explore the robustness of VQA with respect to image variations, whereas works such as Ray et al. (2016) and Mahendru et al. (2017) focus on understanding the premise of a question instead of relying on dataset biases (Agrawal et al., 2017; Goyal et al., 2017) or linguistic biases (Ramakrishnan et al., 2018). The most relevant work to ours is Shah et al. (2019). However, they focus strictly on question paraphrases that maintain the same answers as the source question.…”
Section: Related Work
Confidence: 99%