Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, 2021
DOI: 10.18653/v1/2021.acl-short.10
Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions

Abstract: Deep learning algorithms have shown promising results in visual question answering (VQA) tasks, but a more careful look reveals that they often do not understand the rich signal they are being fed with. To understand and better measure the generalization capabilities of VQA systems, we look at their robustness to counterfactually augmented data. Our proposed augmentations are designed to make a focused intervention on a specific property of the question such that the answer changes. Using these augmentations, …
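The abstract describes augmentations that intervene on a single property of the question so that the gold answer flips, and a robustness measure built on them. As a rough illustration only (not the paper's actual pipeline: the pair format, the `model` callable, and the conditional-accuracy definition of the score are all assumptions), such a robustness-to-augmented-data score might look like this:

```python
# Hypothetical sketch: scoring a VQA model's robustness to focused,
# answer-changing augmentations. The data format and the conditional
# accuracy definition are assumptions, not the paper's exact protocol.

def rad_score(model, pairs):
    """pairs: list of (image, question, answer, aug_question, aug_answer),
    where aug_question differs from question by one focused intervention
    (e.g., a swapped attribute) that changes the gold answer."""
    correct_orig = 0
    correct_both = 0
    for image, q, a, aug_q, aug_a in pairs:
        if model(image, q) == a:              # model gets the original right...
            correct_orig += 1
            if model(image, aug_q) == aug_a:  # ...and the intervened variant too
                correct_both += 1
    # Fraction of originally-correct cases that survive the intervention.
    return correct_both / correct_orig if correct_orig else 0.0
```

Under this reading, a score near 1 means the model's correct answers are stable under the focused interventions, while a low score suggests it relied on surface cues that the intervention removed.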

Cited by 7 publications (6 citation statements)
References 26 publications
“…[Gan and Ng 2019] further point out that the main deficiency of data augmentation-based methods is that the adversarial examples they create are unnatural and not expected to occur in the real world. [Rosenberg et al. 2021] draw a similar conclusion: data augmentation-based methods cannot effectively address the robustness issues.…”
Section: Related Work
confidence: 80%
“…Such a controlled setting is similar to the randomized experiment described in § 2, where it is possible to compute the difference between an actual text and what the text would have been had a specific concept not existed in it. Indeed, in cases where counterfactual texts can be generated, we can often estimate causal effects on text-based models (Ribeiro et al., 2020; Gardner et al., 2020; Rosenberg et al., 2021; Ross et al., 2021; Meng et al., 2022; Zhang et al., 2022). However, generating such counterfactuals is challenging (see § 4.1.1).…”
Section: Causal Model Interpretations
confidence: 99%
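The quoted passage describes estimating a causal effect by contrasting a model's output on a text with its output on a counterfactual version lacking the concept. As a minimal sketch under assumptions (the scalar-valued `model` scoring function and the paired-data format are hypothetical), the estimate reduces to a mean paired difference:

```python
# Hypothetical sketch: estimating the causal effect of a concept on a
# text-based model as the mean difference between the model's score on
# each original text and on its counterfactual (concept-removed) version.

def estimated_effect(model, paired_texts):
    """paired_texts: list of (text, counterfactual_text) pairs."""
    diffs = [model(text) - model(cf_text) for text, cf_text in paired_texts]
    return sum(diffs) / len(diffs)
```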
“…However, it is non-trivial to remove biases from this type of data. Indeed, it has been reported that the question alone is sufficient to predict the correct answer [18,19], i.e., no image information is required. In attempts to fix this bias, the dataset was re-annotated [20] and the train and test splits were reorganized [21].…”
Section: Related Work
confidence: 99%
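The claim that the question alone suffices to predict the answer is usually demonstrated with a "blind" baseline that never sees the image. A minimal sketch of such a baseline, assuming a simple most-frequent-answer-per-question-prefix heuristic (the data format and `prefix_len` choice are illustrative, not from any cited paper):

```python
from collections import Counter, defaultdict

# Hypothetical sketch of a blind baseline that ignores the image entirely:
# predict the most frequent training answer for each question prefix.
# High accuracy from such a model signals language-prior bias in the data.

def train_blind_baseline(train_qa, prefix_len=3):
    by_prefix = defaultdict(Counter)
    for question, answer in train_qa:
        prefix = " ".join(question.lower().split()[:prefix_len])
        by_prefix[prefix][answer] += 1
    return {p: counts.most_common(1)[0][0] for p, counts in by_prefix.items()}

def predict(baseline, question, prefix_len=3, default="yes"):
    prefix = " ".join(question.lower().split()[:prefix_len])
    return baseline.get(prefix, default)
```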