2022
DOI: 10.48550/arxiv.2203.07613
Preprint

CARETS: A Consistency And Robustness Evaluative Test Suite for VQA

Abstract: We introduce CARETS, a systematic test suite to measure consistency and robustness of modern VQA models through a series of six fine-grained capability tests. In contrast to existing VQA test sets, CARETS features balanced question generation to create pairs of instances to test models, with each pair focusing on a specific capability such as rephrasing, logical symmetry or image obfuscation. We evaluate six modern VQA systems on CARETS and identify several actionable weaknesses in model comprehension, especia…

Citations: cited by 1 publication (1 citation statement)
References: 17 publications (34 reference statements)
“…VALSE (Parcalabescu et al, 2021) is proposed to test VLP models centered on linguistic phenomena. CARET (Jimenez et al, 2022) is proposed to systematically measure consistency and robustness of modern VQA models through six fine-grained capability tests.…”
Section: Robustness and Probing Analysis
Citation type: mentioning
confidence: 99%