CARETS: A Consistency And Robustness Evaluative Test Suite for VQA

Jiménez, Carlos; Russakovsky, Olga; Narasimhan, Karthik

doi:10.48550/arxiv.2203.07613

Cited by 1 publication (1 citation statement)

References 17 publications (34 reference statements)

“…VALSE (Parcalabescu et al, 2021) is proposed to test VLP models centered on linguistic phenomena. CARET (Jimenez et al, 2022) is proposed to systematically measure consistency and robustness of modern VQA models through six fine-grained capability tests.…”

Section: Robustness and Probing Analysis (mentioning)

confidence: 99%

Vision-Language Pre-training: Basics, Recent Advances, and Future Trends

Gan, Fu, Li, et al. 2022

Preprint

This paper surveys vision-language pre-training (VLP) methods for multimodal intelligence that have been developed in the last few years. We group these approaches into three categories: (i) VLP for image-text tasks, such as image captioning, image-text retrieval, visual question answering, and visual grounding; (ii) VLP for core computer vision tasks, such as (open-set) image classification, object detection, and segmentation; and (iii) VLP for video-text tasks, such as video captioning, video-text retrieval, and video question answering. For each category, we present a comprehensive review of state-of-the-art methods, and discuss the progress that has been made and the challenges still being faced, using specific systems and models as case studies. In addition, for each category, we discuss advanced topics being actively explored in the research community, such as big foundation models, unified modeling, in-context few-shot learning, knowledge, robustness, and computer vision in the wild, to name a few.

Author contributions: Zhe Gan and Jianfeng Gao initiated the project. Zhe Gan and Linjie Li took the lead in writing Chapter 1. Linjie Li and Jianfeng Gao took the lead in writing Chapter 2. Zhe Gan further took the lead in writing Chapters 3 and 7. Chunyuan Li took the lead in writing Chapter 4. Linjie Li further took the lead in writing Chapter 5. Lijuan Wang and Zicheng Liu took the lead in writing Chapter 6. All the authors provided project advice and contributed to paper editing and proofreading.
