Proceedings of the Fourth BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP 2021
DOI: 10.18653/v1/2021.blackboxnlp-1.27

Investigating Negation in Pre-trained Vision-and-language Models

Abstract: Pre-trained vision-and-language models have achieved impressive results on a variety of tasks, including ones that require complex reasoning beyond object recognition. However, little is known about how they achieve these results or what their limitations are. In this paper, we focus on a particular linguistic capability, namely the understanding of negation. We borrow techniques from the analysis of language models to investigate the ability of pre-trained vision-and-language models to handle negation. We find…
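As a rough illustration of the kind of probe the abstract describes (not the paper's actual setup or model), the sketch below uses the Hugging Face CLIP model to compare image-text matching scores for a caption and its negated counterpart; the image path and captions are hypothetical.

```python
# Illustrative sketch only: compare image-text matching scores for a caption
# and its negated counterpart with CLIP (the paper itself probes other
# pre-trained vision-and-language models, not CLIP).
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("dog_on_sofa.jpg")  # hypothetical example image
captions = [
    "A dog is sitting on the sofa.",
    "A dog is not sitting on the sofa.",  # negated version of the same caption
]

inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the similarity of the image to each caption; if the
# model handled negation, the negated caption should score clearly lower.
scores = outputs.logits_per_image.softmax(dim=-1).squeeze()
for caption, score in zip(captions, scores.tolist()):
    print(f"{score:.3f}  {caption}")
```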

Cited by 5 publications (6 citation statements)
References 17 publications
“…A series of studies (van Miltenburg et al., 2016, 2017) investigated negation in Flickr30k image descriptions using a smaller set of negation words compared to our study, comparing the use of negation in English, German, and Dutch, and finding no significant differences. Dobreva and Keller (2021) show that the performance of vision-and-language models decreases when the text contains negation, but did not show that this decrease is caused by negation-related visual features. Text-only models also have difficulty processing negations (e.g., Ettinger (2020)), and the drop in performance could be due to the text encoder alone.…”
Section: Computational Studies (mentioning)
confidence: 79%
“…Negation words (Neg). We use the list of English negation words composed by Dobreva and Keller (2021), and add the word nope. We translate all words in the English list into the other languages, and verify the resulting lists with a native speaker.…”
Section: We Use Microsoft's (mentioning)
confidence: 99%
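The statement above describes how a negation-word list is applied to captions. Below is a minimal sketch of caption-level negation detection, assuming an illustrative subset of words; the full English list of Dobreva and Keller (2021) and its translations are not reproduced here.

```python
# Minimal sketch of negation detection in captions. The word list is a
# hypothetical subset, not the actual Dobreva and Keller (2021) list.
import re

NEGATION_WORDS = {
    "no", "not", "none", "never", "nothing", "nobody", "neither",
    "nowhere", "without", "nope",  # illustrative subset only
}

def contains_negation(caption: str) -> bool:
    """Return True if any token is a negation word or an n't clitic."""
    tokens = re.findall(r"[a-z']+", caption.lower())
    return any(tok in NEGATION_WORDS or tok.endswith("n't") for tok in tokens)

print(contains_negation("A man without a hat is smiling."))   # True
print(contains_negation("Two dogs are playing in the park.")) # False
```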
“…Human-level Intelligence. Numerous studies have been conducted to examine the linguistic capabilities of pre-trained transformer-based language models, exploring their resemblance to human abilities in various aspects such as syntactic knowledge (Linzen et al., 2016; Gulordava et al., 2018), semantic knowledge (Ettinger, 2020; Kementchedjhieva et al., 2021; Misra et al., 2020), and the integration of semantic and syntactic information (Xu and Chen, 2022). However, there is still limited knowledge about the linguistic capabilities of VLP models in relation to human behavior (Dobreva and Keller, 2021; Cao et al., 2020a). Our study seeks to expand the current understanding by investigating whether the impact of tags with various features on the VQA task is comparable to the effect of distractors in the PWI task.…”
Section: Related Work (mentioning)
confidence: 99%
“…Numerous studies exploring human-like behavior in models primarily focus on language processing alone, revealing that, even though these models may encounter difficulties in certain specialized areas of language, they can attain significant human-like capabilities across diverse linguistic domains (Ettinger, 2020; Rogers et al., 2021). In this study, we expand the scope of the investigation to the field of vision and language, an area that, to the best of our knowledge, has been relatively unexplored from the perspective of the language community (Dobreva and Keller, 2021; Cao et al., 2020a).…”
Section: Introduction (mentioning)
confidence: 99%
“…Leveraging a FOIL-like paradigm, subsequent studies revealed that Transformer-based models struggle when dealing with verb arguments (SVO-Probes; Hendricks and Nematzadeh, 2021), negation (Dobreva and Keller, 2021), numbers, spatial relations (VALSE), and expressions requiring compositional abilities (WinoGround; Thrush et al., 2022; Diwan et al., 2022) or embedding semantically underspecified language (Pezzelle, 2023). Crucially, all this previous work focused on phenomena and tasks that require more than a basic language understanding to be properly mastered.…”
Section: Language Abilities Of Pre-trained Multimodal Models (mentioning)
confidence: 99%
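To make the FOIL-like paradigm mentioned above concrete, here is a minimal sketch, with made-up word pairs, of turning a correct caption into a single-word foil; benchmarks such as SVO-Probes or VALSE construct such pairs systematically for the phenomenon under test.

```python
# Sketch of a FOIL-style perturbation: replace exactly one word of a correct
# caption so that it no longer matches the image. Word pairs are illustrative.
def make_foil(caption: str, target: str, replacement: str) -> str:
    """Swap a single target word for a mismatching one, keeping the rest intact."""
    words = caption.split()
    return " ".join(replacement if w == target else w for w in words)

original = "A cat is sleeping on the chair"
foil = make_foil(original, target="cat", replacement="dog")
print(original)  # matches the image
print(foil)      # minimally different, but no longer matches the image
```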