Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) 2020
DOI: 10.18653/v1/2020.emnlp-main.732
An Analysis of Natural Language Inference Benchmarks through the Lens of Negation

Abstract: Negation is underrepresented in existing natural language inference benchmarks. Additionally, one can often ignore the few negations in existing benchmarks and still make the right inference judgments. In this paper, we present a new benchmark for natural language inference in which negation plays an important role. We also show that state-of-the-art transformers struggle to make inference judgments on the new pairs.
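The abstract's core idea, NLI pairs where the negation cannot be ignored without flipping the gold label, can be illustrated with a minimal sketch. The premise/hypothesis pairs below are hypothetical examples constructed for illustration, not items from the actual benchmark:

```python
# Hypothetical NLI pairs: the negation ("not") flips the gold label,
# so any strategy that ignores it must err on the negated pair.
pairs = [
    # (premise, hypothesis, gold_label)
    ("The committee approved the proposal.",
     "The proposal was approved.", "entailment"),
    ("The committee did not approve the proposal.",
     "The proposal was approved.", "contradiction"),
]

def label_if_negation_ignored(premise: str, hypothesis: str) -> str:
    """Simulate a model that effectively drops negation cues before
    judging: with 'not'/"n't" removed, both premises above entail the
    hypothesis, so the simulated judgment is always 'entailment'."""
    stripped = premise.replace(" not", "").replace("n't", "")
    return "entailment"

errors = sum(
    1 for premise, hypothesis, gold in pairs
    if label_if_negation_ignored(premise, hypothesis) != gold
)
print(errors)  # the negation-ignoring strategy fails on the negated pair
```

On benchmarks where negation is rare or inert, such a strategy loses little accuracy; on pairs like the second one it is guaranteed to fail, which is what makes negation-sensitive pairs diagnostic.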

Cited by 27 publications (23 citation statements) · References 37 publications
“…The results of "asking" entailment models whether the text is a suggestion or not are shown in Table 4. The difference in performance of all models on test sets of Subtasks A and B can be explained by the differences in target classes distributions; we also note that Transformer-based models often struggle with negations [9]. Similar to [35], using definitions seemingly does not guarantee better results for entailment-based zero-shot text (suggestion) classification.…”
Section: Results (mentioning)
confidence: 92%
“…Further, we have selected a list of candidates from those hyponyms to be later mapped to suggestions: direction.n.06, guidance.n.01, offer.n.02, promotion.n.01, proposal.n.01, reminder.n.01, request.n.01, submission.n.01. We have formulated the labels as "This text is a [LEMMA]", where [LEMMA] is the first lemma in the WordNet lemma synset list 9 . We have used the development set of the SemEval2019 Task 9, Subtask A to find the best subset of candidate labels in terms of F1-measure of the "suggestion" class.…”
Section: Approach (mentioning)
confidence: 99%
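The label-formulation step quoted above ("This text is a [LEMMA]" built from the first lemma of each WordNet synset) can be sketched as follows. To keep the sketch self-contained, the synset names are hard-coded from the quoted list rather than queried through NLTK's WordNet interface, and the first lemma is assumed to equal the synset's head word:

```python
# Candidate WordNet synsets listed in the quoted passage.
candidates = [
    "direction.n.06", "guidance.n.01", "offer.n.02", "promotion.n.01",
    "proposal.n.01", "reminder.n.01", "request.n.01", "submission.n.01",
]

def make_label(synset_name: str) -> str:
    """Build the entailment hypothesis 'This text is a [LEMMA]' from a
    synset name of the form 'lemma.pos.nn' (underscores become spaces)."""
    lemma = synset_name.split(".")[0].replace("_", " ")
    return f"This text is a {lemma}"

labels = [make_label(s) for s in candidates]
print(labels[4])  # "This text is a proposal"
```

The candidate labels would then be scored against the development set, keeping the subset that maximizes F1 on the "suggestion" class, as the quote describes.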
“…Pretrained language models fail to make correct predictions in the presence of negation or even to distinguish between positive and negative sentences (Ettinger, 2020; Kassner and Schütze, 2020). Hossain et al. (2020) finetune several different pre-trained language models for natural language inference and show that performance on instances containing negation deteriorates.…”
Section: Negation (mentioning)
confidence: 99%
“…As a preliminary step in the data augmentation process, our in-domain experts rewrote existing responses to improve the balance of the corpus. We used a strategy similar to the one used in Hossain et al. (2020). We ran statistical and machine learning experiments to ensure that the additional examples do not introduce biases.…”
Section: Data Augmentation (mentioning)
confidence: 99%