Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 2021
DOI: 10.18653/v1/2021.acl-short.109

Embracing Ambiguity: Shifting the Training Target of NLI Models

Abstract: Natural Language Inference (NLI) datasets contain examples with highly ambiguous labels. While many research works do not pay much attention to this fact, several recent efforts have been made to acknowledge and embrace the existence of ambiguity, such as UNLI and ChaosNLI. In this paper, we explore the option of training directly on the estimated label distribution of the annotators in the NLI task, using a learning loss based on this ambiguity distribution instead of the gold labels. We prepare AmbiNLI, a tri…
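As a concrete illustration of training on an annotator label distribution rather than on one-hot gold labels, the sketch below fine-tunes a 3-way NLI classifier with a KL-divergence loss against soft targets. This is a minimal sketch under assumptions: the backbone (`bert-base-uncased`), the example premise/hypothesis pair, and the target distribution are illustrative rather than taken from the paper, and the paper's exact loss may differ.

```python
# Illustrative sketch (not the authors' code): fine-tuning an NLI classifier
# against an estimated annotator label distribution instead of one-hot gold labels.
import torch
import torch.nn.functional as F
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "bert-base-uncased"  # assumed backbone; the paper's choice may differ
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=3)

def soft_label_loss(logits, target_dist):
    """KL divergence between the predicted and annotator label distributions
    (one plausible 'ambiguity distribution' loss; the paper's loss may differ)."""
    log_probs = F.log_softmax(logits, dim=-1)
    return F.kl_div(log_probs, target_dist, reduction="batchmean")

# One training step on a single (premise, hypothesis) pair with soft targets.
premise, hypothesis = "A man is sleeping.", "A person is resting."
target = torch.tensor([[0.6, 0.3, 0.1]])  # illustrative entailment/neutral/contradiction votes
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
loss = soft_label_loss(model(**inputs).logits, target)
loss.backward()
```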

Cited by 8 publications (10 citation statements)
References 8 publications
“…Distributional models Distributional models aim to predict the distribution of annotator judgments. We use two models from prior work: 1) one trained on AmbiNLI (Meissner et al., 2021), with examples with multiple annotations from SNLI (Bowman et al., 2015) and MNLI, and 2) […]ing distributional labels into discrete ones with a threshold of 0.2. In addition, we train a multilabel model on WANLI's train set (which has two annotations per example), as well as a classifier over sets which performs 7-way classification over the power set of NLI labels, minus the empty set.…”
Section: Regression Models (mentioning)
confidence: 99%
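For concreteness, the sketch below shows one way the setup described in this snippet could be implemented: thresholding a label distribution into a discrete label set at 0.2 and mapping that set onto the 7 non-empty subsets of the NLI labels. The helper names and the example distribution are illustrative assumptions, not the cited authors' code.

```python
# Illustrative sketch of the described setup (not the cited authors' code):
# threshold a label distribution into a discrete label set, then index that set
# into the 7 non-empty subsets of {entailment, neutral, contradiction}.
from itertools import combinations

LABELS = ["entailment", "neutral", "contradiction"]
# All non-empty subsets in a fixed order -> 7 classes for the classifier over sets.
POWER_SET = [frozenset(c) for r in range(1, 4) for c in combinations(LABELS, r)]

def to_label_set(distribution, threshold=0.2):
    """Keep every label whose annotator probability reaches the threshold."""
    return frozenset(l for l, p in zip(LABELS, distribution) if p >= threshold)

def to_class_index(distribution, threshold=0.2):
    """Map a label distribution to one of the 7 power-set classes."""
    return POWER_SET.index(to_label_set(distribution, threshold))

print(to_label_set([0.55, 0.35, 0.10]))    # {'entailment', 'neutral'}
print(to_class_index([0.55, 0.35, 0.10]))  # index of that two-label class
```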
“…The AmbiNLI model (Meissner et al., 2021) is first pretrained on single-label data from SNLI + MNLI for 3 epochs, then further finetuned on AmbiNLI for 2 epochs. AmbiNLI examples have distributional outputs and are sourced from the development sets of SNLI and MNLI (which contain 5 labels each) and the train set of UNLI (whose labels are heuristically mapped to soft labels).…”
Section: D2 Training Details (mentioning)
confidence: 99%
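The two-stage schedule described above could look roughly like the following sketch: cross-entropy training on single-label SNLI + MNLI for 3 epochs, followed by KL-divergence fine-tuning on AmbiNLI's label distributions for 2 epochs. The `hard_label_loader` and `soft_label_loader` arguments are hypothetical placeholders for tokenized data loaders; this is not the authors' implementation.

```python
# Rough sketch of a two-stage schedule: hard labels first, soft labels second.
# `hard_label_loader` and `soft_label_loader` are hypothetical batch iterators.
import torch.nn.functional as F

def train_two_stage(model, optimizer, hard_label_loader, soft_label_loader):
    # Stage 1: 3 epochs on single-label SNLI + MNLI with standard cross-entropy.
    for _ in range(3):
        for batch, gold in hard_label_loader:
            loss = F.cross_entropy(model(**batch).logits, gold)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    # Stage 2: 2 epochs on AmbiNLI, whose targets are label distributions.
    for _ in range(2):
        for batch, dist in soft_label_loader:
            log_probs = F.log_softmax(model(**batch).logits, dim=-1)
            loss = F.kl_div(log_probs, dist, reduction="batchmean")
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```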
“…However, we may not simply attribute such disagreement to poor annotation quality, since there is inherent ambiguity in the annotations of natural language inference tasks, as reported by Nie et al. (2020). We can still make a reasonable probabilistic estimate of the status by embracing the ambiguity and learning directly from the annotation distribution (Meissner et al., 2021). Therefore, we change the learning target from binary labels to the proportion of annotators who label the status as uncertain, and the possible values are thus 0, 1/3, 2/3 and 1.…”
Section: Symptom Status Inference (mentioning)
confidence: 99%
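A minimal sketch of the target change this snippet describes, assuming three annotators per example: the model learns the fraction of annotators who marked the symptom status as uncertain, yielding targets in {0, 1/3, 2/3, 1}. The function name and example labels are illustrative.

```python
# Illustrative sketch (not the cited authors' code): turn per-annotator status
# labels into a soft regression target in {0, 1/3, 2/3, 1}.
def uncertainty_target(annotations):
    """annotations: per-annotator status labels, e.g. ['uncertain', 'negative', 'uncertain']."""
    return sum(a == "uncertain" for a in annotations) / len(annotations)

print(uncertainty_target(["uncertain", "negative", "uncertain"]))  # 2/3 ≈ 0.667
```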
“…However, the tasks concerning conditions under multiple plausible scenarios are few, and their domains are limited to, for example, factual information that differs according to place and time (Zhang and Choi, 2021) or human behaviors that are either normative or divergent (Emelin et al., 2021). Another example is the natural language inference or commonsense reasoning task that considers variations in human opinions (Zhang et al., 2017; Chen et al., 2020b), which allows for differences in annotations due to one's mentality (Pavlick and Kwiatkowski, 2019; Meissner et al., 2021). Our aim here is to interrogate these types of situated reasoning in more comprehensive settings, such as in story texts.…”
Section: Original Ending (mentioning)
confidence: 99%