2021 · Preprint
DOI: 10.48550/arxiv.2104.08676
Distributed NLI: Learning to Predict Human Opinion Distributions for Language Reasoning

Abstract: We introduce distributed NLI, a new NLU task whose goal is to predict the distribution of human judgements for natural language inference. We show that models can capture the human judgement distribution by applying additional distribution estimation methods, namely Monte Carlo (MC) Dropout, Deep Ensemble, Re-Calibration, and Distribution Distillation. All four of these methods substantially outperform the softmax baseline. We show that MC Dropout is able to achieve decent performance without any distribution annot…
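The abstract names several methods that turn a classifier into an estimator of the full label distribution. As a minimal sketch of one of them, MC Dropout, the toy network below (random stand-in weights, not the authors' model) keeps dropout active at inference time and averages many stochastic forward passes into an estimated distribution over the three NLI labels:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer network with random stand-in weights (not a trained NLI model).
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 3))  # 3 NLI labels: entailment / neutral / contradiction

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def forward(x, drop_p=0.5):
    """One stochastic forward pass: dropout stays ON at inference time."""
    h = np.maximum(x @ W1, 0.0)
    mask = rng.random(h.shape) > drop_p   # sample a fresh dropout mask
    h = h * mask / (1.0 - drop_p)         # inverted-dropout scaling
    return softmax(h @ W2)

x = rng.normal(size=4)   # stand-in input encoding of a premise/hypothesis pair
T = 200                  # number of MC samples
mc_mean = np.mean([forward(x) for _ in range(T)], axis=0)

print(mc_mean)        # estimated distribution over the 3 labels
print(mc_mean.sum())  # sums to 1, since each pass returns a softmax
```

The averaged output is itself a valid probability distribution, which is what distributed NLI asks the model to predict.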

Cited by 1 publication (2 citation statements)
References 32 publications
“…However, recent trends in NLP have begun questioning aggregation, arguing that subjective labels should not be aggregated if multiple opinions are valid. Rather, this line of work ([38, 58]) suggests predicting the distribution of human opinions rather than the majority vote. One implication is that individual annotator performance becomes more important, since one cannot aggregate away labeling error with a simple majority vote.…”
Section: Related Work
confidence: 99%
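The contrast this citing work draws between majority-vote aggregation and distribution prediction can be made concrete with a small example (the labels below are hypothetical annotations, not data from the paper):

```python
from collections import Counter

# Hypothetical labels from five annotators for one NLI item.
labels = ["entailment", "neutral", "neutral", "contradiction", "neutral"]

counts = Counter(labels)
majority = counts.most_common(1)[0][0]
distribution = {lab: n / len(labels) for lab, n in counts.items()}

print(majority)      # "neutral" — aggregation discards the minority views
print(distribution)  # {'entailment': 0.2, 'neutral': 0.6, 'contradiction': 0.2}
```

The majority label throws away the 40% of annotators who disagreed; the distribution keeps that disagreement, which is exactly the target distributed NLI asks models to predict.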
“…We evaluate the performance of workers against the ground-truth labels (§3.3). Majority labels are often computed to mitigate labeling error [52], but recent work has also shown the utility of high-quality individual annotations for estimating the distributions of human opinion [58]. The latter is particularly relevant in our setting, where workers often label subjective concerns: measuring degrees of concern across individuals is relevant to reducing vaccine hesitancy.…”
Section: Performance Comparison
confidence: 99%
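Once the target is a distribution rather than a single label, evaluation also changes: a predicted distribution is scored against the human distribution with a divergence measure rather than accuracy. A minimal sketch using KL divergence (the distributions below are hypothetical, not results from the paper):

```python
import numpy as np

def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) with smoothing to avoid log(0)."""
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p, q = p / p.sum(), q / q.sum()
    return float(np.sum(p * np.log(p / q)))

human   = [0.2, 0.6, 0.2]    # hypothetical annotator distribution
model   = [0.25, 0.55, 0.2]  # hypothetical distribution-aware prediction
one_hot = [0.0, 1.0, 0.0]    # a majority-vote-style point prediction

print(kl(human, model))    # small: the soft prediction tracks the disagreement
print(kl(human, one_hot))  # much larger: the point prediction ignores it
```

Under this kind of metric, a confident one-hot prediction is penalized heavily on items where annotators genuinely disagree.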