2009
DOI: 10.1007/s11222-009-9125-z

Semi-parametric analysis of multi-rater data

Abstract: Datasets that are subjectively labeled by a number of experts are becoming more common in tasks such as biological text annotation where class definitions are necessarily somewhat subjective. Standard classification and regression models are not suited to multiple labels and typically a preprocessing step (normally assigning the majority class) is performed. We propose Bayesian models for classification and ordinal regression that naturally incorporate multiple expert opinions in defining predictive distributi…

Cited by 12 publications
(14 citation statements)
References 14 publications
“…Model-based gold-standard estimation such as (Dawid and Skene, 1979) has long been the standard in epidemiology, and has been applied to disease prevalence estimation (Albert and Dodd, 2008) and also to many other problems such as human annotation of craters in images of Venus (Smyth et al, 1995). Smyth et al (1995), Rogers et al (2010), and Raykar et al (2010) all discuss the advantages of learning and evaluation with probabilistically annotated corpora. Rzhetsky et al (2009) and Whitehill et al (2009) estimate annotation models without gold-standard supervision, but neither models annotator biases, which are critical for estimating true labels.…”
Section: Related Work (mentioning)
confidence: 99%
“…Application areas include disease prevalence estimation (Albert and Dodd, 2008), identification of craters in images of Venus (Smyth et al, 1995), curation of biological data (Rzhetsky et al, 2009), computer vision (Whitehill et al, 2009), patient history (Dawid and Skene, 1979), and clinical reports (2010). Smyth et al (1995), Rogers et al (2010), and Raykar et al (2010) discuss the advantages of probabilistically annotated corpora over majority vote. Much of this work is motivated by the observation that annotators have different accuracies, and the fact that when annotators have known accuracies it can be shown that a majority of inaccurate annotators can be wrong (Raykar et al, 2010; Passonneau and Carpenter, 2014).…”
Section: Related Work (mentioning)
confidence: 99%
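
The last point in the statement above, that a majority of inaccurate annotators can be wrong, can be illustrated with a toy calculation. The sketch below is not taken from any of the cited papers: it assumes binary labels, annotators who are conditionally independent given the true label, a uniform prior, and accuracies that are already known. The function names, the vote pattern, and the accuracy values are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def majority_vote(votes):
    """Return the most frequent label among the votes."""
    return Counter(votes).most_common(1)[0][0]


def posterior_label_one(votes, accuracies, prior_one=0.5):
    """P(true label = 1 | votes) for binary labels.

    Assumes (for illustration only) that annotators are conditionally
    independent and that each is correct with a known probability.
    """
    log_odds = math.log(prior_one / (1.0 - prior_one))
    for vote, acc in zip(votes, accuracies):
        # Each vote shifts the log-odds by the annotator's reliability.
        weight = math.log(acc / (1.0 - acc))
        log_odds += weight if vote == 1 else -weight
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical data: three weak annotators say 1, one strong annotator says 0.
votes = [1, 1, 1, 0]
accuracies = [0.55, 0.55, 0.55, 0.95]

majority = majority_vote(votes)
posterior = posterior_label_one(votes, accuracies)
weighted = 1 if posterior > 0.5 else 0

print(f"majority vote: {majority}")
print(f"P(label = 1 | votes): {posterior:.3f}")
print(f"accuracy-weighted label: {weighted}")
```

Under these assumed numbers the majority label is 1, while the accuracy-weighted posterior favors 0, because the single reliable annotator outweighs three annotators who are barely better than chance. In practice accuracies are not known and have to be estimated, which is what the model-based approaches cited above address.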
“…In the database and data mining research communities, various models have been proposed [17,21]. According to applications, they are used for discrete labeling [1, 6, 7, 11-13, 16, 20, 26, 32, 33, 39, 41, 42], or continuous labeling [13,18,26,30].…”
Section: Introduction (mentioning)
confidence: 99%