Proceedings of the 23rd International Conference on World Wide Web 2014
DOI: 10.1145/2566486.2567989
Community-based Bayesian aggregation models for crowdsourcing

Abstract: This paper addresses the problem of extracting accurate labels from crowdsourced datasets, a key challenge in crowdsourcing. Prior work has focused on modeling the reliability of individual workers, for instance, by way of confusion matrices, and using these latent traits to estimate the true labels more accurately. However, this strategy becomes ineffective when there are too few labels per worker to reliably estimate their quality. To mitigate this issue, we propose a novel community-based Bayesian label agg…
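The confusion-matrix approach the abstract refers to can be illustrated with a minimal EM sketch in the spirit of Dawid–Skene-style models (not the paper's own Bayesian model; the function name and smoothing constants here are illustrative assumptions):

```python
import numpy as np

def dawid_skene(labels, n_classes, n_iter=50):
    """EM estimate of true labels from crowd labels (illustrative sketch).

    labels: array of shape (n_items, n_workers), entries in
    {0..n_classes-1}, or -1 where a worker did not label the item.
    Returns a posterior over true labels, shape (n_items, n_classes).
    """
    n_items, n_workers = labels.shape
    # Initialise item posteriors from per-item vote fractions
    # (add-one smoothing keeps every class probability positive).
    post = np.zeros((n_items, n_classes))
    for i in range(n_items):
        given = labels[i][labels[i] >= 0]
        for c in range(n_classes):
            post[i, c] = np.sum(given == c) + 1.0
        post[i] /= post[i].sum()

    for _ in range(n_iter):
        # M-step: per-worker confusion matrices pi[w, true, observed].
        pi = np.full((n_workers, n_classes, n_classes), 1.0)  # smoothing
        for w in range(n_workers):
            for i in range(n_items):
                if labels[i, w] >= 0:
                    pi[w, :, labels[i, w]] += post[i]
        pi /= pi.sum(axis=2, keepdims=True)

        # E-step: re-score each item under the confusion matrices.
        log_post = np.zeros((n_items, n_classes))
        for i in range(n_items):
            for w in range(n_workers):
                if labels[i, w] >= 0:
                    log_post[i] += np.log(pi[w, :, labels[i, w]])
        log_post -= log_post.max(axis=1, keepdims=True)
        post = np.exp(log_post)
        post /= post.sum(axis=1, keepdims=True)
    return post
```

With two reliable workers and one who systematically flips labels, the model learns the flipped confusion matrix and down-weights that worker, which a plain majority vote cannot do. The failure mode the abstract points at is visible here too: with only a handful of labels per worker, each `pi[w]` is estimated from too little data, which motivates pooling workers into communities.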

Cited by 200 publications (193 citation statements)
References 9 publications
“…However, existing worker models [22], [23], [24], [25] cannot be applied directly to partial-agreement answer aggregation, since worker answers may only partially overlap. Moreover, interpreting a missing label as a negative answer is not always correct, so it should be cross-checked against answers from other workers.…”
Section: Motivating Example
confidence: 99%
“…This is a common problem with approaches to inference that use maximum-likelihood or maximum a posteriori solutions [4]. To overcome this limitation, algorithms for aggregating crowdsourced data, including SFilter [8] and Bayesian Classifier Combination (BCC) [14, 30], capture the uncertainty in the workers' skill levels or biases, as well as the uncertainty in the aggregated labels. Unfortunately, these methods do not exploit the text features of documents, and consequently require each document to be labelled by the crowd, often multiple times, to obtain confident classifications.…”
Section: Aggregating Judgements
confidence: 99%
“…It learns the confusion matrices of both each community and each worker but, like IBCC, it does not account for text features in the documents [30]. We run CBCC with three communities, as suggested in the original paper, for both CF and SP.…”
Section: Independent Bayesian Classifier Combination (IBCC)
confidence: 99%
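The community idea these citations describe, pooling workers with similar confusion matrices so that sparsely-labelled workers borrow strength from their community, can be sketched with a simple clustering step. This is an illustrative stand-in (a plain k-means over flattened confusion matrices), not the paper's Bayesian community model; all names here are hypothetical:

```python
import numpy as np

def cluster_workers(pi, n_communities, n_iter=20):
    """Group workers by their confusion matrices (k-means sketch).

    pi: array (n_workers, n_classes, n_classes) of per-worker
    confusion matrices. Returns (assignments, community_pi), where
    community_pi pools each community's workers into one matrix, so
    workers with few labels inherit their community's behaviour.
    """
    flat = pi.reshape(len(pi), -1)
    # Deterministic farthest-point initialisation of the centres.
    centres = [flat[0]]
    for _ in range(1, n_communities):
        d = np.min([((flat - c) ** 2).sum(1) for c in centres], axis=0)
        centres.append(flat[d.argmax()])
    centres = np.array(centres)
    for _ in range(n_iter):
        # Assign each worker to its nearest community centre.
        d = ((flat[:, None, :] - centres[None, :, :]) ** 2).sum(-1)
        assign = d.argmin(1)
        # Recompute each centre as the mean of its assigned workers.
        for k in range(n_communities):
            if (assign == k).any():
                centres[k] = flat[assign == k].mean(0)
    community_pi = centres.reshape(n_communities, *pi.shape[1:])
    return assign, community_pi
```

For example, two workers with near-identity matrices and two with label-flipping matrices separate cleanly into two communities, mirroring the "three communities" configuration the citing paper reports using for CBCC.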
“…Several research works have focused on improving the quality of crowdsourcing results using a variety of techniques, ranging from worker pre-screening and effective crowdsourcing task design [42] to gamification and incentive mechanisms [19, 59] and answer aggregation methods [67]. Due to its low entry barrier, crowdsourcing has become truly ubiquitous [68].…”
Section: Introduction ("Sometimes the Internet Fee Is Greater Than The…")
confidence: 99%