Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval 2011
DOI: 10.1145/2009916.2010129
Learning to rank from a noisy crowd

Abstract: We study how to best use crowdsourced relevance judgments for learning to rank [1,7]. We integrate two lines of prior work: unreliable crowd-based binary annotation for binary classification [5,3] and aggregating graded relevance judgments from reliable experts for ranking [7]. To model varying performance of the crowd, we simulate annotation noise with varying magnitude and distributional properties. Evaluation on three LETOR test collections reveals a striking trend contrary to prior studies: single labeling ou…
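The abstract's noise simulation can be sketched as below. The function name, accuracy values, and uniform flip-noise model are illustrative assumptions, not the paper's exact protocol (which also varies the noise's distributional properties).

```python
import random

def simulate_crowd_labels(true_labels, accuracies, seed=0):
    # Each annotator j reports the true binary label with probability
    # accuracies[j] and flips it otherwise (uniform flip noise; a
    # hypothetical stand-in for the paper's noise models).
    rng = random.Random(seed)
    return [[y if rng.random() < acc else 1 - y for acc in accuracies]
            for y in true_labels]

# 4 documents judged by 3 annotators of varying reliability
noisy = simulate_crowd_labels([1, 0, 1, 1], accuracies=[0.9, 0.7, 0.55])
```

Varying the accuracy vector changes the noise magnitude; swapping the flip rule changes its distributional shape.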

Cited by 13 publications (12 citation statements) · References 5 publications
“…A variety of methods have been proposed to assess the quality of judgments from turkers. Kumar and Lease [23,24] presented a weighted voting method based on turkers’ accuracies, which can be estimated by taking the full set of labels into account. Jung and Lease [25] conducted a large-scale consensus study on relevance judgments between query/document pairs for Web search on the ClueWeb09 dataset [26].…”
Section: Introduction (mentioning; confidence: 99%)
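The accuracy-weighted voting idea described in the statement above can be sketched with naive-Bayes log-odds weights. This weighting scheme is a common textbook choice and an assumption here, not necessarily Kumar and Lease's exact estimator.

```python
import math

def weighted_vote(labels, accuracies):
    # Weight each binary vote by log(acc / (1 - acc)), the log-odds
    # that an annotator with accuracy `acc` is correct. A vote of 1
    # adds the weight, a vote of 0 subtracts it.
    score = sum((1 if lab == 1 else -1) * math.log(acc / (1 - acc))
                for lab, acc in zip(labels, accuracies))
    return 1 if score > 0 else 0

# One accurate annotator (0.95) outvotes two weak ones (0.6 each)
print(weighted_vote([1, 0, 0], [0.95, 0.6, 0.6]))  # → 1
```

With equal accuracies the rule reduces to plain majority voting, since every vote then carries the same weight.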
“…However, this assumption is rarely satisfied in real-world datasets. The accuracy levels of different users are considered in (Kumar and Lease 2011), which assumes that each user is correct with a certain probability and studies the problem via simulation methods such as naive Bayes and majority voting. In their pioneering work, Chen et al. (2013) studied rank aggregation in a crowdsourcing environment for pairwise comparisons, modeled via the BTL or TCV model, where noisy BTL comparisons are assumed to be further corrupted.…”
Section: Additional Related Work (mentioning; confidence: 99%)
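The plain majority-voting baseline mentioned above is a one-liner; breaking ties toward 0 is an arbitrary convention for this sketch.

```python
def majority_vote(labels):
    # Return the majority binary label over votes in {0, 1};
    # ties break toward 0 (an arbitrary choice for this sketch).
    return 1 if 2 * sum(labels) > len(labels) else 0

print(majority_vote([1, 1, 0]))  # → 1
print(majority_vote([1, 0]))     # → 0
```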
“…We present a generalization of Thurstone's model, called the heterogeneous Thurstone model (HTM), which allows users with different noise levels, as well as a certain class of adversarial users. Unlike previous efforts on rank aggregation for heterogeneous populations such as (Chen et al 2013; Kumar and Lease 2011), the proposed model maintains the generality of Thurstone's framework and thus also extends its special cases such as the BTL and PL models. We evaluate the performance of the method using simulated data for different noise distributions.…”
Section: Introduction (mentioning; confidence: 99%)
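The BTL comparison probability, and the heterogeneous-user idea the statement above describes, can be sketched as a logistic model with a per-user noise scale. The exact HTM parameterization in the cited work may differ; this is only an illustration of the logistic form.

```python
import math

def btl_prob(s_i, s_j, sigma=1.0):
    # BTL-style probability that item i beats item j, given latent
    # scores s_i and s_j. Dividing the score gap by a per-user noise
    # scale sigma models heterogeneous users: a larger sigma pushes
    # the comparison toward a coin flip.
    return 1.0 / (1.0 + math.exp(-(s_i - s_j) / sigma))

# The same comparison seen by a reliable vs. a noisy user
reliable = btl_prob(2.0, 1.0, sigma=0.5)    # well above 0.5
noisy_user = btl_prob(2.0, 1.0, sigma=3.0)  # closer to 0.5
```

With sigma fixed at 1 for all users this collapses to the standard homogeneous BTL model.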
“…Previous studies highlighted the potential role of SM indicators to enhance the PMS function in this phase and, particularly, within competitive positioning: constant benchmarking with competitors, including for specific products or services; identification of market or sector trends on SM; simulation of acceptance of products or services through SM channels, suggesting that customers compare different prototypes on SM platforms (Mislove et al., 2010; Bradbury, 2011). It emerged that the main users of this information for planning activities are marketing, R&D and human resources (Leonardi and Barley, 2008; Kumar and Lease, 2011), as they have information about the market situation and customer expectations in real time.…”
Section: SM Information Use (mentioning; confidence: 99%)