Proceedings of the 15th ACM International Conference on Information and Knowledge Management - CIKM '06 2006
DOI: 10.1145/1183614.1183773
Retrieval evaluation with incomplete relevance data

Cited by 8 publications (11 citation statements). References 2 publications.
“…Figure 1(a) shows that reducing the size of the qrels decreases the value of all measures, except for bpref and RankEff. So far, this is consistent with earlier findings [4], [1]. What might be surprising, however, is that bpref, contrary to earlier findings, does not exhibit a dramatic increase when the qrels are reduced.…”
Section: Incomplete Unbiased Judgements (supporting)
confidence: 91%
“…Evaluation accuracy with incomplete judgements under a given measure is usually assessed by selecting a random subset of the judged documents and comparing the ranking produced from the reduced set of judgements with the ranking produced from the original judgements (cf. [1], [4], [15]). Such incomplete judgements do not favor any particular system.…”
Section: Related Work (mentioning)
confidence: 99%
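The reduction step of this methodology is simple to sketch. Below is a minimal Python illustration (function names and the qrels data layout are hypothetical, not taken from the cited papers): judged documents are subsampled uniformly at random per topic, preserving their relevance labels, so the reduced judgements do not favor any particular system.

```python
import random

def reduce_qrels(qrels, fraction, seed=0):
    """Randomly keep `fraction` of the judged documents per topic.

    qrels: {topic_id: {doc_id: relevance_label}} -- hypothetical layout.
    Returns a reduced qrels dict with the same structure; labels of the
    surviving documents are unchanged, so the reduction is unbiased
    with respect to any particular retrieval system.
    """
    rng = random.Random(seed)
    reduced = {}
    for topic, judgments in qrels.items():
        docs = list(judgments)
        # keep at least one judged document per topic
        keep = rng.sample(docs, max(1, int(len(docs) * fraction)))
        reduced[topic] = {d: judgments[d] for d in keep}
    return reduced
```

A measure is then computed per system on both the full and the reduced qrels, and the two induced system rankings are compared (e.g. with a rank correlation such as Kendall's tau).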
“…The evaluation metric bpref [6] is also bounded from below and the metric RankEff [2] is directly maximized when minimizing the number of mis-ranked document pairs in the ranked list. bpref was designed as a stable performance metric when relevance judgments are incomplete, and RankEff builds upon the bpref measure, taking into account all retrieved non-relevant documents.…”
Section: bpref, RankEff, and Misranked Document Pairs (mentioning)
confidence: 99%
“…bpref was designed as a stable performance metric when relevance judgments are incomplete, and RankEff builds upon the bpref measure, taking into account all retrieved non-relevant documents. Both measures are known to correlate well with average precision in TREC data [2,9,6] and bpref is currently reported in annual TREC evaluation results [8]. These metrics are defined as:…”
Section: bpref, RankEff, and Misranked Document Pairs (mentioning)
confidence: 99%
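The excerpt breaks off before the definitions themselves. For orientation, a common formulation of bpref (following Buckley and Voorhees, 2004) can be sketched in Python; per the description above, RankEff differs by taking all retrieved non-relevant documents into account, so this sketch covers bpref only, and the function and variable names are hypothetical.

```python
def bpref(ranking, relevant, nonrelevant):
    """bpref = (1/R) * sum over retrieved relevant docs r of
    (1 - |judged non-relevant ranked above r, capped| / min(R, N)),
    where R and N are the numbers of judged relevant and judged
    non-relevant documents. Unjudged documents in the ranking are
    ignored entirely, which is what makes the measure comparatively
    stable under incomplete relevance judgements.
    """
    R, N = len(relevant), len(nonrelevant)
    if R == 0:
        return 0.0
    denom = min(R, N)
    seen_nonrel = 0  # judged non-relevant docs encountered so far
    total = 0.0
    for doc in ranking:
        if doc in nonrelevant:
            seen_nonrel += 1
        elif doc in relevant:
            if denom == 0:  # no judged non-relevant documents at all
                total += 1.0
            else:
                total += 1.0 - min(seen_nonrel, denom) / denom
    return total / R
```

A ranking that places all judged relevant documents above all judged non-relevant ones scores 1.0, and a fully inverted ranking scores 0.0, consistent with bpref being directly tied to the number of mis-ranked document pairs.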