Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval 2006
DOI: 10.1145/1148170.1148239

On ranking the effectiveness of searches

Abstract: There is a growing interest in estimating the effectiveness of search. Two approaches are typically considered: examining the search queries and examining the retrieved document sets. In this paper, we take the latter approach. We use four measures to characterize the retrieved document sets and estimate the quality of search. These measures are (i) the clustering tendency as measured by the Cox-Lewis statistic, (ii) the sensitivity to document perturbation, (iii) the sensitivity to query perturbation, and (iv) …
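To make the perturbation idea in the abstract concrete, here is a minimal sketch of a document-perturbation sensitivity measure: re-rank noise-perturbed copies of the retrieved set and check how stable the ranking is. The cosine scoring, the Gaussian noise model, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of "sensitivity to document perturbation", assuming dense
# TF-IDF-style document vectors and cosine scoring. Intuition: a coherent,
# relevant result set should keep roughly the same ranking when its documents
# are slightly perturbed; an unstable ranking hints at a poor retrieval.
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

def cosine_scores(docs: np.ndarray, query: np.ndarray) -> np.ndarray:
    """Cosine similarity of each document row-vector to the query."""
    return docs @ query / (np.linalg.norm(docs, axis=1) * np.linalg.norm(query) + 1e-12)

def perturbation_sensitivity(docs: np.ndarray, query: np.ndarray,
                             noise: float = 0.05, trials: int = 20) -> float:
    """Mean Kendall tau between the original ranking and the rankings of
    noise-perturbed copies of the retrieved set (closer to 1 = more stable)."""
    base = cosine_scores(docs, query)
    taus = []
    for _ in range(trials):
        noisy = docs + noise * rng.standard_normal(docs.shape)
        tau, _ = kendalltau(base, cosine_scores(noisy, query))
        taus.append(tau)
    return float(np.mean(taus))

# Toy usage: 50 retrieved documents in a 30-dimensional term space.
docs = rng.random((50, 30))
query = rng.random(30)
print(f"stability = {perturbation_sensitivity(docs, query):.2f}")
```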

Cited by 63 publications (76 citation statements); citing publications span 2008 to 2022. References 8 publications.

Citation statements, ordered by relevance:
“…This can either be the rank correlation coefficients Spearman's Rho or Kendall's Tau [1], which are applicable in the context of f_diff and f_perf, or the linear correlation coefficient r, which has been used to evaluate f_norm. Note that in [17,16] two correlation-independent evaluations for f_diff have been proposed. Since the focus of this paper is on f_norm, we consider the linear correlation coefficient r. Using two examples, we show why it is difficult to interpret the r value and why it is prone to providing misleading conclusions, before addressing the shortcomings of the current methodology and conducting a comprehensive evaluation of QPP methods assuming the task defined by f_norm.…”
Section: Evaluation Methodology (mentioning)
Confidence: 99%
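The quoted concern about r is easy to reproduce on invented numbers: a single well-predicted outlier query pushes Pearson's r to roughly 0.77 even though Kendall's tau over the same queries is 0.0. All values below are fabricated purely to illustrate the effect.

```python
# Toy illustration of why the linear correlation coefficient r can mislead in
# QPP evaluation: one extreme, well-predicted query dominates r, while the
# rank correlations expose that the remaining queries are ordered badly.
from scipy.stats import pearsonr, spearmanr, kendalltau

# Hypothetical predictor scores and actual average precision for 8 queries;
# the last query is an easy outlier that both orderings place first.
predicted = [0.10, 0.30, 0.20, 0.40, 0.35, 0.25, 0.15, 0.95]
actual_ap = [0.30, 0.10, 0.40, 0.20, 0.15, 0.35, 0.25, 0.90]

r, _ = pearsonr(predicted, actual_ap)      # ~0.77: looks like a strong predictor
rho, _ = spearmanr(predicted, actual_ap)   # ~-0.05: rank agreement is near zero
tau, _ = kendalltau(predicted, actual_ap)  # 0.00: concordant pairs = discordant
print(f"r={r:.2f}  rho={rho:.2f}  tau={tau:.2f}")
```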
“…Predicting the retrieval performance or determining the degree of difficulty of a query is a challenging research area which has received a lot of attention recently [13,20,19,8,16,5,7]. The aim is to create better methods (predictors) for the task, as a reliable and accurate prediction mechanism would enable the creation of more adaptive and intelligent retrieval systems.…”
Section: Introduction (mentioning)
Confidence: 99%
“…Vinay et al. [13] studied the clustering tendency of retrieved documents based on the "cluster hypothesis" [14] and proposed four methods of predicting query performance. These methods primarily attempt to study whether the retrieved documents are a random set of points or a clustered set of points.…”
Section: Category I: Pre-Retrieval Predictors (mentioning)
Confidence: 99%
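For the clustering-tendency measure this citation refers to, a rough sketch of a Cox-Lewis-style test follows: for random probe points, compare the distance to the nearest retrieved document against that document's own nearest-neighbor distance. The bounding-box sampling window and the mean-ratio normalization are simplifying assumptions, not the exact estimator from [13].

```python
# Rough sketch of a Cox-Lewis-style clustering-tendency measure over a
# retrieved document set. Ratios well above 1 suggest the points are
# clustered rather than uniformly scattered (probes land far from the data
# while the data points sit close to each other).
import numpy as np
from scipy.spatial import cKDTree

def cox_lewis_ratio(points: np.ndarray, n_probes: int = 200, seed: int = 0) -> float:
    rng = np.random.default_rng(seed)
    tree = cKDTree(points)
    lo, hi = points.min(axis=0), points.max(axis=0)
    probes = rng.uniform(lo, hi, size=(n_probes, points.shape[1]))
    # Distance from each probe to its nearest data point ...
    d_probe, idx = tree.query(probes, k=1)
    # ... and from that data point to its own nearest neighbor (k=2: self + NN).
    d_nn = tree.query(points[idx], k=2)[0][:, 1]
    return float(np.mean(d_probe / (d_nn + 1e-12)))

# Toy usage: an unstructured set vs. three tight clusters in 5 dimensions.
rng = np.random.default_rng(1)
uniform = rng.uniform(size=(300, 5))
clustered = np.vstack([rng.normal(c, 0.02, size=(100, 5))
                       for c in rng.uniform(size=(3, 5))])
print(cox_lewis_ratio(uniform), cox_lewis_ratio(clustered))
```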
“…In this paper, we prefer to use Kendall's rank correlation to compare the ranking of queries by average precision to the ranking of these queries by CTS, because much previous prediction research [3,12,13] utilizes this non-parametric test, which makes our prediction results comparable to those of previous research.…”
Section: Correlation With Average Precision (mentioning)
Confidence: 99%
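A minimal sketch of the comparison that quote describes, with invented scores: rank queries by actual average precision and by the predictor ("CTS" in the quote), then report Kendall's tau between the two orderings.

```python
# Hypothetical per-query average precision and predictor scores; Kendall's
# tau measures how similarly the two score sets order the queries.
from scipy.stats import kendalltau

ap  = {"q1": 0.62, "q2": 0.15, "q3": 0.44, "q4": 0.08, "q5": 0.71}
cts = {"q1": 0.80, "q2": 0.20, "q3": 0.55, "q4": 0.35, "q5": 0.90}

queries = sorted(ap)
tau, p = kendalltau([ap[q] for q in queries], [cts[q] for q in queries])
print(f"Kendall tau = {tau:.2f} (p = {p:.3f})")  # 0.80: one swapped pair (q2, q4)
```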
“…In Table 1.1 we have summarized the expressions for each evaluation aspect as they occur in the literature:

EA1: collection query hardness [7], topic difficulty [30]
EA2: query difficulty [167], topic difficulty [30], query performance prediction [175], precision prediction [52], system query hardness [7], search result quality estimation [40], search effectiveness estimation [145]
EA3: performance prediction of "retrievals" [50]
EA4: automatic evaluation of retrieval systems [114], ranking retrieval systems without relevance judgments [133], retrieval system ranking estimation
EA5: -

Evaluation aspect EA2 has the most diverse set of labels, as it is the most widely evaluated aspect.…”
Section: Definition Of Terms (mentioning)
Confidence: 99%