2017
DOI: 10.1145/3086701

Using Replicates in Information Retrieval Evaluation

Abstract: This article explores a method for more accurately estimating the main effect of the system in a typical test-collection-based evaluation of information retrieval systems, thus increasing the sensitivity of system comparisons. Randomly partitioning the test document collection allows for multiple tests of a given system and topic (replicates). Bootstrap ANOVA can use these replicates to extract system-topic interactions (something not possible without replicates), yielding a more precise value for the system effe…
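To make the role of replicates concrete, here is a minimal sketch of a classical two-way sum-of-squares decomposition on a balanced design. It is not the bootstrap ANOVA the abstract refers to; it only illustrates why the system-topic interaction becomes separable from error once each (system, topic) cell holds more than one score. The function name `anova_with_replicates`, the nested-list layout, and the toy scores are all assumptions made for illustration.

```python
from statistics import fmean


def anova_with_replicates(scores):
    """Decompose scores[system][topic][replicate] into sums of squares.

    Assumes a balanced design: every (system, topic) cell holds the same
    number of replicate scores, and that number is at least 2.
    """
    n_sys, n_top, n_rep = len(scores), len(scores[0]), len(scores[0][0])

    grand = fmean(x for row in scores for cell in row for x in cell)
    sys_mean = [fmean(x for cell in row for x in cell) for row in scores]
    top_mean = [fmean(x for row in scores for x in row[t]) for t in range(n_top)]
    cell_mean = [[fmean(cell) for cell in row] for row in scores]

    ss_system = n_top * n_rep * sum((m - grand) ** 2 for m in sys_mean)
    ss_topic = n_sys * n_rep * sum((m - grand) ** 2 for m in top_mean)
    # Interaction: what the cell mean shows beyond the two main effects.
    ss_interaction = n_rep * sum(
        (cell_mean[s][t] - sys_mean[s] - top_mean[t] + grand) ** 2
        for s in range(n_sys)
        for t in range(n_top)
    )
    # Error: spread of replicates around their own cell mean.
    # With a single score per cell this term would be identically zero,
    # so interaction and error could not be told apart.
    ss_error = sum(
        (x - cell_mean[s][t]) ** 2
        for s in range(n_sys)
        for t in range(n_top)
        for x in scores[s][t]
    )
    ss_total = sum((x - grand) ** 2 for row in scores for cell in row for x in cell)

    df_system = n_sys - 1
    df_error = n_sys * n_top * (n_rep - 1)
    f_system = (ss_system / df_system) / (ss_error / df_error)

    return {
        "ss_system": ss_system,
        "ss_topic": ss_topic,
        "ss_interaction": ss_interaction,
        "ss_error": ss_error,
        "ss_total": ss_total,
        "f_system": f_system,
    }


# Toy data: 2 systems x 2 topics x 2 replicates (e.g. two collection shards).
toy_scores = [
    [[1.0, 3.0], [2.0, 4.0]],
    [[5.0, 7.0], [6.0, 8.0]],
]
result = anova_with_replicates(toy_scores)
```

In this sketch the four sums of squares add up to the total, and the system F statistic is computed against the within-cell error term, which only exists because each cell has replicates.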

Cited by 49 publications (30 citation statements)
References 34 publications
“…ANOVA is a statistical method which is used to check if the means of two or more groups are significantly different from each other. It was widely used in the 1990s to explore the TREC IR runs results [21,25] and has recently been revived [12,24]. For a thorough understanding of the ANOVA, we would refer the readers to Miller's book [15] or Ferro et al [12].…”
Section: Data Analysis Objectives and Methods (mentioning)
confidence: 99%
“…More recently, Voorhees et al [60] conducted experiments where the researchers randomly split TREC collections into shards, thus creating more replicates for each (topic, system) pair and allowing them to examine topic*system interactions. By modeling the interactions, Voorhees et al were able to measure more significant differences between retrieval systems.…”
Section: Anova (mentioning)
confidence: 99%
“…Such statistical processes model scores as a combination of factors and factor interactions. The models were extended to include a topic*system interaction [6,45,60].…”
Section: Introduction (mentioning)
confidence: 99%
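As a hedged illustration of what "a combination of factors and factor interactions" can look like, a standard two-way model with a topic*system interaction is written below. The notation is an assumption chosen for this sketch and is not taken from the citing papers: $y_{ijk}$ is the score of system $j$ on topic $i$ in replicate $k$, $\mu$ is the grand mean, $\tau_i$ the topic effect, $\alpha_j$ the system effect, $(\tau\alpha)_{ij}$ the topic*system interaction, and $\varepsilon_{ijk}$ the residual error.

```latex
y_{ijk} = \mu + \tau_i + \alpha_j + (\tau\alpha)_{ij} + \varepsilon_{ijk}
```

With only one score per (topic, system) pair there is no index $k$, and the interaction term cannot be estimated separately from the error term.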
“…Therefore, [16] used simulation based on distributions of relevant and not relevant documents to demonstrate the importance of the Topic*System interaction effect. Very recently, [19] exploited random partitions of the document corpus to obtain more replicates of each (topic, system) pair, obtaining an estimation of the Topic*System interaction effect which allowed for improved precision in determining the System effect.…”
Section: Performance Factor Analysis In IR (mentioning)
confidence: 99%