The 41st International ACM SIGIR Conference on Research & Development in Information Retrieval, 2018
DOI: 10.1145/3209978.3210024

An Axiomatic Analysis of Diversity Evaluation Metrics

Abstract: Many evaluation metrics have been defined to evaluate the effectiveness of ad-hoc retrieval and search result diversification systems. However, it is often unclear which evaluation metric should be used to analyze the performance of retrieval systems given a specific task. Axiomatic analysis is an informative mechanism to understand the fundamentals of metrics and their suitability for particular scenarios. In this paper, we define a constraint-based axiomatic framework to study the suitability of existing metric…
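
To make the idea of a constraint-based axiomatic check concrete, here is a minimal sketch. The priority-style constraint and the DCG-style metric below are illustrative assumptions, not the paper's actual framework or constraints; the check asks whether promoting a more relevant document above an adjacent less relevant one can ever decrease the metric's score.

```python
import math

# Illustrative sketch only: the "priority" constraint and the DCG metric
# below are assumptions for demonstration, not taken from the paper.

def dcg(relevances):
    """Discounted cumulative gain over a ranked list of relevance grades."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))

def satisfies_priority(metric, ranking):
    """A priority-style axiom: swapping a more relevant document above an
    adjacent less relevant one must never decrease the metric's score."""
    for i in range(len(ranking) - 1):
        if ranking[i] < ranking[i + 1]:  # less relevant document ranked higher
            fixed = ranking[:i] + [ranking[i + 1], ranking[i]] + ranking[i + 2:]
            if metric(fixed) < metric(ranking):
                return False  # the swap lowered the score: constraint violated
    return True

# Graded relevance ranking (2 = highly relevant, 0 = not relevant).
print(satisfies_priority(dcg, [0, 2, 1]))  # True: DCG rewards the correction
```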


Cited by 41 publications (39 citation statements) · References 28 publications · Citing years: 2018–2023
“…However, while their results showed that CEM-ORD is similar to all of these gold measures, the outcome may differ if we choose a different set of gold measures. Indeed, in the context of evaluating information retrieval evaluation measures, it has been demonstrated that a similar meta-evaluation approach called unanimity (Amigó et al., 2018) depends heavily on the choice of gold measures. Moreover, while Amigó et al. (2020) reported that CEM-ORD also performs well in terms of consistency of system rankings across different data (which they refer to as "robustness"), experimental details were not provided in their paper.…”
Section: Evaluating Ordinal Classification
mentioning, confidence: 99%
“…3 presents the EMM standardized performance scores of all metric-based losses except… These overall results show that, in… [Footnote 2: The inclusion of dataset and NSR main effects does not inform the model in any way because of the standardization, but we keep them to follow the hierarchy principle of linear models.]”
Section: Is
mentioning, confidence: 92%
“…Although this work is mainly theoretical, we performed a brief experiment comparing OIE against traditional metrics. Here, we use the meta-metric Metric Unanimity (MU) [4]. MU quantifies to what extent a metric is sensitive to quality aspects captured by other existing metrics.…”
Section: Methods
mentioning, confidence: 99%
“…Traditional metrics and Observational Information Effectiveness (OIE), ranked by Metric Unanimity (MU) [4]. ✓ indicates that the metric satisfies the formal constraint; ✗ indicates otherwise.…”
mentioning, confidence: 99%
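
As a rough intuition for the kind of meta-evaluation MU performs, below is a hedged sketch; the aggregation is an assumption, not the actual MU definition from [4]. A candidate metric is scored by how often it agrees with the system-pair orderings on which all other ("gold") metrics are unanimous.

```python
from itertools import combinations

# Hedged sketch of a unanimity-style meta-evaluation; the real MU in [4]
# is defined differently. Idea: a candidate metric earns credit for each
# pair of systems it orders the same way as all gold metrics, counting
# only pairs on which the gold metrics are unanimous.

def unanimity_score(candidate, golds):
    """candidate: dict mapping system -> score; golds: list of such dicts."""
    agree = total = 0
    for a, b in combinations(candidate, 2):
        gold_signs = {(g[a] > g[b]) - (g[a] < g[b]) for g in golds}
        if len(gold_signs) == 1 and 0 not in gold_signs:  # unanimous, strict
            total += 1
            cand_sign = (candidate[a] > candidate[b]) - (candidate[a] < candidate[b])
            agree += cand_sign == gold_signs.pop()
    return agree / total if total else float("nan")

# Toy example: three systems scored by one candidate and two gold metrics.
cand = {"s1": 0.7, "s2": 0.4, "s3": 0.9}
golds = [{"s1": 0.6, "s2": 0.3, "s3": 0.8},
         {"s1": 0.5, "s2": 0.2, "s3": 0.7}]
print(unanimity_score(cand, golds))  # 1.0: agrees on every unanimous pair
```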