2019
DOI: 10.1007/s10791-019-09363-y
Evaluation measures for quantification: an axiomatic approach

Abstract: Quantification is the task of estimating, given a set σ of unlabelled items and a set of classes C = {c_1, …, c_|C|}, the prevalence (or "relative frequency") in σ of each class c_i ∈ C. While quantification may in principle be solved by classifying each item in σ and counting how many such items have been labelled with c_i, it has long been shown that this "classify and count" (CC) method yields suboptimal quantification accuracy. As a result, quantification is no longer considered a mere byproduct of…
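For concreteness, here is a minimal Python sketch of the CC baseline the abstract refers to; the `classify` callable, the pool `sigma`, and the class list are hypothetical placeholders, not code from the paper:

```python
from collections import Counter

def classify_and_count(classify, sigma, classes):
    """Estimate the prevalence of each class in the unlabelled set `sigma`
    by classifying every item and counting predicted labels (the CC method)."""
    predicted = [classify(x) for x in sigma]
    counts = Counter(predicted)
    return {c: counts.get(c, 0) / len(sigma) for c in classes}
```

The paper's point is that this estimator is biased whenever the classifier is imperfect, which is why quantification is studied as a task in its own right.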

Cited by 61 publications (26 citation statements) · References 56 publications
“…For example, the same difference, in absolute value, between the true and the predicted prevalence values may have a different "cost" depending on the original true prevalence value: predicting 0.5 prevalence when the true prevalence is 0.49 can be considered, in some application contexts, a less blatant error than predicting a prevalence of 0.01 when the true prevalence is 0.00. In some other application contexts, though, the two above-mentioned estimation errors may be considered equally serious [29]. This means that sometimes we may want to use a certain evaluation measure and some other times we may want to use a different one.…”
Section: Discussion
confidence: 99%
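To make the contrast concrete: under the common definitions AE(p, p̂) = |p̂ − p| and RAE(p, p̂) = |p̂ − p| / p (smoothed to avoid division by zero), the two errors quoted above score identically under AE but very differently under RAE. A small sketch, using an assumed smoothing constant `eps` in place of the sample-size-dependent one the literature derives:

```python
def ae(p, p_hat):
    """Absolute error between true and predicted prevalence."""
    return abs(p_hat - p)

def rae(p, p_hat, eps=0.005):
    """Relative absolute error with additive smoothing, so that a true
    prevalence of 0 does not cause division by zero (eps is assumed)."""
    smooth = lambda x: (x + eps) / (2 * eps + 1)
    return abs(smooth(p_hat) - smooth(p)) / smooth(p)

# The two situations from the quotation: both have AE = 0.01 ...
print(ae(0.49, 0.50), ae(0.00, 0.01))    # 0.01 and 0.01 -> equally serious
print(rae(0.49, 0.50), rae(0.00, 0.01))  # ~0.02 vs ~2.0 -> very different
```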
“…Several error measures have been proposed in the literature [29], and QuaPy implements a rich set of them:…”
Section: Error Measures
confidence: 99%
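QuaPy is a Python package, so measures of this kind are a few lines each. The sketch below gives stand-alone NumPy versions of two representative ones (mean absolute error and KL divergence) rather than calling QuaPy's own API, whose exact function names and signatures are not shown in the excerpt:

```python
import numpy as np

def mae(p, p_hat):
    """Mean absolute error between true and predicted prevalence vectors."""
    p, p_hat = np.asarray(p), np.asarray(p_hat)
    return np.abs(p_hat - p).mean()

def kld(p, p_hat, eps=1e-8):
    """KL divergence of the predicted from the true class distribution,
    with eps-clipping to keep the logarithm finite (eps is assumed)."""
    p, p_hat = np.asarray(p) + eps, np.asarray(p_hat) + eps
    return np.sum(p * np.log(p / p_hat))

print(mae([0.3, 0.7], [0.25, 0.75]))  # 0.05
print(kld([0.3, 0.7], [0.25, 0.75]))  # small positive value (~0.006)
```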
“…Even if not specifically focused on scales and their relationship to IR evaluation measures, there is a substantial body of research studying which constraints define the core properties of evaluation measures: Amigó et al [6,7,8,9] and Sebastiani [99] address this issue from a formal and theoretical point of view, applying it to various tasks such as ranking, filtering, diversity and quantification, while Moffat [77] adopts a more numerical approach.…”
Section: Related Work
confidence: 99%
“…where p_U and p̂_U indicate the true class distribution and the predicted class distribution, resp., on the set U of unlabelled documents. The reason we use NAE is that, besides its simplicity, it is also (as argued in [35]) one of the theoretically most satisfying measures for evaluating the quality of class priors; NAE ranges between 0 (best) and 1 (worst). In all the tables of results that we include in Section 4, we compare the estimates of the class priors before applying SLD, computed by "classifying and counting", i.e., as…”
Section: Evaluation Measures
confidence: 99%
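The excerpt cuts off before the formula. Assuming the definition of NAE from the cited paper, i.e., total absolute error normalized by its maximum attainable value 2(1 − min_c p(c)) for the given true distribution, a sketch:

```python
import numpy as np

def nae(p, p_hat):
    """Normalised absolute error between true (p) and predicted (p_hat)
    class distributions: total absolute error divided by its maximum
    possible value 2 * (1 - min(p)), so that 0 is best and 1 is worst."""
    p, p_hat = np.asarray(p), np.asarray(p_hat)
    return np.abs(p_hat - p).sum() / (2 * (1 - p.min()))

# worst case: all predicted mass on the rarest class
print(nae([0.1, 0.9], [1.0, 0.0]))  # 1.0
print(nae([0.1, 0.9], [0.1, 0.9]))  # 0.0
```

The normalization is what gives NAE its fixed [0, 1] range, which the quoted passage relies on when interpreting the scores.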