2020
DOI: 10.1145/3433164

A Critical Reassessment of the Saerens-Latinne-Decaestecker Algorithm for Posterior Probability Adjustment

Abstract: We critically re-examine the Saerens-Latinne-Decaestecker (SLD) algorithm, a well-known method for estimating class prior probabilities (“priors”) and adjusting posterior probabilities (“posteriors”) in scenarios characterized by distribution shift, i.e., a difference in the distribution of the priors between the training and the unlabelled documents. Given a machine-learned classifier and a set of unlabelled documents for which the classifier has returned posterior probabilities and estimates of the prior proba…

Cited by 17 publications (4 citation statements)
References 35 publications
“…HDy seeks the mixture parameter α ∈ [0, 1] that minimizes the HD between (a) the mixture distribution of posteriors from the positive class (weighted by α) and from the negative class (weighted by (1 − α)), and (b) the unlabelled distribution. -The Saerens-Latinne-Decaestecker algorithm (SLD) [42] (see also [11]):…”
Section: Baselines
confidence: 99%
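
To make the search just quoted concrete, here is a minimal Python sketch of the HDy idea (not the cited authors' code): posterior probabilities for validation positives and negatives are binned into histograms, and α is found by grid search over the Hellinger distance to the unlabelled histogram. All names, the bin count, and the grid resolution are illustrative choices.

    import numpy as np

    def hellinger(p, q):
        # Hellinger distance between two discrete distributions
        return np.sqrt(0.5 * np.sum((np.sqrt(p) - np.sqrt(q)) ** 2))

    def hdy(post_pos, post_neg, post_unlab, n_bins=10):
        # Histograms of the positive-class posteriors, normalized so
        # that each is a proper distribution over the bins
        bins = np.linspace(0.0, 1.0, n_bins + 1)
        h_pos = np.histogram(post_pos, bins=bins)[0] / len(post_pos)
        h_neg = np.histogram(post_neg, bins=bins)[0] / len(post_neg)
        h_unl = np.histogram(post_unlab, bins=bins)[0] / len(post_unlab)
        # Grid search for the alpha-weighted mixture closest (in HD)
        # to the unlabelled distribution
        alphas = np.linspace(0.0, 1.0, 101)
        dists = [hellinger(a * h_pos + (1 - a) * h_neg, h_unl)
                 for a in alphas]
        return alphas[int(np.argmin(dists))]

The returned α is the estimated prevalence of the positive class in the unlabelled set.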
“…This is a method based on Expectation Maximization, whereby the posterior probabilities returned by a soft classifier s for data items in an unlabelled set U, and the class prevalence values for U, are iteratively updated in a mutually recursive fashion. For SLD we calibrate the classifier since, for reasons discussed in [11], this yields an advantage for this method. -QuaNet [12]: This is a deep learning architecture for quantification that predicts class prevalence values by taking as input (i) the class prevalence values as estimated by CC, ACC, PCC, PACC, and SLD; (ii) the posterior probabilities Pr(y|x) for the positive class (since QuaNet is a binary method) for each document x; and (iii) embedded representations of the documents.…”
Section: Baselines
confidence: 99%
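
The calibration step this statement mentions can be reproduced generically with scikit-learn's CalibratedClassifierCV; this is a sketch of the general recipe, not necessarily the cited papers' exact setup, and the training/unlabelled variable names are placeholders.

    from sklearn.calibration import CalibratedClassifierCV
    from sklearn.svm import LinearSVC

    # Wrap a non-probabilistic classifier so that predict_proba returns
    # (approximately) calibrated posteriors, which is what SLD expects
    calibrated = CalibratedClassifierCV(LinearSVC(), method="sigmoid", cv=5)
    # calibrated.fit(X_train, y_train)
    # posteriors = calibrated.predict_proba(X_unlabelled)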
“…The Saerens-Latinne-Decaestecker (SLD) algorithm [28,7] (sometimes also called EMQ, for Expectation Maximization Quantifier) is a probabilistic quantifier-generating method. SLD consists of using the well-known Expectation Maximization algorithm to iteratively update the posterior probabilities generated by a probabilistic classifier and the class prevalence estimates obtained via maximum-likelihood estimation, in a mutually recursive way, until convergence.…”
Section: The Saerens-Latinne-Decaestecker Algorithm
confidence: 99%
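
The mutual recursion described above is short enough to spell out. The following is a minimal NumPy sketch of the SLD/EMQ iteration, assuming row-stochastic posteriors from a (calibrated) probabilistic classifier; the function name, tolerance, and iteration cap are illustrative.

    import numpy as np

    def sld(posteriors, train_priors, epsilon=1e-6, max_iter=1000):
        # posteriors:   (n_docs, n_classes) posteriors on the unlabelled set
        # train_priors: (n_classes,) class prevalences observed in training
        train_priors = np.asarray(train_priors, dtype=float)
        priors = train_priors.copy()
        for _ in range(max_iter):
            # E-step: rescale each posterior by the ratio of the current
            # prior estimate to the training prior, then renormalize
            adjusted = posteriors * (priors / train_priors)
            adjusted /= adjusted.sum(axis=1, keepdims=True)
            # M-step: the new prior estimate is the mean adjusted posterior
            new_priors = adjusted.mean(axis=0)
            if np.abs(new_priors - priors).max() < epsilon:
                priors = new_priors
                break
            priors = new_priors
        return priors, adjusted

The E-step and M-step are exactly the mutually recursive updates the quoted passage describes: adjusted posteriors feed the prevalence estimate, which in turn re-adjusts the posteriors until the estimate stabilizes.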
“…The quantifiers based on Explicit Loss Minimization (ELM) represent a family of methods based on structured output learning; these quantifiers rely on classifiers that have been optimized using a quantification-oriented loss measure. QuaPy implements the following ELM-based methods, all relying on Joachims' SVMperf structured output learning algorithm [18]: • SVM(Q), which attempts to minimize the Q loss, which combines a classification-oriented loss and a quantification-oriented loss, as proposed in [1];…”
Section: Quantifiers Based on Explicit Loss Minimization
confidence: 99%