Quantification is the name given to a novel machine learning task that deals with correctly estimating the number of elements of one class in a set of examples. The output of a quantifier is a real value; since the training instances are the same as in a classification problem, a natural approach is to train a classifier and derive a quantifier from it. Previous work has shown that simply classifying the instances and counting the examples belonging to the class of interest (classify & count) typically yields poor quantifiers, especially when the class distribution may vary between training and test. Hence, adjusted versions of classify & count have been developed that use modified thresholds. However, previous work has explicitly discarded, without a deep analysis, any approach based on the probability estimations of the classifier. In this paper, we present a method based on averaging the probability estimations of a classifier with a very simple scaling that performs reasonably well, showing that probability estimators for quantification capture a richer view of the problem than threshold-based methods.
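The contrast between classify & count and a scaled probability average is easy to sketch. The snippet below is a minimal illustration, assuming scikit-learn and logistic regression as the base probability estimator; the function names and the particular scaling (rescaling the test-set probability average between the average probabilities assigned to the training negatives and positives) are illustrative, not the paper's exact formulation, and the scaling only matters when the test prevalence differs from the training prevalence.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

def classify_and_count(clf, X):
    """Fraction of examples predicted positive: the classify & count quantifier."""
    return float(np.mean(clf.predict(X) == 1))

def scaled_probability_average(clf, X_train, y_train, X_test):
    """Average the positive-class probabilities on the test set, then rescale
    that average between the average probabilities the classifier assigns to
    the training negatives and positives, clipping to [0, 1]."""
    pa_test = clf.predict_proba(X_test)[:, 1].mean()
    pa_pos = clf.predict_proba(X_train[y_train == 1])[:, 1].mean()
    pa_neg = clf.predict_proba(X_train[y_train == 0])[:, 1].mean()
    return float(np.clip((pa_test - pa_neg) / (pa_pos - pa_neg), 0.0, 1.0))

# Toy data: estimate the positive prevalence of a held-out set.
X, y = make_classification(n_samples=1000, weights=[0.7], random_state=0)
X_train, y_train, X_test = X[:500], y[:500], X[500:]
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print(classify_and_count(clf, X_test))
print(scaled_probability_average(clf, X_train, y_train, X_test))
```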
Receiver Operating Characteristic (ROC) analysis has been successfully applied to classification problems with two classes. The Area Under the ROC Curve (AUC) has been shown to be a better way to evaluate classifiers than predictive accuracy or error. However, the extension of the AUC to more than two classes has not been addressed to date, because of the complexity and elusiveness of its precise definition. In this paper, we present the real extension of the AUC in the form of the Volume Under the ROC Surface (VUS), showing how to compute the polytope that corresponds to the absence of classifiers (given only by the trivial classifiers), to the best classifier, and to any set of classifiers. We compare the real VUS with "approximations" or "extensions" of the AUC for more than two classes.
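For concreteness, the sketch below shows the exact two-class AUC alongside one of the pairwise "approximations" used for more than two classes (the one-vs-one averaging available in scikit-learn's roc_auc_score); it does not compute the true VUS discussed in the paper, and the data are illustrative only.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Two classes: AUC is the probability that a random positive example
# receives a higher score than a random negative one.
y_bin = np.array([0, 0, 1, 1, 1])
scores_bin = np.array([0.1, 0.4, 0.35, 0.8, 0.7])
print(roc_auc_score(y_bin, scores_bin))

# Three classes: no single curve exists, so pairwise AUCs are averaged
# (a multiclass "approximation" rather than a true volume under a surface).
y_multi = np.array([0, 1, 2, 0, 1, 2])
proba = np.array([[0.7, 0.2, 0.1],
                  [0.2, 0.6, 0.2],
                  [0.1, 0.3, 0.6],
                  [0.5, 0.4, 0.1],
                  [0.3, 0.5, 0.2],
                  [0.2, 0.2, 0.6]])
print(roc_auc_score(y_multi, proba, multi_class="ovo"))
```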
A sensible use of classifiers must be based on the estimated reliability of their predictions. A cautious classifier would delegate the difficult or uncertain predictions to other, possibly more specialised, classifiers. In this paper we analyse and develop this idea of delegating classifiers in a systematic way. First, we design a two-step scenario in which a first classifier chooses which examples to classify and delegates the difficult examples to train a second classifier. Second, we present an iterated scenario involving an arbitrary number of chained classifiers. We compare these scenarios with classical ensemble methods, such as bagging and boosting. We show experimentally that our approach is not far behind these methods in terms of accuracy, but offers several advantages: (i) improved efficiency, since each classifier learns from fewer examples than the previous one; (ii) improved comprehensibility, since each classification derives from a single classifier; and (iii) the possibility of simplifying the overall multiclassifier by removing the parts that lead to delegation.
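The two-step scenario can be sketched in a few lines. The following is a minimal illustration, assuming scikit-learn, decision trees as base classifiers, and an arbitrary fixed confidence threshold of 0.75; the paper's actual delegation criteria are not reproduced here.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def train_delegating_pair(X, y, threshold=0.75):
    """Train a first (small) classifier, then train a second classifier only
    on the examples the first one is not confident about (delegated examples)."""
    first = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
    delegated = first.predict_proba(X).max(axis=1) < threshold
    if delegated.any():
        second = DecisionTreeClassifier(random_state=0).fit(X[delegated], y[delegated])
    else:
        second = first  # nothing delegated: fall back to the first classifier
    return first, second

def predict_delegating_pair(first, second, X, threshold=0.75):
    """Keep the first classifier's confident predictions; delegate the rest."""
    proba = first.predict_proba(X)
    keep = proba.max(axis=1) >= threshold
    pred = first.classes_[proba.argmax(axis=1)]
    if (~keep).any():
        pred[~keep] = second.predict(X[~keep])
    return pred
```

The iterated scenario would repeat this construction, each new classifier being trained only on the examples delegated by its predecessor.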