Quantification is the name given to a novel machine learning task which deals with correctly estimating the number of elements of one class in a set of examples. The output of a quantifier is a real value; since the training instances are the same as in a classification problem, a natural approach is to train a classifier and derive a quantifier from it. Previous work has shown that simply classifying the instances and counting those assigned to the class of interest (classify & count) typically yields poor quantifiers, especially when the class distribution may vary between training and test. Hence, adjusted versions of classify & count have been developed that use modified thresholds. However, previous work has explicitly discarded, without a deep analysis, any approach based on the probability estimations of the classifier. In this paper, we present a method based on averaging the probability estimations of a classifier, combined with a very simple scaling, that performs reasonably well, showing that probability estimators for quantification capture a richer view of the problem than threshold-based methods.
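The contrast between classify & count and probability averaging can be made concrete in a few lines of code. The sketch below is a minimal illustration, assuming a binary task with labels in {0, 1} and a scikit-learn classifier that supports predict_proba; the linear rescaling shown (an adjusted-count-style correction applied to average probabilities, with the correction terms estimated by cross-validation on the training data) is one plausible instance of the "very simple scaling" mentioned above, not necessarily the paper's exact formula.

```python
import numpy as np
from sklearn.model_selection import cross_val_predict

def classify_and_count(clf, X_test):
    # Baseline quantifier: predict a hard label for each instance and
    # report the fraction predicted positive.
    return float(np.mean(clf.predict(X_test) == 1))

def scaled_probability_average(clf, X_train, y_train, X_test):
    # Average predicted probability among actual positives and negatives,
    # estimated on the training data via cross-validation.
    p_train = cross_val_predict(clf, X_train, y_train, cv=10,
                                method="predict_proba")[:, 1]
    tp_pa = p_train[y_train == 1].mean()  # avg. probability on positives
    fp_pa = p_train[y_train == 0].mean()  # avg. probability on negatives
    clf.fit(X_train, y_train)
    pa = clf.predict_proba(X_test)[:, 1].mean()  # plain probability average
    # Linear rescaling of the average, clipped to a valid proportion.
    return float(np.clip((pa - fp_pa) / (tp_pa - fp_pa), 0.0, 1.0))
```

The intent of the rescaling is that, unlike the raw count, the averaged and corrected probability tends to track the test prevalence even when it differs from the training prevalence.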
Evaluation of machine learning methods is a crucial step before application, because it is essential to assess how well a model will behave for every single case. In many real applications, not only the "total" or "average" error of the model is important; it is also important to know how this error is distributed and how well confidence or probability estimations are made. However, many machine learning techniques achieve good overall results but have a poor distribution or assessment of the error. In these cases, calibration techniques have been developed as postprocessing techniques that aim at improving the probability estimation or the error distribution of an existing model. In this chapter, we present the most common calibration techniques and calibration measures. We cover both classification and regression, establish a taxonomy of calibration techniques, and then pay special attention to the calibration of probabilistic classifiers.
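As an illustration of calibration as a postprocessing step, the sketch below applies two standard techniques available in scikit-learn, Platt scaling ("sigmoid") and isotonic regression, to a Naive Bayes classifier, whose probability estimates are often poorly calibrated, and compares Brier scores. The dataset and model choices here are illustrative assumptions, not examples drawn from the chapter.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.metrics import brier_score_loss
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Synthetic binary task (hypothetical data, for illustration only).
X, y = make_classification(n_samples=2000, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Uncalibrated model vs. two post-hoc calibrated versions of it.
raw = GaussianNB().fit(X_tr, y_tr)
platt = CalibratedClassifierCV(GaussianNB(), method="sigmoid", cv=5).fit(X_tr, y_tr)
iso = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5).fit(X_tr, y_tr)

# The Brier score is one common calibration-sensitive measure
# (lower is better); calibrated models should score no worse.
for name, model in [("raw", raw), ("platt", platt), ("isotonic", iso)]:
    p = model.predict_proba(X_te)[:, 1]
    print(name, "Brier score:", round(brier_score_loss(y_te, p), 4))
```

Note that calibration here leaves the underlying classifier untouched and only remaps its scores, which is what makes it a postprocessing technique in the sense described above.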