A novel framework, based on the statistical interpretation of boosting, is proposed for the design of cost-sensitive boosting algorithms. It is argued that, although predictors produced with boosting converge to the ratio of posterior class probabilities that also appears in the Bayes decision rule, this convergence only occurs in a small neighborhood of the optimal cost-insensitive classification boundary. This is due to a combination of the cost-insensitive nature of current boosting losses and boosting's sample reweighting mechanism. It is then shown that convergence in the neighborhood of a target cost-sensitive boundary can be achieved through boosting-style minimization of extended, cost-sensitive losses. The framework is applied to the design of specific algorithms through the introduction of cost-sensitive extensions of the exponential and binomial losses. Minimization of these losses leads to cost-sensitive extensions of the popular AdaBoost, RealBoost, and LogitBoost algorithms. Experimental validation, on various UCI datasets and the computer vision problem of face detection, shows that the new algorithms substantially improve performance over what was achievable with previous cost-sensitive boosting approaches.
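To make the idea of a cost-sensitive boundary concrete, the LaTeX sketch below shows one common cost-sensitive generalization of the exponential loss and the boundary its pointwise minimizer targets. The symbols C_1 and C_2 (costs of missing a positive and of a false positive) and this exact functional form are illustrative assumptions here, not quoted from the abstract; the paper defines its own extended losses.

```latex
% Illustrative cost-sensitive exponential loss (assumed form):
\[
  L_{C_1,C_2}\big(y, f(x)\big)
    = \mathbb{1}[y = +1]\, e^{-C_1 f(x)}
    + \mathbb{1}[y = -1]\, e^{C_2 f(x)}
\]
% Minimizing the conditional risk over f gives
\[
  f^*(x) = \frac{1}{C_1 + C_2}
           \log \frac{C_1\, P(y = +1 \mid x)}{C_2\, P(y = -1 \mid x)},
\]
% so the zero level set of f^* is the cost-sensitive Bayes boundary
% C_1 P(y = +1 | x) = C_2 P(y = -1 | x); setting C_1 = C_2 recovers
% the cost-insensitive boundary targeted by the standard exponential loss.
```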
A cost-sensitive extension of boosting, denoted as asymmetric boosting, is presented. Unlike previous proposals, the new algorithm is derived from sound decision-theoretic principles, which exploit the statistical interpretation of boosting to determine a principled extension of the boosting loss. Similarly to AdaBoost, the cost-sensitive extension minimizes this loss by gradient descent on the functional space of convex combinations of weak learners, and produces large margin detectors. It is shown that asymmetric boosting is fully compatible with AdaBoost, in the sense that it becomes the latter when errors are weighted equally. Experimental evidence is provided to demonstrate the claims of cost-sensitivity and large margin. The algorithm is also applied to the computer vision problem of face detection, where it is shown to outperform a number of previous heuristic proposals for cost-sensitive boosting (AdaCost, CSB0, CSB1, CSB2, asymmetricAdaBoost, AdaC1, AdaC2 and AdaC3).
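The Python sketch below illustrates the general structure shared by these methods: per-example costs, sample reweighting, and a final classifier that is a convex combination of weak learners. It is only a minimal sketch under assumed cost parameters (C_pos, C_neg) and is not the asymmetric-boosting algorithm of the paper, whose step sizes follow from a decision-theoretic cost-sensitive loss; with equal costs it reduces to standard discrete AdaBoost.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def cost_sensitive_boosting_sketch(X, y, C_pos=2.0, C_neg=1.0, n_rounds=50):
    """Illustrative cost-sensitive AdaBoost-style loop (not the paper's algorithm).

    y takes values in {-1, +1}; C_pos and C_neg are the costs of
    misclassifying positive and negative examples, respectively.
    """
    # Per-example costs encode the asymmetry between error types.
    c = np.where(y == 1, C_pos, C_neg).astype(float)
    # Initialize weights proportionally to the costs, so costly
    # examples influence the first weak learner more strongly.
    w = c / c.sum()
    learners, alphas = [], []

    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1)
        stump.fit(X, y, sample_weight=w)
        pred = stump.predict(X)

        # Weighted training error and standard AdaBoost step size.
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)

        # Exponential reweighting: misclassified examples gain weight.
        w *= np.exp(-alpha * y * pred)
        w /= w.sum()

        learners.append(stump)
        alphas.append(alpha)

    def predict(X_new):
        # Convex combination of weak learners, thresholded at zero.
        scores = sum(a * m.predict(X_new) for a, m in zip(alphas, learners))
        return np.sign(scores)

    return predict
```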