2015
DOI: 10.1007/s10994-015-5491-2

Soft-max boosting

Abstract: The standard multi-class classification risk, based on the binary loss, is rarely minimized directly. This is due to (1) the lack of convexity and (2) the lack of smoothness (and even continuity). The classic approach consists in minimizing a convex surrogate instead. In this paper, we propose to replace the usually considered deterministic decision rule by a stochastic one, which allows obtaining a smooth risk (generalizing the expected binary loss, and more generally the cost-sensitive loss). Practically, th…
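As a sketch of the construction the abstract describes, with the soft-max form of the stochastic rule assumed here for illustration: the deterministic rule predicts the arg-max score, whose risk under the binary loss

\[
  R(f) = \mathbb{E}_{(x,y)}\big[\mathbf{1}\{\arg\max_k f_k(x) \neq y\}\big]
\]

is neither convex nor smooth; drawing the predicted label from a soft-max distribution instead,

\[
  \pi_f(k \mid x) = \frac{e^{f_k(x)}}{\sum_j e^{f_j(x)}},
  \qquad
  R_s(f) = \mathbb{E}_{(x,y)}\big[1 - \pi_f(y \mid x)\big],
\]

yields a risk that is smooth in f and equals the expected binary loss of the stochastic rule.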

Cited by 6 publications (4 citation statements)
References 20 publications
“…The algorithm constructs an additive expansion of the objective function by minimizing a loss function, a technique that it shares with gradient boosting [39]. In the context of multi-class classification problems, XGBoost employs a variant of gradient boosting called Softmax Boosting to optimize a softmax cross-entropy loss function [40]. The softmax function is utilized to transform the model outputs into a probability distribution over the classes.…”
Section: Multi-class Classification Using XGBoost (mentioning)
confidence: 99%
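As context for the statement above, a minimal sketch of a softmax cross-entropy objective as it plugs into stage-wise gradient boosting (function names and the NumPy encoding are illustrative, not XGBoost's internal API):

import numpy as np

def softmax(scores):
    # scores: (n, K) raw margins accumulated by the additive ensemble
    z = scores - scores.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def softmax_cross_entropy(scores, y):
    # y: (n,) integer class labels in {0, ..., K-1}
    p = softmax(scores)
    n = len(y)
    loss = -np.mean(np.log(p[np.arange(n), y]))
    grad = (p - np.eye(scores.shape[1])[y]) / n  # dL/dscores = softmax(scores) - one_hot(y)
    return loss, grad

Each boosting round would fit the next set of trees to the negative of this gradient, one column per class; the softmax function is what turns the raw ensemble outputs into the probability distribution the quote refers to.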
“…These works demonstrate a lack of noise tolerance for boosting and empirical risk minimization based on convex losses, and suggest that any approach based on convex risk minimization will require modification of the loss, [...]"
- [9] "For example, the random noise (Long and Servedio 2010) defeats all convex potential boosters [...]"
- [24] "Long and Servedio (2010) proved that any convex potential loss is not robust to uniform or symmetric label noise."
- [27] "We previously [23] showed that any boosting algorithm that works by stagewise minimization of a convex "potential function" cannot tolerate random classification noise"
- [41] "However, the convex loss functions are shown to be prone to mistakes when outliers exist [25]."
- [85] "[...] However, Long and Servedio (2010) pointed out that any boosting algorithm with convex loss functions is highly susceptible to a random label noise model."…”
Section: What the Papers Say (mentioning)
confidence: 99%
“… - [48] "This is as opposed to most boosting algorithms that are highly susceptible to outliers [24]."
- [56] "Moreover, in the case of boosting, it has been shown that convex boosters are necessarily sensitive to noise (Long and Servedio 2010 [...]"
- [25] "Ostensibly, this result establishes that convex losses are not robust to symmetric label noise, and motivates using non-convex losses [40,31,17,15,30]."
- [77] "Interestingly, (Long and Servedio, 2010) established a lower bound against potential-based convex boosting techniques in the presence of RCN."…”
Section: What the Papers Say (mentioning)
confidence: 99%
“…By functional gradient we mean the derivative in the Fréchet sense, for the appropriate Hilbert space, that is, the set of equivalence classes of functions g ∈ ℝ^(S×A) such that Σ_s ν(s) Σ_a g(s,a)² is finite, equipped with the inner product ⟨g₁, g₂⟩ = Σ_s ν(s) Σ_a g₁(s,a) g₂(s,a). See for example (Geist, 2015) for more details on this type of space.…”
Section: Connection to Conservative Policy Iteration (unclassified)
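For a finite state-action space, the ν-weighted inner product above reduces to a plain weighted sum; a minimal sketch, assuming ν and the functions g are encoded as NumPy arrays (an illustrative choice, not the cited paper's notation):

import numpy as np

def inner_product(nu, g1, g2):
    # nu: (S,) state weights ν(s); g1, g2: (S, A) functions on state-action pairs
    return float(np.sum(nu[:, None] * g1 * g2))

def norm_sq(nu, g):
    # finiteness of ⟨g, g⟩ is the membership condition defining the Hilbert space
    return inner_product(nu, g, g)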