Probability estimation for large-margin classifiers

Wang, Junhui; Shen, Xiaotong; Liu, Yufeng

doi:10.1093/biomet/asm077

Cited by 67 publications

(121 citation statements)

References 26 publications


“…There is a direct analogy between hard- and soft-margin classifiers wherein hard-margin classifiers directly estimate the decision boundary and soft-margin classifiers back-out the decision boundary through conditional class probabilities. When the class probabilities are complex, hard-margin classifiers may lead to improved performance (Wang et al, 2008); likewise, when the conditional expectations are complex, directly targeting the decision rule is likely to yield improved performance. Since the proposed methods do not model the relationship between outcomes and DTRs, they may be more robust to model misspecification than statistical modeling alternatives such as Q-learning (Zhang et al, 2012b,a).…”

Section: Discussion (mentioning)

confidence: 99%

New Statistical Learning Methods for Estimating Optimal Dynamic Treatment Regimes

Zhao

Zeng

Laber

et al. 2015

Journal of the American Statistical Association

226

247


Dynamic treatment regimes (DTRs) are sequential decision rules for individual patients that can adapt over time to an evolving illness. The goal is to accommodate heterogeneity among patients and find the DTR which will produce the best long term outcome if implemented. We introduce two new statistical learning methods for estimating the optimal DTR, termed backward outcome weighted learning (BOWL), and simultaneous outcome weighted learning (SOWL). These approaches convert individualized treatment selection into an either sequential or simultaneous classification problem, and can thus be applied by modifying existing machine learning techniques. The proposed methods are based on directly maximizing over all DTRs a nonparametric estimator of the expected long-term outcome; this is fundamentally different than regression-based methods, for example Q-learning, which indirectly attempt such maximization and rely heavily on the correctness of postulated regression models. We prove that the resulting rules are consistent, and provide finite sample bounds for the errors using the estimated rules. Simulation results suggest the proposed methods produce superior DTRs compared with Q-learning especially in small samples. We illustrate the methods using data from a clinical trial for smoking cessation.


“…Thus motivated, Lin et al (2004) proposed weighted SVM (WSVM) by weighting observations from different classes with different weights in the training process and established its Fisher consistency. Based on the WSVM's Fisher consistency, Wang et al (2008) proposed a probability estimation scheme to estimate the conditional probability for each new observation belonging to each class. Particularly inspired from this probability estimation scheme using the WSVM to address the aforementioned homogeneity issue in partitioning a binary response, Shin et al (2014) proposed a probability-enhanced dimension reduction method for multivariate data.…”

Section: Support Vector Machine (mentioning)

confidence: 99%

Probability-enhanced effective dimension reduction for classifying sparse functional data

Yao

Zou

2016

TEST


We consider the classification of sparse functional data that are often encountered in longitudinal studies and other scientific experiments. To utilize the information from not only the functional trajectories but also the observed class labels, we propose a probability-enhanced method achieved by weighted support vector machine based on its Fisher consistency property to estimate the effective dimension reduction space. Since only a few measurements are available for some, even all, individuals, a cumulative slicing approach is suggested to borrow information across individuals. We provide justification for validity of the probability-based effective dimension reduction space, and a straightforward implementation that yields a low-dimensional projection space ready for applying standard classifiers. The empirical performance is illustrated through simulated and real examples, particularly in contrast to classification results based on the prominent functional principal component analysis.


“…By using different weights for observations in different classes, Wang, Shen, and Liu (2008) proposed to solve the weighted SVM where 0 < π < 1 and f(·) can be either linear or nonlinear with an appropriately chosen penalty J(f). For the weighted SVM (6.2), its classification boundary is shown to consistently estimate the boundary {x : p(x) = π}.…”

Section: Probability Estimation For Binary Hard Classifiers (mentioning)

confidence: 99%
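
The excerpt above refers to a displayed equation (6.2) that is not reproduced here. As a hedged illustration only, one common way to write a weighted SVM of this kind is shown below; the scaling, the hinge-loss notation (u)_+ = max(u, 0), and the tuning parameter λ are assumptions and may differ from the cited text.

```latex
\min_{f}\; \frac{1}{n}\Bigl[(1-\pi)\sum_{i:\,y_i=+1}\bigl(1-y_i f(x_i)\bigr)_+
  \;+\; \pi\sum_{i:\,y_i=-1}\bigl(1-y_i f(x_i)\bigr)_+\Bigr] \;+\; \lambda J(f),
  \qquad 0<\pi<1 .
% Pointwise, the weighted hinge risk at x is
%   (1-\pi)\,p(x)\,(1-f)_+ \;+\; \pi\,(1-p(x))\,(1+f)_+ ,
% which is minimised by f = +1 when (1-\pi)p(x) > \pi(1-p(x)), i.e. p(x) > \pi,
% and by f = -1 when p(x) < \pi. Hence \mathrm{sign}(f(x)) targets
% \mathrm{sign}(p(x)-\pi), and the boundary targets \{x : p(x)=\pi\}.
```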

“…But the information is limited in the sense that it is only capable of telling whether the conditional probability is larger than π. Wang, Shen, and Liu (2008) proposed to solve weighted SVMs for different weights π ∈ (0,1). Then an interval estimate for the conditional class probability is obtained.…”

Section: Probability Estimation For Binary Hard Classifiers (mentioning)

confidence: 99%
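
The bracketing idea described in the two excerpts above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names (`estimate_probability`, `oracle_sign`, `true_p`), the evenly spaced grid, and the midpoint rule are assumptions made here, and an oracle rule sign(p(x) − π) stands in for weighted SVMs that would in practice be trained once per weight π.

```python
import math
from typing import Sequence


def estimate_probability(signs: Sequence[int], grid: Sequence[float]) -> float:
    """Midpoint of the bracket implied by weighted-classifier labels.

    signs[j] is the label (+1 or -1) predicted at one point x by the
    classifier fitted with weight grid[j]; grid is increasing in (0, 1).
    """
    # Largest weight at which x is still labelled +1 (suggests p(x) > pi).
    lower = max((pi for pi, s in zip(grid, signs) if s > 0), default=0.0)
    # Smallest weight at which x is labelled -1 (suggests p(x) <= pi).
    upper = min((pi for pi, s in zip(grid, signs) if s < 0), default=1.0)
    return 0.5 * (lower + upper)


def true_p(x: float) -> float:
    """Hypothetical conditional class probability, used only for the demo."""
    return 1.0 / (1.0 + math.exp(-x))


def oracle_sign(pi: float, x: float) -> int:
    """Stand-in for a trained weighted classifier: sign(p(x) - pi)."""
    return 1 if true_p(x) > pi else -1


m = 20                                   # assumed grid resolution
grid = [j / m for j in range(1, m)]      # weights 0.05, 0.10, ..., 0.95
x = 0.8
signs = [oracle_sign(pi, x) for pi in grid]
p_hat = estimate_probability(signs, grid)
print(f"bracket midpoint: {p_hat:.3f}, true p(x): {true_p(x):.3f}")
```

With an ideal classifier at every weight, the midpoint is within 1/(2m) of p(x), so the grid resolution m bounds the estimation precision. Fitted classifiers need not produce labels that are monotone in π, which is why the sketch takes the largest "+1" weight and the smallest "−1" weight rather than assuming a single sign change.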

Flexible Large Margin Classifiers

Liu

Wu

2010

High-Dimensional Data Analysis


Classification is an important tool for statistical analysis. Among numerous classification methods, margin-based techniques have attracted a lot of attention due to its competitive performance and ability in handling complex and high dimensional data. In this chapter, we review some recent advances of large margin classifiers. We start with the Support Vector Machine in terms of margin and maximum separation. Then we view the SVM in the regularization framework and compare it with several other existing classifiers. Recent extensions of the SVM such as ψ-learning, robust SVM (RSVM), bounded constraint machine (BCM), and balancing SVM (BSVM) are discussed. Issues on multicategory classification and various extensions of binary classifiers to the multicategory case are explored. Finally, issues on hard classifiers and the corresponding class probability estimation problem are briefly mentioned.
