Peter L. Bartlett scite author profile

We propose a new class of support vector algorithms for regression and classification. In these algorithms, a parameter nu lets one effectively control the number of support vectors. While this can be useful in its own right, the parameterization has the additional benefit of enabling us to eliminate one of the other free parameters of the algorithm: the accuracy parameter epsilon in the regression case, and the regularization constant C in the classification case. We describe the algorithms, give some theoretical results concerning the meaning and the choice of nu, and report experimental results.

show abstract

Rademacher and Gaussian Complexities: Risk Bounds and Structural Results

Bartlett¹,

Mendelson

2001

883

1,273

View full text Add to dashboard Cite

We investigate the use of certain data-dependent estimates of the complexity of a function class, called Rademacher and gaussian complexities. In a decision theoretic setting, we prove general risk bounds in terms of these complexities. We consider function classes that can be expressed as combinations of functions from basis classes and show how the Rademacher and gaussian complexities of such a function class can be bounded in terms of the complexity of the basis classes. We give examples of the application of these techniques in finding data-dependent risk bounds for decision trees, neural networks and support vector machines.

show abstract

Boosting the margin: a new explanation for the effectiveness of voting methods

Schapire¹,

Freund²,

et al. 1998

View full text Add to dashboard Cite

Abstract. One of the surprising recurring phenomena observed in experiments with boosting is that the test error of the generated classifier usually does not increase as its size becomes very large, and often is observed to decrease even after the training error reaches zero. In this paper, we show that this phenomenon is related to the distribution of margins of the training examples with respect to the generated voting classification rule, where the margin of an example is simply the difference between the number of correct votes and the maximum number of votes received by any incorrect label. We show that techniques used in the analysis of Vapnik's support vector classifiers and of neural networks with small weights can be applied to voting methods to relate the margin distribution to the test error. We also show theoretically and experimentally that boosting is especially effective at increasing the margins of the training examples. Finally, we compare our explanation to those based on the bias-variance decomposition.

show abstract

Convexity, Classification, and Risk Bounds

Bartlett

Jordan

McAuliffe

2006

Journal of the American Statistical Association

772

907

View full text Add to dashboard Cite

Many of the classification algorithms developed in the machine learning literature, including the support vector machine and boosting, can be viewed as minimum contrast methods that minimize a convex surrogate of the 0-1 loss function. The convexity makes these algorithms computationally efficient. The use of a surrogate, however, has statistical consequences that must be balanced against the computational virtues of convexity. To study these issues, we provide a general quantitative relationship between the risk as assessed using the 0-1 loss and the risk as assessed using any nonnegative surrogate loss function. We show that this relationship gives nontrivial upper bounds on excess risk under the weakest possible condition on the loss function: that it satisfy a pointwise form of Fisher consistency for classification. The relationship is based on a simple variational transformation of the loss function that is easy to compute in many applications. We also present a refined version of this result in the case of low noise. Finally, we present applications of our results to the estimation of convergence rates in the general setting of function classes that are scaled convex hulls of a finite-dimensional base class, with a variety of commonly used loss functions.

show abstract

scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.

Contact Info

customersupport@researchsolutions.com

10624 S. Eastern Ave., Ste. A-614

Henderson, NV 89052, USA

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Blog Terms and Conditions API Terms Privacy Policy Contact Cookie Preferences Do Not Sell or Share My Personal Information

Made with 💙 for researchers

Part of the Research Solutions Family.

Peter L. Bartlett

New Support Vector Algorithms

Rademacher and Gaussian Complexities: Risk Bounds and Structural Results

Boosting the margin: a new explanation for the effectiveness of voting methods

Convexity, Classification, and Risk Bounds

Contact Info

Product

Resources

About