In this paper, we study data-dependent generalization error bounds exhibiting a mild dependency on the number of classes, making them suitable for multi-class learning with a large number of label classes. The bounds generally hold for empirical multi-class risk minimization algorithms using an arbitrary norm as regularizer. Key to our analysis are new structural results for multi-class Gaussian complexities and empirical ∞-norm covering numbers, which exploit the Lipschitz continuity of the loss function with respect to the 2- and ∞-norms, respectively. We establish data-dependent error bounds in terms of complexities of a linear function class defined on a finite set induced by training examples, for which we show tight lower and upper bounds. We apply the results to several prominent multi-class learning machines, exhibiting a tighter dependency on the number of classes than the state of the art. For instance, for the multi-class SVM by Crammer and Singer (2002), we obtain a data-dependent bound with a logarithmic dependency, which significantly improves the previous square-root dependency. Experimental results are reported to verify the effectiveness of our theoretical findings.
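To make the Lipschitz assumption concrete, here is a sketch (an illustration, not taken from the paper) of the standard Crammer–Singer margin loss together with the ∞-norm Lipschitz property that such analyses exploit:

\[
\ell(f(x), y) = \max_{k \neq y}\,\bigl(1 + f_k(x) - f_y(x)\bigr)_+ , \qquad
\bigl|\ell(u, y) - \ell(v, y)\bigr| \le 2\,\lVert u - v \rVert_\infty ,
\]

where \(f(x) = (f_1(x), \ldots, f_K(x))\) collects the \(K\) class scores. The constant 2 follows because perturbing all scores by at most \(\varepsilon\) in the ∞-norm changes each margin term \(1 + f_k - f_y\) by at most \(2\varepsilon\), and both the maximum and the positive part preserve this bound.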
In the presented work we compare machine learning techniques in the context of lane-change behavior performed by human drivers in a semi-naturalistic simulated environment. We evaluate different learning approaches with differing feature combinations in order to identify suitable features, the best feature combination, and the most appropriate machine learning technique for the described task. Based on the data acquired from human drivers in the traffic simulator NISYS TRS 1, we trained a recurrent neural network, a feed-forward neural network, and a set of support vector machines. In the subsequent test drives, the system was able to predict lane changes up to 1.5 s in advance.
We introduce extensions of stability selection, a method introduced by Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010) to stabilise variable selection procedures. We propose to apply a base selection method repeatedly to random subsamples of observations and subsets of covariates under scrutiny, and to select covariates based on their selection frequency. We analyse the effects and benefits of these extensions. Our analysis generalizes the theoretical results of Meinshausen and Bühlmann (J R Stat Soc 72:417-473, 2010) from the case of half-samples to subsamples of arbitrary size. We study, in a theoretical manner, the effect of taking random covariate subsets using a simplified score model. Finally, we validate these extensions in numerical experiments on both synthetic and real datasets, and compare the obtained results in detail to the original stability selection method.
Keywords: variable selection · stability selection · subsampling
A preliminary version of this work was presented at the conference DAGM 2012 (Beinrucker et al., 2012b).
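As a minimal illustration of the extended procedure described above (observation subsampling combined with random covariate subsets, selecting by frequency), the following Python sketch uses the Lasso as the base selector. The function name, defaults, and threshold are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.linear_model import Lasso

def stability_selection(X, y, n_rounds=100, sample_frac=0.5,
                        covariate_frac=0.5, alpha=0.1, threshold=0.6,
                        seed=None):
    """Toy stability selection: subsample rows AND covariates each round,
    then keep covariates whose selection frequency exceeds a threshold."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    selected = np.zeros(p)      # times each covariate was selected
    appeared = np.zeros(p)      # times each covariate was in the subset
    for _ in range(n_rounds):
        rows = rng.choice(n, size=int(sample_frac * n), replace=False)
        cols = rng.choice(p, size=int(covariate_frac * p), replace=False)
        appeared[cols] += 1
        model = Lasso(alpha=alpha).fit(X[np.ix_(rows, cols)], y[rows])
        selected[cols[model.coef_ != 0]] += 1
    # Frequency is relative to how often a covariate was actually scrutinized.
    freq = selected / np.maximum(appeared, 1)
    return np.where(freq >= threshold)[0], freq
```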
Training of one-vs.-rest SVMs can be parallelized over the classes in a straightforward way. Given enough computational resources, one-vs.-rest SVMs can thus be trained on data involving a large number of classes. The same cannot be said, however, for the so-called all-in-one SVMs, which require solving a quadratic program whose size grows quadratically with the number of classes. We develop distributed algorithms for two all-in-one SVM formulations (Lee et al. and Weston and Watkins) that parallelize the computation evenly over the classes. This allows us to compare these models to one-vs.-rest SVMs at an unprecedented scale. The results indicate superior accuracy on text classification data.
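For contrast with the all-in-one solvers discussed above, here is a minimal sketch of the embarrassingly parallel one-vs.-rest training the abstract refers to. LinearSVC and the process pool are illustrative choices, not the authors' setup; the distributed all-in-one algorithms are substantially more involved:

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np
from sklearn.svm import LinearSVC

def _fit_class(args):
    # One independent binary problem per class: class k vs. all others.
    X, y, k = args
    return k, LinearSVC().fit(X, (y == k).astype(int))

def train_one_vs_rest(X, y, n_workers=4):
    classes = np.unique(y)
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        models = dict(pool.map(_fit_class, [(X, y, k) for k in classes]))
    return models

def predict(models, X):
    # Assign each example to the class with the largest decision value.
    keys = sorted(models)
    scores = np.column_stack([models[k].decision_function(X) for k in keys])
    return np.array(keys)[scores.argmax(axis=1)]
```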