We study a class of algorithms that speed up the training of support vector machines (SVMs) by returning an approximate SVM. We focus on algorithms that reduce the size of the optimization problem by extracting a small number of representatives from the original training dataset and using these representatives to train an approximate SVM. The main contribution of this paper is a PAC-style generalization bound for the resulting approximate SVM, which provides a learning-theoretic justification for using the approximate SVM in practice. The proven bound also generalizes, and includes as a special case, the generalization bound for the exact SVM, which in this paper denotes the SVM obtained from the original training dataset.
Keywords: Support Vector Machines, Approximate Solutions, Generalization Bounds, Algorithmic Stability

1 Introduction

One challenge in using support vector machines (SVMs) [8,28] for problems with large training datasets, which are common in data mining applications, is the prohibitive computational cost of training, which involves solving a convex optimization problem. To address this issue, many efficient training algorithms have been proposed. The first class of algorithms attacks the optimization problem directly. A commonly used strategy is to solve a series of small optimization problems, using ideas such as chunking and decomposition [4,17,22]; one noteworthy example is the sequential minimal optimization (SMO) algorithm [24]. Special algorithms have also been developed for SVMs with particular kernels, such as the linear kernel [18] and the Gaussian kernel [30]. We refer to the SVM given by these algorithms, and by other algorithms that use the original training dataset directly, as the exact SVM.
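To make the contrast between the exact SVM and the representative-based approximate SVM concrete, the following is a minimal sketch in Python using scikit-learn. It uses per-class k-means centroids as one hypothetical representative-extraction scheme; this is only an illustration of the general strategy, not the specific algorithms analyzed in this paper.

    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.svm import SVC

    # Toy data standing in for a large training set.
    rng = np.random.RandomState(0)
    X = rng.randn(5000, 10)
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

    # Exact SVM: trained directly on the full dataset.
    exact_svm = SVC(kernel="rbf").fit(X, y)

    # Approximate SVM: extract a small set of representatives
    # (here, k-means centroids per class -- an illustrative choice,
    # not the extraction rule prescribed by the paper) and train on those.
    reps, rep_labels = [], []
    for label in np.unique(y):
        centers = KMeans(n_clusters=50, n_init=10, random_state=0).fit(
            X[y == label]).cluster_centers_
        reps.append(centers)
        rep_labels.append(np.full(len(centers), label))
    approx_svm = SVC(kernel="rbf").fit(np.vstack(reps), np.concatenate(rep_labels))

    print("exact SVM training accuracy: ", exact_svm.score(X, y))
    print("approx SVM training accuracy:", approx_svm.score(X, y))

The approximate SVM is trained on 100 representatives rather than 5000 points, so the optimization problem is much smaller; the question addressed by the generalization bound in this paper is how much predictive performance such an approximation can be expected to retain.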