The constrained optimization (dual QP) problem for the SVM classifier is

$$\max_{\alpha}\; -\frac{1}{2}\sum_{k,l=1}^{N} y_k y_l\, K(x_k, x_l)\,\alpha_k \alpha_l + \sum_{k=1}^{N} \alpha_k$$

subject to

$$\sum_{k=1}^{N} \alpha_k y_k = 0, \qquad 0 \le \alpha_k \le c, \quad k = 1, \dots, N.$$

Note: $w$ and $\varphi(x_k)$ are not calculated.

• Mercer condition: $K(x_k, x_l) = \varphi(x_k)^T \varphi(x_l)$.

• Obtained classifier: $y(x) = \operatorname{sign}\!\left[\sum_{k=1}^{N} \alpha_k y_k\, K(x, x_k) + b\right]$, with $\alpha_k$ positive real constants and $b$ a real constant that follow as the solution of the QP problem. Nonzero $\alpha_k$ are called support values and the corresponding data points are called support vectors. The bias term $b$ follows from the KKT conditions.

• Some possible kernels $K(\cdot, \cdot)$ (a code sketch follows below):
$K(x, x_k) = x_k^T x$ (linear SVM)
$K(x, x_k) = (x_k^T x + 1)^d$ (polynomial SVM of degree $d$)
$K(x, x_k) = \exp\{-\lVert x - x_k \rVert_2^2 / \sigma^2\}$ (RBF SVM)
$K(x, x_k) = \tanh(\kappa\, x_k^T x + \theta)$ (MLP SVM)

• In the case of the RBF and MLP kernels, the number of hidden units corresponds to the number of support vectors.
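A minimal NumPy sketch (not from the source) of the pieces above: the four listed kernels as plain functions, and the resulting decision function $y(x)$. The function and parameter names are our own, and the support values `alphas` and bias `b` are assumed to come from an external QP solver for the dual problem.

```python
import numpy as np

# The four kernels listed above, as plain functions of a query point x
# and a training point xk (both 1-D NumPy arrays).
def linear_kernel(x, xk):
    return xk @ x

def poly_kernel(x, xk, d=3):
    return (xk @ x + 1.0) ** d

def rbf_kernel(x, xk, sigma=1.0):
    return np.exp(-np.sum((x - xk) ** 2) / sigma ** 2)

def mlp_kernel(x, xk, kappa=1.0, theta=0.0):
    return np.tanh(kappa * (xk @ x) + theta)

def svm_decision(x, support_vectors, support_labels, alphas, b,
                 kernel=rbf_kernel):
    """y(x) = sign( sum_k alpha_k y_k K(x, x_k) + b ).

    alphas and b are assumed to have been obtained from a QP solver for
    the dual problem above; only the points with alpha_k > 0 (the
    support vectors) need to be kept.
    """
    s = sum(a * yk * kernel(x, xk)
            for a, yk, xk in zip(alphas, support_labels, support_vectors))
    return np.sign(s + b)
```

Note that $w$ and $\varphi(x_k)$ never appear: the decision function is evaluated entirely through the kernel, consistent with the Mercer condition above.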
In this paper, we study the performance of various state-of-the-art classification algorithms applied to eight real-life credit scoring data sets, some of which originate from major Benelux and UK financial institutions. Different types of classifiers are evaluated and compared. Besides well-known classification algorithms (e.g. logistic regression, discriminant analysis, k-nearest neighbour, neural networks and decision trees), this study also investigates the suitability and performance of some recently proposed, advanced kernel-based classification algorithms such as support vector machines (SVMs) and least-squares support vector machines (LS-SVMs). Performance is assessed using classification accuracy and the area under the receiver operating characteristic curve (AUC). Statistically significant performance differences are identified using the appropriate test statistics. It is found that both the LS-SVM and neural network classifiers yield very good performance, but simple classifiers such as logistic regression and linear discriminant analysis also perform very well for credit scoring.
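As an illustration of the evaluation protocol this abstract describes (not the paper's own code; the credit scoring data sets are not public, so a synthetic data set stands in), the following scikit-learn sketch fits several of the named classifier families and reports test-set accuracy and AUC:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

# Synthetic stand-in for a binary (good/bad) credit scoring data set.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "linear discriminant": LinearDiscriminantAnalysis(),
    "k-nearest neighbour": KNeighborsClassifier(n_neighbors=10),
    "decision tree":       DecisionTreeClassifier(max_depth=5),
    "neural network":      MLPClassifier(max_iter=1000, random_state=0),
    "SVM (RBF)":           SVC(kernel="rbf", probability=True),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name:22s}  accuracy={acc:.3f}  AUC={auc:.3f}")
```

The paper additionally tests whether the observed differences in accuracy and AUC are statistically significant; that step is omitted here.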
Abstract. In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark data sets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifiers with RBF kernels, in combination with standard cross-validation procedures for hyperparameter selection, achieve comparable test set performance. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature, including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI data sets that the LS-SVM sparse approximation procedure can be successfully applied.
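A minimal sketch of the LS-SVM classifier formulation this abstract refers to, assuming an RBF kernel and binary targets $y_k \in \{-1, +1\}$: training reduces to one set of linear equations rather than a QP problem, and sparseness can then be imposed by pruning small support values. The function names are our own, and the hyperparameter re-optimization the paper performs during pruning is omitted.

```python
import numpy as np

def lssvm_train(X, y, gamma=1.0, sigma=1.0):
    """Solve the LS-SVM classifier's dual linear system (RBF kernel):

        [ 0      y^T             ] [  b  ]   [ 0 ]
        [ y      Omega + I/gamma ] [alpha] = [ 1 ]

    with Omega[k, l] = y_k y_l K(x_k, x_l). gamma is the regularization
    hyperparameter and sigma the RBF kernel width.
    """
    N = len(y)
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
    Omega = np.outer(y, y) * np.exp(-sq / sigma ** 2)
    A = np.zeros((N + 1, N + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = Omega + np.eye(N) / gamma
    rhs = np.concatenate(([0.0], np.ones(N)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]                      # b, alpha

def sparse_approximation(X, y, gamma=1.0, sigma=1.0, drop_frac=0.05, rounds=5):
    """Impose sparseness by gradually pruning the support value spectrum:
    repeatedly drop the fraction of points with the smallest |alpha_k|
    and re-solve on the remaining points."""
    idx = np.arange(len(y))
    for _ in range(rounds):
        b, alpha = lssvm_train(X[idx], y[idx], gamma, sigma)
        keep = np.argsort(np.abs(alpha))[int(drop_frac * len(idx)):]
        idx = idx[np.sort(keep)]
    b, alpha = lssvm_train(X[idx], y[idx], gamma, sigma)
    return idx, b, alpha
```

The resulting classifier is evaluated exactly as in the SVM case, $y(x) = \operatorname{sign}[\sum_k \alpha_k y_k K(x, x_k) + b]$, but because of the 2-norm cost every $\alpha_k$ is generically nonzero, which is why the pruning stage is needed for sparseness.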
The Bayesian evidence framework is applied in this paper to least squares support vector machine (LS-SVM) regression in order to infer nonlinear models for predicting a financial time series and the related volatility. On the first level of inference, a statistical framework is related to the LS-SVM formulation, which allows one to include the time-varying volatility of the market through an appropriate choice of several hyperparameters. The hyperparameters of the model are inferred on the second level of inference. The inferred hyperparameters related to the volatility are used to construct a volatility model within the evidence framework. Model comparison is performed on the third level of inference in order to automatically tune the parameters of the kernel function and to select the relevant inputs. The LS-SVM formulation allows one to derive analytic expressions in the feature space; practical expressions are obtained in the dual space by replacing the inner product with the related kernel function using Mercer's theorem. The one-step-ahead prediction results for the weekly 90-day T-bill rate and the daily DAX30 closing prices show that significant out-of-sample sign predictions can be made with respect to the Pesaran-Timmermann test statistic.
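For concreteness, here is a sketch of the Pesaran-Timmermann (1992) sign-prediction test used for evaluation in this abstract. It is our own implementation of the published statistic, not code from the paper.

```python
import numpy as np
from scipy.stats import norm

def pesaran_timmermann(actual, predicted):
    """Pesaran-Timmermann (1992) test for sign predictability: is the
    proportion of correctly predicted signs significantly better than
    chance, given the marginal sign frequencies?  Returns the PT
    statistic (asymptotically standard normal under the null of no
    predictability) and the one-sided p-value.

    Assumes returns are nonzero; np.sign maps exact zeros to 0, which
    this sketch does not treat specially.
    """
    a = np.sign(actual)
    p = np.sign(predicted)
    n = len(a)
    P = np.mean(a == p)                    # realized hit rate
    Py = np.mean(a > 0)                    # frequency of positive actuals
    Px = np.mean(p > 0)                    # frequency of positive predictions
    Pstar = Py * Px + (1 - Py) * (1 - Px)  # expected hit rate under independence
    var_P = Pstar * (1 - Pstar) / n
    var_Pstar = ((2 * Py - 1) ** 2 * Px * (1 - Px) / n
                 + (2 * Px - 1) ** 2 * Py * (1 - Py) / n
                 + 4 * Py * Px * (1 - Py) * (1 - Px) / n ** 2)
    pt = (P - Pstar) / np.sqrt(var_P - var_Pstar)
    return pt, norm.sf(pt)
```

A significant statistic indicates that the model's one-step-ahead sign predictions carry information beyond what the marginal up/down frequencies alone would give.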