Finding the best decision boundary for a classification problem involves covariance structures, distance measures, and eigenvectors. This article considers how eigenstructures are an inherent part of the support vector machine (SVM) functional basis that encodes the geometric features of a separating hyperplane. SVM learning capacity involves an eigenvector set that spans the parameter space being learned. The linear SVM has been shown to have insufficient learning capacity when the number of training examples exceeds the dimension of the feature space: in this case, only an incomplete eigenvector set spans the observation space. SVM architectures based on incomplete eigenstructures therefore lack the learning capacity needed to find good separating hyperplanes. However, proper regularization ensures that two essential types of 'biases' are encoded within SVM functional mappings: an appropriate set of algebraic (and thus geometric) relationships and a sufficient eigenstructure set.
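The capacity limit described above can be checked numerically. The sketch below is not from the article; it is a minimal NumPy illustration on synthetic data, showing that when the number of training examples N exceeds the feature dimension d, the linear-kernel Gram matrix has at most d nonzero eigenvalues, so the eigenvectors carrying nonzero eigenenergy cannot span the N-dimensional observation space.

```python
# Minimal sketch (not the author's code): rank deficiency of the linear-kernel
# Gram matrix when the number of training examples N exceeds the feature
# dimension d, i.e. the "incomplete eigenvector set" case.
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 5                       # N > d: more training examples than feature dimensions
X = rng.standard_normal((N, d))    # synthetic training data; any real-valued features would do

K = X @ X.T                        # linear-kernel Gram matrix, shape (N, N)
eigvals = np.linalg.eigvalsh(K)    # eigenvalues of the symmetric Gram matrix

nonzero = int(np.sum(eigvals > 1e-10 * eigvals.max()))
print(f"Gram matrix is {K.shape[0]}x{K.shape[1]}, numerical rank = {nonzero}")
# Only d (= 5) eigenvalues are nonzero, so the corresponding eigenvectors
# span a 5-dimensional subspace of the 50-dimensional observation space.
```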
This article will devise data-driven mathematical laws that generate optimal statistical classification systems which achieve minimum error rates for data distributions with unchanging statistics. Thereby, I will design learning machines that minimize the expected risk, or probability of misclassification. I will devise a system of fundamental equations of binary classification for a classification system in statistical equilibrium. I will use this system of equations to formulate the problem of learning unknown linear and quadratic discriminant functions from data as a locus problem, thereby formulating geometric locus methods within a statistical framework. Solving locus problems involves finding equations of curves or surfaces defined by given properties and finding graphs or loci of given equations. I will devise three systems of data-driven locus equations that generate optimal statistical classification systems. Each class of learning machines satisfies fundamental statistical laws for a classification system in statistical equilibrium. Thereby, I will formulate three classes of learning machines that are scalable modules for optimal statistical pattern recognition systems, all of which are capable of performing a wide variety of statistical pattern recognition tasks, where any given M-class statistical pattern recognition system exhibits optimal generalization performance for an M-class feature space.
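For concreteness, the sketch below shows one standard way to learn linear and quadratic discriminant functions from data: plug sample estimates of class means and covariances into the Gaussian log-likelihood ratio. This is illustrative background under assumed Gaussian class-conditional densities and equal priors, not the system of locus equations developed in this article; the helper names fit_lda and fit_qda are hypothetical.

```python
# Minimal sketch: plug-in linear (LDA) and quadratic (QDA) discriminant
# functions estimated from two samples X0 and X1 (rows are feature vectors).
import numpy as np

def fit_lda(X0, X1):
    """Linear discriminant: assumes both classes share a common covariance matrix."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled covariance estimate
    S = (np.cov(X0, rowvar=False) * (len(X0) - 1) +
         np.cov(X1, rowvar=False) * (len(X1) - 1)) / (len(X0) + len(X1) - 2)
    w = np.linalg.solve(S, mu1 - mu0)        # normal vector of the linear boundary
    b = -0.5 * (mu0 + mu1) @ w               # threshold for equal priors
    return lambda x: x @ w + b               # > 0 -> decide class 1

def fit_qda(X0, X1):
    """Quadratic discriminant: each class keeps its own covariance matrix."""
    params = []
    for Xc in (X0, X1):
        mu, S = Xc.mean(axis=0), np.cov(Xc, rowvar=False)
        params.append((mu, np.linalg.inv(S), np.linalg.slogdet(S)[1]))
    (m0, P0, ld0), (m1, P1, ld1) = params
    def g(x):
        q0 = -0.5 * ((x - m0) @ P0 * (x - m0)).sum(axis=-1) - 0.5 * ld0
        q1 = -0.5 * ((x - m1) @ P1 * (x - m1)).sum(axis=-1) - 0.5 * ld1
        return q1 - q0                        # > 0 -> decide class 1
    return g
```

With equal priors, the sign of each discriminant gives the predicted class: the linear discriminant's decision boundary is a hyperplane with normal vector S^{-1}(mu1 - mu0), while the quadratic discriminant's boundary is a quadric surface.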
Capacity control, the bias/variance dilemma, and learning unknown functions from data are all concerned with identifying effective and consistent fits of unknown geometric loci to random data points. A geometric locus is a curve or surface formed by points, all of which possess some uniform property. A geometric locus of an algebraic equation is the set of points whose coordinates are solutions of the equation. Any given curve or surface must pass through each point on a specified locus. This paper argues that it is impossible to fit random data points to algebraic equations of partially configured geometric loci that reference arbitrary Cartesian coordinate systems. It also argues that the fundamental curve of a linear decision boundary is actually a principal eigenaxis. It is shown that learning principal eigenaxes of linear decision boundaries involves finding a point of statistical equilibrium at which the eigenenergies of principal eigenaxis components are symmetrically balanced with each other. It is demonstrated that learning linear decision boundaries involves strong duality relationships between a statistical eigenlocus of principal eigenaxis components and its algebraic forms, in correlated primal and dual Hilbert spaces. Locus equations are introduced and developed that describe principal eigen-coordinate systems for lines, planes, and hyperplanes. These equations are used to introduce and develop primal and dual statistical eigenlocus equations of principal eigenaxes of linear decision boundaries. Important generalizations for linear decision boundaries are shown to be encoded within a dual statistical eigenlocus of principal eigenaxis components. Principal eigenaxes of linear decision boundaries are shown to encode Bayes' likelihood ratio for common covariance data and a robust likelihood ratio for all other data.
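The final claim, that a linear decision boundary can encode Bayes' likelihood ratio when both classes share a covariance matrix, can be illustrated numerically. The sketch below is not the article's eigenlocus construction; it assumes scikit-learn's LinearSVC is available, fits a linear SVM to two common-covariance Gaussian classes, and compares the fitted normal vector with the Bayes direction Sigma^{-1}(mu1 - mu0).

```python
# Illustration (assumptions: Gaussian classes, shared covariance, scikit-learn installed):
# the Bayes-optimal boundary here is linear with normal vector Sigma^{-1}(mu1 - mu0),
# and a fitted linear SVM tends to recover nearly the same direction, up to scale.
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)
mu0, mu1 = np.array([0.0, 0.0]), np.array([2.0, 1.0])
Sigma = np.array([[1.0, 0.4],
                  [0.4, 1.0]])                           # common covariance matrix
X0 = rng.multivariate_normal(mu0, Sigma, size=2000)
X1 = rng.multivariate_normal(mu1, Sigma, size=2000)
X = np.vstack([X0, X1])
y = np.r_[np.zeros(len(X0)), np.ones(len(X1))]

w_bayes = np.linalg.solve(Sigma, mu1 - mu0)              # Bayes direction for common covariance
w_svm = LinearSVC(C=1.0, max_iter=20000).fit(X, y).coef_.ravel()

cosine = w_bayes @ w_svm / (np.linalg.norm(w_bayes) * np.linalg.norm(w_svm))
print(f"cosine similarity between SVM normal and Bayes direction: {cosine:.3f}")
# Typically close to 1.0: the learned linear boundary is nearly parallel to the Bayes boundary.
```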
In the recently published article cited above, an error was found in the caption for Figure 6. The correct caption is published below.
Finding discriminant functions of minimum risk binary classification systems is a novel geometric locus problem that requires solving a system of fundamental locus equations of binary classification, subject to deep-seated statistical laws. We show that a discriminant function of a minimum risk binary classification system is the solution of a locus equation that represents the geometric locus of the decision boundary of the system, wherein the discriminant function is connected to the decision boundary by an intrinsic eigen-coordinate system, such that the discriminant function is represented by the geometric locus of a novel principal eigenaxis formed by a dual locus of likelihood components and principal eigenaxis components. We demonstrate that a minimum risk binary classification system acts to jointly minimize its eigenenergy and risk by locating a point of equilibrium at which the critical minimum eigenenergies exhibited by the system are symmetrically concentrated, so that the geometric locus of the novel principal eigenaxis of the system exhibits symmetrical dimensions and densities, and the counteracting and opposing forces and influences of the system are symmetrically balanced with each other about the geometric center of the locus of the novel principal eigenaxis, whereon the statistical fulcrum of the system is located. Thereby, a minimum risk binary classification system satisfies a state of statistical equilibrium wherein the total allowed eigenenergy and the expected risk exhibited by the system are jointly minimized within the decision space of the system, so that the system exhibits the minimum probability of classification error.

Key words: fundamental laws of binary classification, direct problem of the binary classification of random vectors, inverse problem of the binary classification of random vectors, minimum risk classification systems, minimum probability of classification error, minimum expected risk, total allowed eigenenergy, critical minimum eigenenergies

* Regularization methods presented in this paper appeared in WIREs Computational Statistics, 3: 204-215, 2011.
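For background on the minimum-risk terminology used above: under standard decision-theoretic assumptions (known class-conditional densities and priors, zero-one loss), the minimum probability of classification error is attained by the likelihood ratio test. The statement below is textbook material, not this article's eigenlocus formulation.

```latex
% Bayes minimum-error decision rule for two classes omega_1, omega_2 with
% class-conditional densities p(x | omega_i) and prior probabilities P(omega_i):
\[
\Lambda(\mathbf{x}) \;=\; \frac{p(\mathbf{x}\mid\omega_1)}{p(\mathbf{x}\mid\omega_2)}
\;\underset{\omega_2}{\overset{\omega_1}{\gtrless}}\;
\frac{P(\omega_2)}{P(\omega_1)},
\]
% equivalently, decide omega_1 whenever the discriminant
% D(x) = ln p(x | omega_1) - ln p(x | omega_2) + ln P(omega_1) - ln P(omega_2)
% is positive; a system attaining the minimum probability of error implements
% this rule on its decision regions (up to sets of probability zero).
```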