Projection techniques are frequently used as the principal means of implementing feature extraction and dimensionality reduction in machine learning applications. A well-established and broad class of such techniques is projection pursuit (PP). Its core design parameter is the projection index, which drives the optimization that yields the transformation function and represents, explicitly or implicitly, the user's perception of the useful information contained in the datasets. This paper addresses the design of PP index functions for the linear feature extraction case. We achieve this using an evolutionary search framework capable of building new indices that fit the properties of the available datasets. The high expressive power of this framework is sustained by a rich set of function primitives. For the task of classification, the performance of several PP indices previously proposed by human experts is compared with that of the automatically generated indices, and the results show a decrease in classification errors.
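To make the role of the projection index concrete, the following is a minimal sketch of linear projection pursuit, not the evolutionary framework described above. It assumes a hand-written, Fisher-style class-separability index (`fisher_index`) and a simple random hill-climbing optimizer (`pursue_projection`); both names, and the choice of index and optimizer, are illustrative assumptions rather than anything specified in the abstract.

```python
import numpy as np

def fisher_index(z, y):
    """Hypothetical projection index: between-class over within-class
    scatter of the 1-D projected values z with class labels y."""
    overall_mean = z.mean()
    between, within = 0.0, 0.0
    for c in np.unique(y):
        zc = z[y == c]
        between += zc.size * (zc.mean() - overall_mean) ** 2
        within += ((zc - zc.mean()) ** 2).sum()
    return between / (within + 1e-12)

def pursue_projection(X, y, index=fisher_index, n_iter=2000, step=0.1, seed=0):
    """Search for a unit vector w that maximizes index(X @ w, y)
    by random hill climbing (an assumed, deliberately simple optimizer)."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    best = index(X @ w, y)
    for _ in range(n_iter):
        cand = w + step * rng.normal(size=w.shape)
        cand /= np.linalg.norm(cand)  # stay on the unit sphere
        score = index(X @ cand, y)
        if score > best:              # keep only improving moves
            w, best = cand, score
    return w, best
```

Swapping `fisher_index` for another function of `(z, y)` changes what the search treats as an "interesting" projection, which is the design decision the abstract proposes to automate.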
Instance-based learning algorithms operate by storing a large set of prototypes in the system's database. However, such systems often suffer from high storage requirements, sensitivity to noise, and computational complexity, which result in long search and response times. In this brief, we introduce a novel framework that employs spectral graph theory to efficiently partition the dataset into border and internal instances. This is achieved by using a diverse set of border-discriminating features that capture the local friend and enemy profiles of the samples. The fused information from these features is then used, via a graph-cut modeling approach, to generate the final dataset partitions of border and nonborder samples. The proposed method is referred to as the spectral instance reduction (SIR) algorithm. Experiments with a large number of datasets show that SIR performs competitively with many other reduction algorithms in terms of both classification accuracy and data condensation.
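As an illustration of what a local friend and enemy profile might look like, the sketch below computes one simple border-discriminating feature: the fraction of a sample's k nearest neighbours that carry a different class label. This is an assumed example feature only; the function name `enemy_fraction` is hypothetical, and the feature fusion and graph-cut steps of SIR are not reproduced here.

```python
import numpy as np

def enemy_fraction(X, y, k=5):
    """For each sample, the fraction of its k nearest neighbours
    (Euclidean distance) that belong to a different class ("enemies")."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)        # a sample is not its own neighbour
    nn = np.argsort(d, axis=1)[:, :k]  # indices of the k nearest neighbours
    return (y[nn] != y[:, None]).mean(axis=1)

# Eight points on a line, two classes meeting between x = 3 and x = 4.
X = np.arange(8, dtype=float).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])
scores = enemy_fraction(X, y, k=2)
```

In this toy case only the two samples adjacent to the class boundary receive a nonzero score, so a high value marks a candidate border instance and a zero value marks an internal one.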
This paper proposes a novel way of generating reliable low-dimensional features with improved class separability in a kernel-induced feature space. The feature projections rely on a very efficient sequential projection pursuit method, adapted to support nonlinear projections using a new kernel matrix update scheme. This enables the gradual removal of structure from the space of residual dimensions, allowing the recovery of multiple projections. An adaptive kernel function is employed to unfold different types of data characteristics. We follow a holistic model selection procedure that, together with the optimal projections, dimensionality, and kernel parameters, also symbolically optimizes the projection index that controls the actual measurement of data interestingness, without user interaction. We tackle the underlying complex bi-level optimization model with a mixture of evolutionary and gradient search. The effectiveness of the proposed algorithm over existing approaches is demonstrated with benchmark evaluations and comparisons.
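The idea of removing already-found structure so that further projections can be recovered can be sketched with a generic kernel deflation. This is not the paper's kernel matrix update scheme, which the abstract does not specify; it is a standard construction, assumed here for illustration. If a direction in feature space has coefficients `alpha` normalized so that `alpha @ K @ alpha == 1`, the projections are `z = K @ alpha`, and subtracting `np.outer(z, z)` from `K` gives the kernel of the data with that direction projected out. The function names `rbf_kernel` and `deflate` and the fixed `gamma` are assumptions.

```python
import numpy as np

def rbf_kernel(X, gamma=1.0):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))  # clip tiny negative values

def deflate(K, alpha):
    """Remove the feature-space direction with coefficients alpha from K.

    Returns the deflated kernel matrix and the projections z onto
    the (normalized) direction.
    """
    alpha = alpha / np.sqrt(alpha @ K @ alpha)  # unit norm in feature space
    z = K @ alpha                               # projections of all samples
    return K - np.outer(z, z), z
```

After deflation, the same direction yields zero projections (`K_new @ alpha` is numerically zero), so a subsequent search over the residual kernel is pushed toward a different projection.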