Microarray datasets play a crucial role in cancer detection, but their high dimensionality makes classification challenging because many features are irrelevant or redundant. Feature selection is therefore indispensable in this field, as it removes unneeded features. Because selecting the optimal number of features is an NP-hard problem, meta-heuristic search techniques are used to cope with it. In this paper, we propose a two-stage model for feature selection in microarray datasets. The gene rankings produced by different filter methods are quite diverse, and the effectiveness of a ranking is dataset dependent. First, we develop an ensemble of filter methods by considering the union and intersection of the top-n features of ReliefF, chi-square, and symmetrical uncertainty. This ensemble allows us to combine the information of the three rankings in a single subset. In the next stage, we apply a genetic algorithm (GA) to the union and to the intersection to fine-tune the results; the union performs better than the intersection. Our model is shown to be classifier independent through the use of three classifiers: multi-layer perceptron (MLP), support vector machine (SVM), and K-nearest neighbor (K-NN). We have tested our model on five cancer datasets: colon, lung, leukemia, SRBCT, and prostate. Experimental results illustrate the superiority of our model in comparison to state-of-the-art methods.
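The union and intersection step of the filter ensemble described above can be sketched as follows. This is only an illustration under stated assumptions: the function names and the score values are made up, and in practice the three score lists would come from ReliefF, chi-square, and symmetrical uncertainty computed on the real data.

```python
import numpy as np

def top_n_indices(scores, n):
    """Return the indices of the n highest-scoring features as a set."""
    order = np.argsort(np.asarray(scores))[::-1]  # descending by score
    return set(order[:n].tolist())

def ensemble_top_n(score_lists, n):
    """Union and intersection of the top-n features across several rankings."""
    tops = [top_n_indices(s, n) for s in score_lists]
    union = set.union(*tops)
    intersection = set.intersection(*tops)
    return sorted(union), sorted(intersection)

# Hypothetical filter scores for six features (not from the paper).
relieff_scores = [0.9, 0.1, 0.8, 0.3, 0.2, 0.7]
chi2_scores = [0.2, 0.9, 0.8, 0.1, 0.3, 0.7]
su_scores = [0.8, 0.2, 0.9, 0.1, 0.7, 0.3]

union, intersection = ensemble_top_n(
    [relieff_scores, chi2_scores, su_scores], n=3
)
print(union)         # features ranked in the top 3 by at least one filter
print(intersection)  # features ranked in the top 3 by every filter
```

Either subset would then be handed to the second-stage search (the GA in the abstract), which is not shown here.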
Feature selection (FS), an important pre-processing step in machine learning and data mining, has an immense impact on the outcome of the corresponding learning models. It aims to remove irrelevant and redundant features from a feature vector, thereby enhancing the performance of the overall prediction or classification model. Over the years, meta-heuristic optimization techniques have been applied to FS, as they are able to overcome the limitations of traditional optimization approaches. In this work, we introduce a binary variant of the recently proposed Sailfish Optimizer (SFO), named the Binary Sailfish (BSF) optimizer, to solve FS problems. A sigmoid transfer function is used to map the continuous search space of SFO to a binary one. To improve the exploitation ability of the BSF optimizer, we combine it with another recently proposed meta-heuristic algorithm, adaptive β-hill climbing (AβHC); the resulting hybrid is referred to as AβBSF. The proposed BSF and AβBSF algorithms are applied to 18 standard UCI datasets and compared with 10 state-of-the-art meta-heuristic FS methods. The results demonstrate the superiority of both BSF and AβBSF algorithms in solving FS problems. The source code of this work is available at https://github.com/Rangerix/MetaheuristicOptimization. INDEX TERMS Binary sailfish optimizer, feature selection, adaptive β-hill climbing, hybrid optimization, UCI dataset.
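A minimal sketch of how a sigmoid transfer function can map a continuous position to a binary feature mask is given below. It shows the commonly used stochastic thresholding rule, which is an assumption here; the abstract does not state the exact rule used in BSF, and the function names and position values are hypothetical.

```python
import numpy as np

def sigmoid_transfer(position):
    """Map continuous values to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-np.asarray(position, dtype=float)))

def binarize(position, rng):
    """Set bit d to 1 with probability sigmoid(position[d]) (assumed rule)."""
    prob = sigmoid_transfer(position)
    return (rng.random(prob.shape) < prob).astype(int)

rng = np.random.default_rng(0)
# Hypothetical continuous position of one search agent over five features.
position = np.array([-50.0, -1.0, 0.0, 1.0, 50.0])
mask = binarize(position, rng)
print(mask)  # 1 = feature selected, 0 = feature dropped
```

Strongly negative coordinates are almost always mapped to 0 and strongly positive ones to 1, while coordinates near zero are selected about half the time.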
An exact analytical form of the Sagdeev pseudopotential has been derived for a two-electron-temperature warm-ion plasma, from which ion-acoustic rarefactive solitary wave solutions can be investigated for a wide range of plasma parameters, viz., ion temperature (σ), cold to hot electron temperature ratio (β), and initial cold electron concentration (μ). Explicitly large Mach numbers have been obtained for increasing hot to cold electron temperature ratios, and an analytical condition for the upper bound of the Mach number has been derived for such a rarefactive solitary wave. It is found that the width of these waves obeys Korteweg–de Vries soliton-type behavior only for small amplitudes (i.e., eφ/Teff < 1), while for large amplitudes the width of the rarefactive solitary waves increases with increasing amplitude.
Feature selection (FS) is mainly used as a pre-processing tool to reduce dimensionality by eliminating irrelevant or redundant features before a machine learning or data mining algorithm is applied. In this paper, we introduce binary variants of a recently proposed meta-heuristic algorithm called Social Ski Driver (SSD) optimization. To the best of our knowledge, SSD has not yet been used in the domain of FS. Two binary variants of SSD are proposed, using S-shaped and V-shaped transfer functions. In addition, the exploitation ability of SSD is improved by using a local search method called Late Acceptance Hill Climbing (LAHC). The hybrid meta-heuristic is then converted to a binary version by using the same transfer functions. The proposed methods are applied to 18 standard UCI datasets and compared with 15 state-of-the-art FS methods. To check the robustness of the proposed method, we have also applied it to 3 high-dimensional microarray datasets and compared it with 6 state-of-the-art methods. The results confirm the superiority of the proposed methods over the other meta-heuristic wrapper-based FS methods considered here. Source code of this work is available at https://github.com/consigliere19/SSD-LAHC. INDEX TERMS Social ski driver optimization, feature selection, late acceptance hill climbing, UCI dataset, meta-heuristic optimization, microarray data.
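The local search mentioned above, Late Acceptance Hill Climbing, can be sketched on a binary feature mask as follows. This is one common formulation of LAHC, not necessarily the variant used in the paper; the single-bit-flip neighbourhood, the toy cost function, and all parameter values are assumptions made for illustration.

```python
import random

def lahc(cost, start, history_len=5, iters=200, seed=0):
    """Late Acceptance Hill Climbing over a binary mask (minimises cost)."""
    rng = random.Random(seed)
    current = list(start)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    history = [current_cost] * history_len  # costs seen history_len steps ago
    for i in range(iters):
        # Neighbour: flip one randomly chosen bit (assumed move operator).
        candidate = list(current)
        j = rng.randrange(len(candidate))
        candidate[j] = 1 - candidate[j]
        cand_cost = cost(candidate)
        slot = i % history_len
        # Accept if no worse than the current cost or the "late" cost.
        if cand_cost <= current_cost or cand_cost <= history[slot]:
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = list(current), current_cost
        history[slot] = current_cost
    return best, best_cost

# Toy cost: Hamming distance to a hypothetical "ideal" feature mask.
target = [1, 0, 1, 1, 0, 0, 1, 0]

def toy_cost(mask):
    return sum(m != t for m, t in zip(mask, target))

best, best_cost = lahc(toy_cost, [0] * len(target))
print(best, best_cost)
```

In a wrapper FS setting the toy cost would be replaced by a classifier-based fitness, such as the error rate on a validation split combined with the fraction of selected features.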