Feature selection (FS) is an important preprocessing step that is commonly used in many fields to improve the performance of learning algorithms. In medical data mining, a huge number of features are used to diagnose disease, but many of these features are irrelevant, weakly correlated, or redundant, which adversely affects diagnostic predictive accuracy. Work on FS has grown extensively across many fields due to increased demand for methods that reduce the dimensionality of data by choosing the best subset of features according to specific criteria, in order to maximize prediction accuracy and minimize the number of irrelevant features. In recent times, metaheuristics have been preferred over conventional optimization methods for solving FS problems because they can obtain a near-optimal solution in a finite time. Metaheuristics are general-purpose algorithms that can be applied to almost any optimization problem because they generate “appropriate” solutions in a reasonable amount of time, which is especially useful for complicated problems. Many popular implementations have shown the utility of metaheuristics by contrasting their performance on well-known problems with that of other algorithms or applications. There are many metaheuristic algorithms in the literature, such as those based on swarm intelligence, including particle swarm optimization and ant colony optimization. The major objective of this research is to improve accuracy on FS problems by conducting different experiments using a metaheuristic algorithm, namely the heap-based optimizer (HBO). The HBO is used with a k-nearest neighbor classifier in a wrapper to improve the FS process.
The performance of the proposed method is evaluated and compared against seven approaches from the literature on nine high-dimensional data sets that contain a low number of samples and multiple classes. The findings reveal that the HBO decreases the number of features for classification tasks and achieves high accuracy on two data sets compared with the other approaches. The BHBO also achieved the best convergence speed among the competing methods. It is therefore concluded that the proposed HBO method can be used to optimize the FS process, whether in terms of classification accuracy or selection size.
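
To make the wrapper idea concrete, the sketch below shows how a candidate feature subset might be scored with a k-nearest neighbor classifier. It is only an illustration under stated assumptions: the abstract does not give the fitness function, so the weighted sum of error rate and selected-feature ratio, the weight `alpha`, the value `k=3`, the helper names, and the toy data are all hypothetical. The HBO search itself is not shown; any optimizer that proposes binary masks could call `wrapper_fitness`.

```python
import math
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest training points."""
    neighbors = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x)
    )[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def apply_mask(X, mask):
    """Keep only the columns whose mask entry is 1."""
    return [[v for v, keep in zip(row, mask) if keep] for row in X]


def wrapper_fitness(mask, train_X, train_y, test_X, test_y, k=3, alpha=0.99):
    """Score a binary feature mask (lower is better).

    Assumed objective: alpha * error_rate + (1 - alpha) * selected_ratio.
    This weighting is an illustrative choice, not taken from the abstract.
    """
    n_selected = sum(mask)
    if n_selected == 0:
        return 1.0  # an empty subset cannot classify anything; treat as worst case
    tr = apply_mask(train_X, mask)
    te = apply_mask(test_X, mask)
    errors = sum(
        knn_predict(tr, train_y, x, k) != y for x, y in zip(te, test_y)
    )
    error_rate = errors / len(test_y)
    return alpha * error_rate + (1 - alpha) * n_selected / len(mask)


# Hypothetical toy data: feature 0 separates the classes, feature 1 is noise.
train_X = [[0.0, 5.0], [0.1, 1.0], [0.2, 9.0], [1.0, 5.1], [1.1, 0.9], [0.9, 9.1]]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [[0.05, 9.0], [1.05, 1.0]]
test_y = [0, 1]

if __name__ == "__main__":
    for mask in ([1, 0], [1, 1], [0, 1]):
        score = wrapper_fitness(mask, train_X, train_y, test_X, test_y)
        print(mask, round(score, 4))
```

On this toy data the mask that keeps only the informative feature scores best, because the second term penalizes larger subsets when the error rates are equal. This mirrors the two goals named in the abstract, classification accuracy and selection size.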