Binary dwarf mongoose optimizer for solving high-dimensional feature selection problems

Akinola, Olatunji; Agushaka, Jeffrey O.; Ezugwu, Absalom E.

doi:10.1371/journal.pone.0274850

Cited by 21 publications

(12 citation statements)

References 77 publications


“…(b) Error rate: Defined in detail in Section 4.2.1 and mathematically in equation 26. This objective function has been pursued in [42][43][44][45][46][47][48][49][50][51].…”

Section: Bibliometric Analysis (mentioning)

confidence: 99%

Feature Selection Problem and Metaheuristics: A Systematic Literature Review about Its Formulation, Evaluation and Applications

Barrera-García, Cisternas-Caneo, Crawford et al. 2023, Biomimetics


Feature selection is becoming a relevant problem within the field of machine learning. The feature selection problem focuses on the selection of the small, necessary, and sufficient subset of features that represent the general set of features, eliminating redundant and irrelevant information. Given the importance of the topic, in recent years there has been a boom in the study of the problem, generating a large number of related investigations. Given this, this work analyzes 161 articles published between 2019 and 2023 (20 April 2023), emphasizing the formulation of the problem and performance measures, and proposing classifications for the objective functions and evaluation metrics. Furthermore, an in-depth description and analysis of metaheuristics, benchmark datasets, and practical real-world applications are presented. Finally, in light of recent advances, this review paper provides future research opportunities.


“…Approximate algorithms such as metaheuristic algorithms have been used to find an optimal subset out of near-optimal subsets heuristically [18,19]. Just like in other areas of application of metaheuristic algorithms, such as engineering problems [20,21] and scheduling problems [22,23], significant successes have been recorded in the area of FS [24,25]. Emary et al. [26] used the wrapper-based method to propose two versions of binary grey wolf optimizer (bGWO) that use the stochastic crossover among the three best solutions and the S-shaped transfer function.…”

Section: Introduction (mentioning)

confidence: 99%

Evolutionary binary feature selection using adaptive ebola optimization search algorithm for high-dimensional datasets

2023

Self Cite


The feature selection problem represents a field of study that requires approximate algorithms to identify discriminative and optimally combined features. The evaluation and suitability of these selected features are often analyzed using classifiers. These features are locked with data increasingly being generated from different sources such as social media, surveillance systems, network applications, and medical records. The high dimensionality of these datasets often impairs the quality of the optimal combination of the selected features. The use of binary optimization methods has been proposed in the literature to address this challenge. However, the underlying deficiency of a single binary optimizer is transferred to the quality of the features selected. Though hybrid methods have been proposed, most still suffer from the inherited design limitations of the single methods they combine. To address this, we proposed a novel hybrid binary optimization method capable of effectively selecting features from increasingly high-dimensional datasets. The approach used in this study designed a sub-population selective mechanism that dynamically assigns individuals to a 2-level optimization process. The level-1 method first mutates items in the population and then reassigns them to a level-2 optimizer. The selective mechanism determines which sub-population is assigned to the level-2 optimizer based on the exploration and exploitation phase of the level-1 optimizer. In addition, we designed nested transfer (NT) functions and investigated the influence of the function on the level-1 optimizer. The binary Ebola optimization search algorithm (BEOSA) is applied for the level-1 mutation, while the simulated annealing (SA) and firefly (FFA) algorithms are investigated for the level-2 optimizer. The outcomes are the HBEOSA-SA and HBEOSA-FFA, which are then investigated on the NT, and their corresponding variants HBEOSA-SA-NT and HBEOSA-FFA-NT with no NT applied.
The hybrid methods were experimentally tested over high-dimensional datasets to address the challenge of feature selection. A comparative analysis was done on the methods to obtain performance variability with the low-dimensional datasets. Results obtained for classification accuracy for large, medium, and small-scale datasets are 0.995 using HBEOSA-FFA, 0.967 using HBEOSA-FFA-NT, and 0.953 using HBEOSA-FFA, respectively. Fitness and cost values relative to large, medium, and small-scale datasets are 0.066 and 0.934 using HBEOSA-FFA, 0.068 and 0.932 using HBEOSA-FFA, with 0.222 and 0.970 using HBEOSA-SA-NT, respectively. Findings from the study indicate that the HBEOSA-SA, HBEOSA-FFA, HBEOSA-SA-NT and HBEOSA-FFA-NT outperformed the BEOSA.


“…Feature/Gene selection in micro-array gene expression datasets has gained great attention during the recent decades [1–7]. Since high dimensional datasets usually contain noisy, redundant and non-informative features that enhance computational complexity as well as execution time of the underlying model.…”

Section: Introduction (mentioning)

confidence: 99%

Feature selection for high dimensional microarray gene expression data via weighted signal to noise ratio

et al. 2023


Feature selection in high dimensional gene expression datasets not only reduces the dimension of the data, but also the execution time and computational cost of the underlying classifier. The current study introduces a novel feature selection method called weighted signal to noise ratio (WSNR) by exploiting the weights of features based on support vectors and signal to noise ratio, with an objective to identify the most informative genes in high dimensional classification problems. The combination of two state-of-the-art procedures enables the extraction of the most informative genes. The corresponding weights of these procedures are then multiplied and arranged in decreasing order. A larger weight of a feature indicates its discriminatory power in classifying the tissue samples to their true classes. The current method is validated on eight gene expression datasets. Moreover, results of the proposed method (WSNR) are also compared with four well known feature selection methods. We found that WSNR outperforms the other competing methods on 6 out of 8 datasets. Box-plots and bar-plots of the results of the proposed method and all the other methods are also constructed. The proposed method is further assessed on simulated data. Simulation analysis reveals that WSNR outperforms all the other methods included in the study.
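The ranking step described in this abstract (multiply two per-feature weights, then sort in decreasing order) can be illustrated with a minimal sketch. It is not the authors' implementation. The signal-to-noise formula used here, |mean0 - mean1| / (sd0 + sd1) for two classes, is an assumption, as are the function names `snr_scores` and `rank_features`. The support-vector-based weights are passed in as a plain list (`svm_weights`, hypothetical) because the abstract does not say how they are computed.

```python
from statistics import mean, stdev

def snr_scores(X, y):
    """Per-feature signal-to-noise ratio for a two-class problem.

    Assumed form: |mean0 - mean1| / (sd0 + sd1); labels must be 0 or 1
    and each class needs at least two samples for stdev to be defined.
    """
    class0 = [row for row, label in zip(X, y) if label == 0]
    class1 = [row for row, label in zip(X, y) if label == 1]
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row in class0]
        b = [row[j] for row in class1]
        spread = stdev(a) + stdev(b)
        # A feature with zero spread in both classes gets score 0 by convention here.
        scores.append(abs(mean(a) - mean(b)) / spread if spread > 0 else 0.0)
    return scores

def rank_features(snr, svm_weights):
    """Multiply the two weights per feature and return indices, largest product first."""
    combined = [s * abs(w) for s, w in zip(snr, svm_weights)]
    return sorted(range(len(combined)), key=lambda j: combined[j], reverse=True)

# Toy data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 5.1], [3.0, 5.0], [3.2, 5.1]]
y = [0, 0, 1, 1]
snr = snr_scores(X, y)
ranking = rank_features(snr, svm_weights=[0.5, 2.0])  # hypothetical SVM-derived weights
print(ranking)
```

In this toy case feature 1 has identical class means, so its product is zero regardless of its larger hypothetical SVM weight, and feature 0 is ranked first.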

