Towards an Unsupervised Feature Selection Method for Effective Dynamic Features

Almusallam, Naif; Tari, Zahir; Chan, Jeffrey; Fahad, Adil; Alabdulatif, Abdulatif; Al-Naeem, Mohammed

doi:10.1109/access.2021.3082755

Cited by 17 publications

(3 citation statements)

References 21 publications


“…Embedded methods [11–13], such as LASSO and Decision Trees, incorporate feature selection within the algorithm itself for different scenarios. These methods usually provide a good trade-off between performance and speed but are limited by the biases of the algorithms they are integrated with for different use cases [14–16].…”

Section: In-depth Review Of Existing Machine Learning Models Used For... (mentioning)

confidence: 99%

Feature importance feedback with Deep Q process in ensemble-based metaheuristic feature selection algorithms

Potharlanka,

2024

Sci Rep


Feature selection is an indispensable aspect of modern machine learning, especially for high-dimensional datasets where overfitting and computational inefficiencies are common concerns. Traditional methods often employ filter, wrapper, or embedded approaches, which have limitations in terms of robustness, computational load, or capability to capture complex interactions among features. Despite the utility of metaheuristic algorithms like Particle Swarm Optimization (PSO), Firefly Algorithm (FA), and Whale Optimization (WOA) in feature selection, there is still a gap in efficiently incorporating feature importance feedback into these processes. This paper presents a novel approach that integrates the strengths of the PSO, FA, and WOA algorithms into an ensemble model and further enhances its performance by incorporating a Deep Q-Learning framework for relevance feedback. The Deep Q-Learning module updates feature importance based on model performance, thereby fine-tuning the selection process iteratively. Our ensemble model demonstrates substantial gains in effectiveness over traditional and individual metaheuristic approaches. Specifically, the proposed model achieved a 9.5% higher precision, an 8.5% higher accuracy, an 8.3% higher recall, a 4.9% higher AUC, and a 5.9% higher specificity across multiple software bug prediction datasets and samples. By resolving some of the key issues in existing feature selection methods and achieving superior performance metrics, this work paves the way for more robust and efficient machine learning models in various applications, from healthcare to natural language processing. The resulting framework offers a flexible architecture that can be adapted to a variety of machine learning challenges.



“…In a text mining assignment for spam filtering, for example, additional features (e.g., words) are dynamically created and must therefore be exploited to filter out the spam instead of waiting for every characteristic to be collected. Traditional methodologies, which have not been developed for streaming information applications, cannot be employed in this situation since they demand that the whole extracted feature set be known beforehand to evaluate the effective attributes effectively and scientifically [2,3]. Parkinson's disease is a widespread neurological disorder.…”

Section: Literature Review (mentioning)

confidence: 99%

A Framework for Blended Sub Feature Engineering for Chronic Disease Prediction Using in-Memory Computing

Raghavendra, Mahesh, Rao

2022

RIA


Chronic diseases are among the most common major health concerns. Early detection of chronic illnesses can help to avoid or lessen their repercussions, potentially lowering death rates. Using machine learning algorithms to identify dangerous variables is an innovative technique. The problem with existing feature selection procedures is that each method yields a different set of features, which affects model validity, and current methods do not perform effectively on large multidimensional datasets. We present a novel model that uses a feature selection strategy to choose ideal features from large multidimensional datasets and deliver credible forecasts of chronic diseases while preserving the uniqueness of the data. To assure the success of the proposed model, we balanced the classes by applying hybrid balanced class sampling methods to the original dataset; characterization and data conversion are also required to provide valid data for training the model. Our model was run and assessed on datasets with binary and multi-valued classifications. We used a variety of datasets (Parkinson's disease, arrhythmia, breast cancer, kidney disease, and diabetes). To select suitable features, a hybrid feature model is used, which includes six ensemble models and involves voting on attributes. The accuracy on the original dataset before applying the framework is recorded and compared with the accuracy on the reduced set of characteristics. The findings are given individually to allow for comparisons. We conclude from the results that the proposed model performed better on multi-valued class datasets than on binary class datasets.
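The "voting on attributes" step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each selector simply returns a list of chosen feature names and that a feature is kept when enough selectors agree; the function name `vote_on_features`, the threshold `min_votes`, and the toy feature names are hypothetical and are not taken from the paper, which does not state its voting rule here.

```python
from collections import Counter


def vote_on_features(selections, min_votes):
    """Keep features chosen by at least `min_votes` selectors (assumed rule)."""
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Hypothetical outputs of six feature selectors on a toy dataset.
selections = [
    ["age", "bmi", "glucose"],
    ["age", "glucose", "insulin"],
    ["glucose", "bmi"],
    ["age", "glucose"],
    ["insulin", "glucose", "age"],
    ["bmi", "age"],
]

selected = vote_on_features(selections, min_votes=4)
print(selected)  # ['age', 'glucose']
```

Raising `min_votes` gives a smaller, more conservative subset; lowering it admits features that only a few selectors favour.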


“…Since the datasets have large amounts of information and all data are not required for processing. When the dataset dimension expands then the classification accuracy of the system diminishes also it takes extra time for processing [2]. To avoid this problem feature selection helps to select only the necessary information from the datasets.…”

Section: Introduction (mentioning)

confidence: 99%

A Novel Chaos Quasi-Oppositional based Flamingo Search Algorithm with Simulated Annealing for Feature Selection

Durgam, Devarakonda

2023

IJRITCC


Feature selection is currently one of the most vital tasks in machine learning. Reducing the feature set helps to increase the accuracy of the classifier. Because of the large amount of information present in a dataset, selecting the necessary features is a demanding process. To solve this problem, a novel Chaos Quasi-Oppositional based Flamingo Search Algorithm with Simulated Annealing (CQOFSA-SA) is proposed for feature selection; it selects the optimal feature set from the datasets and thus shrinks the dimension of the dataset. The FSA approach is used to choose the optimal feature subset from the dataset. In each iteration, the optimal solution of FSA is enriched by Simulated Annealing (SA). The Chaos Quasi-Oppositional based learning (CQOBL) included in the initialization of FSA improves the convergence rate and increases the searching capability of the FSA approach in choosing the optimal feature set. The experimental outcomes show that the proposed CQOFSA-SA outperforms other feature selection approaches in terms of accuracy, size of the reduced feature set, convergence speed, and fitness value.
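The role of Simulated Annealing in refining a candidate feature subset can be sketched as follows. This is plain SA over a binary feature mask, not the CQOFSA-SA algorithm: the flamingo search and the chaos quasi-oppositional initialization are not reproduced, and the fitness function, the set of "relevant" features, and all parameter values are hypothetical choices made only for illustration.

```python
import math
import random


def simulated_annealing(fitness, n_features, steps=500, t0=1.0, cooling=0.99, seed=0):
    """Refine a binary feature mask by simulated annealing (maximizes fitness)."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_features)]
    current_fit = fitness(current)
    best, best_fit = current[:], current_fit
    temp = t0
    for _ in range(steps):
        # Neighbour move: flip one randomly chosen feature in or out.
        candidate = current[:]
        i = rng.randrange(n_features)
        candidate[i] = not candidate[i]
        cand_fit = fitness(candidate)
        delta = cand_fit - current_fit
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_fit = candidate, cand_fit
            if current_fit > best_fit:
                best, best_fit = current[:], current_fit
        temp *= cooling
    return best, best_fit


# Hypothetical toy fitness: reward relevant features, penalize subset size.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    chosen = {i for i, keep in enumerate(mask) if keep}
    return len(chosen & RELEVANT) - 0.1 * len(chosen)


best_mask, best_score = simulated_annealing(toy_fitness, n_features=8)
print(best_mask, best_score)
```

In a real setting the toy fitness would be replaced by something like a cross-validated classifier score with a penalty on subset size; that substitution is also an assumption rather than the paper's stated objective.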

