Feature selection is essential in high-dimensional data analysis. Retaining all features in machine learning tasks is not only inefficient; irrelevant and redundant features may also harm classification accuracy. Feature selection is an optimization problem that aims to map a dataset's high-dimensional space to a lower-dimensional one by retaining only the relevant and suitable features. Although feature selection is itself time-consuming, it substantially reduces the time spent by the subsequent learning algorithm. Among feature selection methods, filter algorithms have drawn increasing attention in recent years due to their simplicity and speed. In this paper, we introduce a supervised filter feature selection method based on the filled function and the Fisher score (FFFS). Using this criterion, we seek a feature subset that yields the lowest classification error rate. To demonstrate the effectiveness of the proposed algorithm, extensive experiments were conducted on 20 high-dimensional real-world datasets. The results show that the proposed algorithm outperforms state-of-the-art algorithms in terms of classification error rate. Statistical analysis confirms that the proposed algorithm surpasses the reference algorithms by minimizing the redundancy of the selected features, so the selected subset avoids serious negative effects on the classification process in real-world datasets. In addition, by applying different noise rates to the datasets, this paper demonstrates the ability of the proposed algorithm to select the most relevant features for classification tasks. According to the experiments, FFFS is less affected by noisy attributes than the other algorithms, making it a reasonable solution for handling noise and avoiding serious negative impacts on the classification error rate in real-world datasets.
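The Fisher score underlying the FFFS criterion ranks each feature by the ratio of between-class variance to within-class variance; a minimal sketch of that standard computation is given below. This is only an illustration of the ranking criterion, not the paper's actual FFFS implementation (which additionally uses a filled-function search), and the function name and epsilon guard are our own choices.

```python
import numpy as np

def fisher_score(X, y):
    """Standard per-feature Fisher score: between-class scatter divided by
    within-class scatter. Higher scores indicate more discriminative features.
    Illustrative sketch only, not the paper's FFFS algorithm."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in np.unique(y):
        Xc = X[y == c]
        nc = Xc.shape[0]
        between += nc * (Xc.mean(axis=0) - overall_mean) ** 2
        within += nc * Xc.var(axis=0)
    # Guard against zero within-class variance (e.g. constant features).
    return between / np.maximum(within, 1e-12)

# Example: feature 0 separates the classes, feature 1 is constant noise.
X = np.array([[0.0, 1.0], [0.1, 1.0], [1.0, 1.0], [1.1, 1.0]])
y = np.array([0, 0, 1, 1])
scores = fisher_score(X, y)
ranking = np.argsort(scores)[::-1]  # best features first
```

A filter method would then keep the top-k features of `ranking` before training any classifier; the paper's contribution is using a filled function to search over such subsets rather than a fixed cutoff.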