Abstract. Feature subset selection is an essential pre-processing task in data mining. It refers to choosing a subset of attributes from the original attribute set, with the aim of identifying and removing as much irrelevant and redundant information as possible. In this paper, a new feature subset selection algorithm based on conditional mutual information is proposed to select an effective feature subset. The proposed algorithm is compared with other well-known feature selection algorithms on standard datasets from UC Irvine and WEKA (Waikato Environment for Knowledge Analysis). Its performance is evaluated with multiple criteria that take into account not only classification accuracy but also the number of selected features.
Feature (or attribute) selection concerns selecting, from the full feature set, the subset of features that gives the best classification accuracy. It serves as a preprocessing step to improve the classification task. The main objective of feature selection is to find useful features that represent the data and to remove those that are either irrelevant or redundant. Reducing the number of features in a dataset can lead to faster software quality model training and improved classifier performance. This paper presents a new method for feature subset selection based on conditional mutual information. The proposed method can select a feature subset with a minimum number of relevant features while achieving higher average classification accuracy across datasets. Experimental results with UC Irvine datasets and a Naïve Bayes classifier showed that the proposed algorithm is effective and efficient, selecting subsets with fewer features and higher classification accuracy than existing feature selection methods.
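The abstract does not spell out the algorithm, so the following is only a minimal sketch of one generic way to use conditional mutual information for feature selection, not the authors' method. It assumes discrete (already binned) features, estimates I(X; Y | Z) from raw counts, and greedily adds the feature that carries the most information about the class given the features selected so far. The names `conditional_mutual_information`, `select_features`, and `min_gain`, and the toy data, are hypothetical and chosen for illustration.

```python
from collections import Counter
from math import log2


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in bits from equal-length discrete sequences."""
    n = len(x)
    c_xyz = Counter(zip(x, y, z))
    c_xz = Counter(zip(x, z))
    c_yz = Counter(zip(y, z))
    c_z = Counter(z)
    total = 0.0
    for (xi, yi, zi), n_xyz in c_xyz.items():
        # p(x,y,z) * log2[ p(z) p(x,y,z) / (p(x,z) p(y,z)) ], written with counts
        ratio = (c_z[zi] * n_xyz) / (c_xz[(xi, zi)] * c_yz[(yi, zi)])
        total += (n_xyz / n) * log2(ratio)
    return total


def select_features(columns, labels, k, min_gain=1e-9):
    """Greedy forward selection (an assumed strategy, not taken from the paper).

    At each step, add the feature with the largest
    I(feature; class | features selected so far); stop when no
    remaining feature adds more than min_gain bits or k are chosen.
    """
    selected = []
    context = [()] * len(labels)  # joint value of the selected features, per row
    remaining = list(range(len(columns)))
    while remaining and len(selected) < k:
        gains = {j: conditional_mutual_information(columns[j], labels, context)
                 for j in remaining}
        best = max(remaining, key=lambda j: gains[j])
        if gains[best] <= min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        context = [ctx + (v,) for ctx, v in zip(context, columns[best])]
    return selected


# Toy data: f0 determines the class, f1 duplicates f0, f2 is unrelated to the class.
labels = [0, 0, 1, 1, 0, 0, 1, 1]
f0 = [0, 0, 1, 1, 0, 0, 1, 1]
f1 = list(f0)
f2 = [0, 1, 0, 1, 0, 1, 0, 1]
chosen = select_features([f0, f1, f2], labels, k=3)
print(chosen)  # [0]
```

In this toy case the redundant copy `f1` and the irrelevant `f2` are both rejected, because neither adds information about the class once `f0` is known. Conditioning on the joint value of all selected features makes the count-based estimate unreliable as the subset grows, so a practical implementation would need more data or an approximation.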