<p><b>The quality of the data space, which is often represented by a set of features, is one of the most critical factors affecting the classification performance of a machine learning algorithm. Noisy, irrelevant, and/or redundant features often reduce the classification accuracy of learning algorithms. To improve the quality of the data, feature selection chooses a small subset of the given features that captures the properties of the data. However, very few studies consider that a feature selection task usually has multiple best feature subsets. In other words, due to complex feature interactions, feature subsets containing different features can have very similar or even identical classification performance. Many existing feature selection methods output only one feature subset because they discard other candidate subsets with the same classification performance during the selection process, and therefore some good feature subsets are lost. Thanks to their potential global search ability, evolutionary computation (EC) methods, especially multimodal optimization techniques and differential evolution (DE), have been widely and successfully applied to different tasks, including feature selection. However, using these techniques to search for multiple best feature subsets in a feature selection task has not been systematically investigated.</b></p>
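<p>As a small illustration of why several best feature subsets can coexist, the following sketch uses a hypothetical toy dataset in which one feature is an exact copy of another. The data, the helper <code>loo_1nn_accuracy</code>, and the choice of a 1-nearest-neighbour classifier are assumptions made only for this example; they are not taken from the thesis.</p>

```python
# Toy data (hypothetical): feature 1 is an exact copy of feature 0,
# and feature 2 is an uninformative constant.
X = [
    [0.0, 0.0, 5.0],
    [0.1, 0.1, 5.0],
    [0.9, 0.9, 5.0],
    [1.0, 1.0, 5.0],
]
y = [0, 0, 1, 1]


def loo_1nn_accuracy(X, y, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier
    that only looks at the feature indices listed in `subset`."""
    correct = 0
    for i, xi in enumerate(X):
        best_j, best_d = None, float("inf")
        for j, xj in enumerate(X):
            if i == j:
                continue  # leave the query instance out
            d = sum((xi[k] - xj[k]) ** 2 for k in subset)
            if d < best_d:
                best_j, best_d = j, d
        correct += y[best_j] == y[i]
    return correct / len(X)


acc_a = loo_1nn_accuracy(X, y, [0])  # subset {feature 0}
acc_b = loo_1nn_accuracy(X, y, [1])  # subset {feature 1}
acc_c = loo_1nn_accuracy(X, y, [2])  # subset {feature 2}
```

<p>Here the subsets {0} and {1} contain different features yet yield the same accuracy, so a method that returns a single subset would necessarily discard one equally good alternative.</p>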
<p>The overall goal of this thesis is to investigate and improve the capability of evolutionary multimodal optimization techniques, mainly niching-based DE, to search for multiple best feature subsets in feature selection. The obtained feature subsets are expected to be small while maintaining or even improving the classification performance achieved with all features. The thesis considers several aspects of feature selection, including the number of objectives (single-objective and multi-objective), the encoding scheme (binary and real-valued), and the search operators.</p>
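<p>To make the encoding and search-operator aspects concrete, the sketch below shows one common way a real-valued DE individual can be decoded into a feature subset, together with the classic DE/rand/1/bin operator. The 0.5 threshold, the parameter values <code>f</code> and <code>cr</code>, and the function names are assumptions for illustration; the thesis may use different encodings and operators.</p>

```python
import random


def decode(vector, threshold=0.5):
    """Map a real-valued individual to the indices of selected features.
    The 0.5 threshold is an assumed convention, not taken from the thesis."""
    return [i for i, v in enumerate(vector) if v > threshold]


def de_rand_1_bin(population, target_idx, f=0.5, cr=0.9, rng=random):
    """Create one trial vector with the classic DE/rand/1/bin operator.
    Requires at least four individuals in the population."""
    target = population[target_idx]
    n = len(target)
    # Pick three distinct individuals, all different from the target.
    candidates = [i for i in range(len(population)) if i != target_idx]
    r1, r2, r3 = rng.sample(candidates, 3)
    a, b, c = population[r1], population[r2], population[r3]
    j_rand = rng.randrange(n)  # guarantees at least one mutated gene
    trial = []
    for j in range(n):
        if rng.random() < cr or j == j_rand:
            v = a[j] + f * (b[j] - c[j])   # differential mutation
            v = min(1.0, max(0.0, v))      # keep the gene inside [0, 1]
        else:
            v = target[j]                  # inherit from the target
        trial.append(v)
    return trial


rng = random.Random(0)
population = [[rng.random() for _ in range(6)] for _ in range(5)]
trial = de_rand_1_bin(population, 0, rng=rng)
selected = decode(trial)
```

<p>In a wrapper setting, the decoded subset <code>selected</code> would then be evaluated with a classifier, and the trial vector would replace the target only if it is at least as good.</p>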
<p>This thesis introduces a new multimodal DE method with duplication analysis for feature selection. The results show that the proposed method can find different feature subsets with very similar or identical classification performance, so users can choose among them according to their preferences. More importantly, the selected features achieve better classification performance than the full original feature set, the features selected by some conventional methods, and those selected by the compared niching-based feature selection methods.</p>
<p>This thesis develops a novel niching-based DE method for multi-objective feature selection. The method aims to reduce the redundancy rate among the obtained feature subsets by exploiting the fact that different feature subsets of the same size can achieve very similar or identical classification performance. The experimental results show that the proposed method evolves a rich and diverse set of non-dominated solutions for different feature selection tasks.</p>
<p>This thesis proposes a new binary DE method for multi-objective feature selection. To improve population diversity, the environmental selection process prefers solutions (feature subsets) in the candidate pool that have large diversity scores. The results show that the proposed method achieves significantly better feature selection performance than currently popular multi-objective feature selection methods.</p>
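<p>The idea of preferring diverse solutions during environmental selection can be sketched as follows. The diversity score used here, the mean Hamming distance from a binary solution to all other candidates, is an assumption chosen for illustration; the abstract does not define the score, and the actual method also has to account for the two objectives (for example through non-dominated sorting), which this sketch omits.</p>

```python
def hamming(a, b):
    """Number of positions at which two binary feature masks differ."""
    return sum(x != y for x, y in zip(a, b))


def diversity_scores(pool):
    """Assumed diversity score: mean Hamming distance to every other
    candidate in the pool (the pool must hold at least two solutions)."""
    scores = []
    for i, s in enumerate(pool):
        others = [hamming(s, t) for j, t in enumerate(pool) if j != i]
        scores.append(sum(others) / len(others))
    return scores


def select_most_diverse(pool, k):
    """Keep the k candidates with the largest diversity scores."""
    scores = diversity_scores(pool)
    order = sorted(range(len(pool)), key=lambda i: scores[i], reverse=True)
    return [pool[i] for i in order[:k]]


# Two identical masks and one mask that selects the complementary features.
pool = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 1, 1]]
scores = diversity_scores(pool)
survivor = select_most_diverse(pool, 1)
```

<p>In this toy pool the two identical masks receive the lowest scores, so the mask that differs from both is retained first.</p>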
<p>This thesis introduces a new feature clustering-assisted feature selection method. The proposed method incorporates the correlation knowledge obtained from filter measures into the encoding and search process to search for multiple best feature subsets. The results show that, on some of the high-dimensional datasets used, integrating the obtained feature subsets significantly improves classification performance.</p>