Abstract--A feature ranking scheme for MLP ensembles is proposed, along with a stopping criterion based upon the out-of-bootstrap (OOB) estimate. To solve multi-class problems, feature ranking is combined with modified Error-Correcting Output Coding (ECOC). Experimental results on benchmark data demonstrate the versatility of the MLP base classifier in removing irrelevant features.
Index Terms--Classification, Multilayer Perceptrons, Pattern Analysis, Pattern Recognition.
INTRODUCTION

Whether an individual classifier or an ensemble of classifiers is employed to solve a supervised learning problem, finding relevant features for discrimination is important. Most previous research on feature relevancy has focussed on individual classifiers, but in this paper the issue is addressed for an ensemble of Multi-Layer Perceptron (MLP) classifiers. The extension of feature relevancy to classifier ensembles is not straightforward, because of the inherent trade-off between accuracy and diversity [1]. The trade-off has long been recognised, and arises because diversity must decrease as base classifiers approach the highest levels of accuracy. There is no consensus on the best way to measure ensemble diversity, and the relationship between irrelevant features and diversity is not known.

Feature relevancy is particularly important for small sample size problems, that is, when the number of patterns is fewer than the number of features [2]. With tens of features in the original set, feature selection using an exhaustive search is computationally prohibitive. Since the problem is known to be NP-hard [3], a greedy search scheme is required, and filter, wrapper and embedded approaches have been developed [4]. The advantage of an embedded method is that feature selection is inherent in the classifier itself, and there is no reliance upon a measure that is independent of the classifier.

Feature ranking is conceptually one of the simplest search schemes for feature selection, and has the advantage of scaling up to hundreds of features. Uni-dimensional feature ranking methods consider each feature in isolation, but are disadvantaged by the implicit orthogonality assumption [4], whereas multi-dimensional methods consider correlations with remaining features. In this paper, we propose an ensemble of MLP classifiers that incorporates multi-dimensional feature ranking based on MLP weights.
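As a concrete illustration of weight-based ranking, the sketch below scores each feature by the summed absolute magnitude of its input-to-hidden weights in a trained MLP. This is a minimal sketch of one common weight-saliency heuristic; the function name, the toy weight matrix, and the specific saliency measure are illustrative assumptions, not necessarily the exact scheme proposed here.

```python
import numpy as np

def rank_features_by_mlp_weights(first_layer_weights):
    """Rank features by the summed absolute magnitude of their
    input-to-hidden MLP weights (rows = features, cols = hidden
    units). Larger sums are taken to indicate more relevant
    features. Illustrative heuristic only."""
    saliency = np.abs(first_layer_weights).sum(axis=1)
    # argsort on the negated saliency gives a descending ranking:
    # the most relevant feature index comes first
    return np.argsort(-saliency)

# Toy example: 4 features, 3 hidden units; feature 2 carries the
# largest weights and should therefore rank first.
W = np.array([[0.1, -0.2,  0.05],
              [0.3,  0.1, -0.1],
              [1.5, -1.2,  0.9],
              [0.0,  0.4, -0.3]])
ranking = rank_features_by_mlp_weights(W)
```

In an ensemble setting, such per-classifier rankings could be averaged across base MLPs before discarding the lowest-ranked features.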
The ensemble uses a simple parallel Multiple Classifier System (MCS) architecture with homogeneous MLP base classifiers. There has been no systematic comparison of feature ranking methods in the context of MCS. Most previous approaches to feature selection with ensembles have focused on determining feature subsets to combine, but differ in the way the subsets are chosen. The Random Subspace Method (RSM) is the best-known method; it was shown in [7] that a random choice of feature subset (allowing a single feature to appear in more than one subset) improves performance for high-dimensional problems. In [2], forward feature and random (without replacement) ...
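The random subset selection underlying RSM can be sketched as follows: each base classifier is assigned a feature subset drawn without replacement within the subset, while the same feature may appear in several subsets, as described above. The function name and the parameter values are illustrative assumptions.

```python
import numpy as np

def random_subspaces(n_features, n_subsets, subset_size, seed=0):
    """Draw one random feature subset per base classifier, in the
    spirit of the Random Subspace Method: no feature repeats
    within a subset, but features may be shared across subsets."""
    rng = np.random.default_rng(seed)
    return [rng.choice(n_features, size=subset_size, replace=False)
            for _ in range(n_subsets)]

# e.g. 5 base classifiers, each trained on 8 of 20 features
subsets = random_subspaces(n_features=20, n_subsets=5, subset_size=8)
```

Each subset would then index the columns of the training data passed to one base MLP.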