Abstract--A feature ranking scheme for MLP ensembles is proposed, along with a stopping criterion based upon the out-of-bootstrap (OOB) estimate. To solve multi-class problems, feature ranking is combined with modified Error-Correcting Output Coding (ECOC). Experimental results on benchmark data demonstrate the versatility of the MLP base classifier in removing irrelevant features.
Index Terms--Classification, Multilayer Perceptrons, Pattern Analysis, Pattern Recognition.
INTRODUCTION

Whether an individual classifier or an ensemble of classifiers is employed to solve a supervised learning problem, finding relevant features for discrimination is important. Most previous research on feature relevancy has focussed on individual classifiers, but in this paper the issue is addressed for an ensemble of Multi-Layer Perceptron (MLP) classifiers. The extension of feature relevancy to classifier ensembles is not straightforward, because of the inherent trade-off between accuracy and diversity [1]. The trade-off has long been recognised, and arises because diversity must decrease as base classifiers approach the highest levels of accuracy. There is no consensus on the best way to measure ensemble diversity, and the relationship between irrelevant features and diversity is not known.

Feature relevancy is particularly important for small sample size problems, that is, when the number of patterns is fewer than the number of features [2]. With tens of features in the original set, feature selection using an exhaustive search is computationally prohibitive. Since the problem is known to be NP-hard [3], a greedy search scheme is required, and filter, wrapper and embedded approaches have been developed [4]. The advantage of an embedded method is that feature selection is inherent in the classifier itself, and there is no reliance upon a measure that is independent of the classifier.

Feature ranking is conceptually one of the simplest search schemes for feature selection, and has the advantage of scaling up to hundreds of features. Uni-dimensional feature ranking methods consider each feature in isolation, but are disadvantaged by the implicit orthogonality assumption [4], whereas multi-dimensional methods consider correlations with remaining features. In this paper, we propose an ensemble of MLP classifiers that incorporates multi-dimensional feature ranking based on MLP weights.
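As a concrete illustration of weight-based ranking, the sketch below scores each feature by the summed absolute magnitude of its input-to-hidden weights in a trained MLP. This is a minimal sketch of one common weight-saliency heuristic; the function name, the toy weight matrix, and the specific saliency measure are illustrative assumptions, not necessarily the exact scheme proposed here.

```python
import numpy as np

def rank_features_by_mlp_weights(first_layer_weights):
    """Rank features by the summed absolute magnitude of their
    input-to-hidden MLP weights (rows = features, cols = hidden
    units). Larger sums are taken to indicate more relevant
    features. Illustrative heuristic only."""
    saliency = np.abs(first_layer_weights).sum(axis=1)
    # argsort on the negated saliency gives a descending ranking:
    # the most relevant feature index comes first
    return np.argsort(-saliency)

# Toy example: 4 features, 3 hidden units; feature 2 carries the
# largest weights and should therefore rank first.
W = np.array([[0.1, -0.2,  0.05],
              [0.3,  0.1, -0.1],
              [1.5, -1.2,  0.9],
              [0.0,  0.4, -0.3]])
ranking = rank_features_by_mlp_weights(W)
```

In an ensemble setting, such per-classifier rankings could be averaged across base MLPs before discarding the lowest-ranked features.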
The ensemble uses a simple parallel Multiple Classifier System (MCS) architecture with homogeneous MLP base classifiers. There has been no systematic comparison of feature ranking methods in the context of MCS. Most previous approaches to feature selection with ensembles have focused on determining feature subsets to combine, but differ in the way the subsets are chosen. The Random Subspace Method (RSM) is the best-known method; it was shown in [7] that a random choice of feature subset (allowing a single feature to appear in more than one subset) improves performance for high-dimensional problems. In [2], forward feature and random (without replacement) ...
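The random subset selection underlying RSM can be sketched as follows: each base classifier is assigned a feature subset drawn without replacement within the subset, while the same feature may appear in several subsets, as described above. The function name and the parameter values are illustrative assumptions.

```python
import numpy as np

def random_subspaces(n_features, n_subsets, subset_size, seed=0):
    """Draw one random feature subset per base classifier, in the
    spirit of the Random Subspace Method: no feature repeats
    within a subset, but features may be shared across subsets."""
    rng = np.random.default_rng(seed)
    return [rng.choice(n_features, size=subset_size, replace=False)
            for _ in range(n_subsets)]

# e.g. 5 base classifiers, each trained on 8 of 20 features
subsets = random_subspaces(n_features=20, n_subsets=5, subset_size=8)
```

Each subset would then index the columns of the training data passed to one base MLP.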