2015
DOI: 10.1007/978-3-319-18781-5_2

Selection of Significant Features Using Monte Carlo Feature Selection

Abstract: Feature selection methods identify subsets of features in large datasets. Such methods have become popular in data-intensive areas, and performing feature selection prior to model construction may reduce the computational cost and improve the model quality. Monte Carlo Feature Selection (MCFS) is a feature selection method aimed at finding features to use for classification. Here we suggest a strategy using a z-test to compute the significance of a feature using MCFS. We have used simulated data with both info…
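The Monte Carlo subsampling idea behind MCFS can be illustrated with a small, stdlib-only sketch. This is a toy under loose assumptions, not the actual MCFS algorithm: instead of training full decision trees on each random feature subset, every subset is scored with a one-feature median-threshold rule, and the winning feature is credited with its accuracy. The helper names `stump_accuracy` and `mc_relative_importance` are hypothetical.

```python
import random
import statistics

def stump_accuracy(X, y, f):
    """Accuracy of a one-feature threshold rule (split at the median),
    taken orientation-free so class labels may be swapped."""
    col = [row[f] for row in X]
    thr = statistics.median(col)
    pred = [1 if v > thr else 0 for v in col]
    acc = sum(int(p == t) for p, t in zip(pred, y)) / len(y)
    return max(acc, 1.0 - acc)

def mc_relative_importance(X, y, subset_size=3, n_subsets=1000, seed=0):
    """Toy Monte Carlo loop: draw random feature subsets, let the
    features in each subset compete, and credit the winner with its
    accuracy.  Averaging over subsets yields a crude importance score."""
    rng = random.Random(seed)
    n_features = len(X[0])
    ri = [0.0] * n_features
    for _ in range(n_subsets):
        subset = rng.sample(range(n_features), subset_size)
        # fit a depth-1 "tree": the best stump within this subset
        best = max(subset, key=lambda f: stump_accuracy(X, y, f))
        ri[best] += stump_accuracy(X, y, best)
    return [v / n_subsets for v in ri]
```

In the real method, many decision trees are trained on (feature subset, training split) pairs and the relative importance (RI) aggregates the trees' weighted accuracy and per-node information gain; the sketch above only preserves the subsampling-and-aggregation structure.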


Cited by 6 publications (3 citation statements)
References 9 publications
“…For each feature, its RI in the original ranking (without any permutation) is compared against the corresponding distribution of RIs from the experiments with permuted decision attribute (the z test is used this time). This method has been described and studied in Bornelöv and Komorowski (2016). It gives similar results to the previous one but needs more MCFS runs.…”
Section: Determining the Cutoff Value
confidence: 94%
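The cutoff strategy in the quoted passage admits a compact sketch: compare a feature's RI from the original run against the mean and standard deviation of RIs from runs with a permuted decision attribute. `z_test_significance` is a hypothetical helper name; the normal approximation and the one-sided alternative are assumptions of this sketch, not a transcription of the cited procedure.

```python
import math
import statistics

def z_test_significance(ri_original, ri_permuted):
    """One-sided z-test: is the feature's relative importance (RI) in
    the original ranking larger than expected under the null
    distribution of RIs from permuted-decision runs?"""
    mu = statistics.mean(ri_permuted)
    sd = statistics.stdev(ri_permuted)
    z = (ri_original - mu) / sd
    # one-sided p-value from the standard normal survival function
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p
```

A feature whose original RI sits far in the right tail of the permutation distribution gets a large z and a small p, and would be kept as significant.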
“…In this study, a range of methods were used. These methods are outlined below; they have been reported on extensively previously [30-32,37,39,41-59] and detailed methods are also included as Appendix S1.…”
Section: Methods
confidence: 99%
“…Determining the importance of variables is not compulsory for constructing random forests, but it can be computed as a subroutine during the forest's construction [39,40]. A ranking of features by variable significance can thus be considered a by-product of the classifier [41]. In this approach we rely heavily on the classifier, yet we do not use it for the classification work per se.…”
Section: Monte Carlo Feature Selection and Interdependency Discovery
confidence: 99%
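The "by-product" view of variable importance in the last statement can be sketched with Breiman-style permutation importance: a fitted classifier is used only to measure how much accuracy drops when each feature's column is shuffled, never for the classification task itself. This is a generic stdlib sketch, not the cited papers' exact procedure; `permutation_importance` and the `predict` callable are hypothetical names.

```python
import random

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average drop in accuracy when one feature's column is shuffled.
    `predict` is any fitted classifier's prediction function; the
    importance ranking is extracted as a by-product of that model."""
    rng = random.Random(seed)
    base = sum(int(p == t) for p, t in zip(predict(X), y)) / len(y)
    n_features = len(X[0])
    drops = []
    for f in range(n_features):
        total = 0.0
        for _ in range(n_repeats):
            col = [row[f] for row in X]
            rng.shuffle(col)  # break the feature/label association
            Xp = [row[:f] + [v] + row[f + 1:] for row, v in zip(X, col)]
            acc = sum(int(p == t) for p, t in zip(predict(Xp), y)) / len(y)
            total += base - acc
        drops.append(total / n_repeats)
    return drops
```

Features whose shuffling barely changes the accuracy receive a drop near zero, so sorting by `drops` yields the importance ranking without ever deploying the classifier for prediction.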