2018
DOI: 10.18637/jss.v085.i12

rmcfs: An R Package for Monte Carlo Feature Selection and Interdependency Discovery

Abstract: We describe the R package rmcfs that implements an algorithm for ranking features from high dimensional data according to their importance for a given supervised classification task. The ranking is performed prior to addressing the classification task per se. This R package is the new and extended version of the MCFS (Monte Carlo feature selection) algorithm where an early version was published in 2005. The package provides an easy R interface, a set of tools to review results and the new ID (interdependency d…

Citations: cited by 31 publications (30 citation statements)
References: 29 publications
“…Then, we identified genomic marks important for classification of compartments. We used the relative importance (RI) of the feature values, an important statistical measure from the Monte Carlo Feature Selection method and ANOVA test, to indicate the most informative features for a classifier [9, 11]. The ranking of the features was based on the product of RI and ANOVA F-statistic.…”
Section: Results (mentioning)
confidence: 99%
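
The ranking described in this statement (product of RI and the ANOVA F-statistic) can be sketched roughly as follows. This is only an illustration under assumptions: the RI values are taken as given inputs (in practice they would come from an MCFS run), the helper names `anova_f` and `rank_by_ri_times_f` are hypothetical, and the citing paper's exact procedure is not shown in the excerpt.

```python
from statistics import mean


def anova_f(values, labels):
    """One-way ANOVA F-statistic of one feature across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand = mean(values)
    # Variation of the class means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    # Variation of the samples around their own class mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf")
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_by_ri_times_f(columns, labels, ri):
    """Rank feature names by RI * F (hypothetical helper; RI is supplied)."""
    scores = {name: ri[name] * anova_f(vals, labels)
              for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True), scores


# Toy data: "mark1" separates the two classes, "mark2" does not.
labels = ["A", "A", "A", "B", "B", "B"]
columns = {"mark1": [1, 2, 3, 7, 8, 9], "mark2": [1, 5, 9, 2, 6, 10]}
ri = {"mark1": 0.5, "mark2": 0.5}  # assumed RI values, not real results
order, scores = rank_by_ri_times_f(columns, labels, ri)
```

With equal RI values the ordering is driven entirely by the F-statistic, so `mark1` ranks first on this toy data.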
“…Monte-Carlo Feature Selection (MCFS) and inter-dependency discovery has been used for ranking the feature importance. In MCFS the relative importance of features is estimated by building hundreds of trees for a randomly selected subset of features [35]. In a mathematic notion, i subsets of m randomly selected features are constructed where m << n, n being the total number of features and for each subset, k trees are constructed and their performance is assessed for classification/regression.…”
Section: Machine-learning Driven Methods For Sds Estimation (mentioning)
confidence: 99%
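
The sampling scheme described in this statement (many random subsets of m features, k trees per subset, each tree assessed on held-out data) can be sketched as below. This is a deliberately simplified illustration, not the rmcfs implementation: it swaps full decision trees for depth-1 stumps, and it scores a feature by test accuracy times information gain, which is an assumed stand-in for the package's RI formula. All function names are hypothetical.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def best_stump(rows, labels, features):
    """Return (gain, feature, threshold) of the best single split, or (0.0, None, None)."""
    base, best = entropy(labels), (0.0, None, None)
    for f in features:
        # Drop the largest value so the right-hand side is never empty.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / len(labels)
            if gain > best[0]:
                best = (gain, f, t)
    return best


def mcfs_sketch(rows, labels, n_subsets=200, m=2, k=5, seed=0):
    """Accumulate accuracy-weighted gain per feature (simplified, assumed scoring)."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    importance = [0.0] * n_features
    idx = list(range(len(rows)))
    for _ in range(n_subsets):
        features = rng.sample(range(n_features), m)  # random subset, m << n
        for _ in range(k):                           # k stumps per subset
            rng.shuffle(idx)                         # fresh random train/test split
            cut = (2 * len(idx)) // 3
            train, test = idx[:cut], idx[cut:]
            tr_rows = [rows[i] for i in train]
            tr_y = [labels[i] for i in train]
            gain, f, t = best_stump(tr_rows, tr_y, features)
            if f is None:
                continue  # no informative split in this subset
            left = majority([y for r, y in zip(tr_rows, tr_y) if r[f] <= t])
            right = majority([y for r, y in zip(tr_rows, tr_y) if r[f] > t])
            hits = sum((left if rows[i][f] <= t else right) == labels[i]
                       for i in test)
            importance[f] += (hits / len(test)) * gain
    return importance
```

On data where a single feature determines the class, that feature should accumulate by far the largest score, because it wins every subset it is drawn into and its stumps generalise to the held-out split.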
“…With consideration of high dimensionality and small sample size of the 450k methylation data, embedded feature selection methods could be a practical choice for the appropriate computation complexity. Thus, we choose R packages “glmnet,” “MDFS” and “rmcfs” as the basic feature selection approaches (Friedman et al., 2010; Draminski and Koronacki, 2018; Piliszek et al., 2018). Taking the advantages of combining L1 and L2 regularization (elastic net), glmnet can achieve variable extraction for the microarray data with high dimension but small number of samples.…”
Section: Methods (mentioning)
confidence: 99%
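
For reference, the elastic net penalty mentioned in this statement combines the L1 and L2 norms of the coefficient vector. For a Gaussian response, the objective minimised by glmnet can be written as below, where λ ≥ 0 sets the overall penalty strength and α ∈ [0, 1] mixes the two norms (α = 1 is the lasso, α = 0 is ridge). The symbols N, x_i, y_i, β_0 and β are notation introduced here for illustration; the citing paper's excerpt does not state its settings.

```latex
\min_{\beta_0,\,\beta}\;
\frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\;
\lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 + \alpha\,\lVert\beta\rVert_1\right]
```

The L1 term drives some coefficients exactly to zero, which is what makes the method usable for variable selection when features far outnumber samples.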