2011
DOI: 10.1186/1471-2105-12-450
Random KNN feature selection - a fast and stable alternative to Random Forests

Abstract: Background: Successfully modeling high-dimensional data involving thousands of variables is challenging. This is especially true for gene expression profiling experiments, given the large number of genes involved and the small number of samples available. Random Forests (RF) is a popular and widely used approach to feature selection for such "small n, large p" problems. However, Random Forests suffers from instability, especially in the presence of noisy and/or unbalanced inputs. Results: We present RKNN-FS, an inn…
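The abstract sketches the RKNN idea: an ensemble of KNN classifiers, each built on a random subset of the features, with features scored by how well the subsets containing them perform. The following is a minimal illustrative sketch of that idea, not the authors' implementation; the subset size `m` and the leave-one-out accuracy used as the "support" score here are assumed simplifications of the paper's criterion.

```python
import random
from collections import Counter

def knn_predict(train_X, train_y, x, features, k=3):
    """Classify x by majority vote of its k nearest training points,
    measuring squared Euclidean distance over the chosen feature subset only."""
    dists = sorted(
        (sum((tx[f] - x[f]) ** 2 for f in features), label)
        for tx, label in zip(train_X, train_y)
    )
    return Counter(label for _, label in dists[:k]).most_common(1)[0][0]

def rknn_feature_support(X, y, n_models=200, m=2, k=3, seed=0):
    """Score each feature by the mean leave-one-out accuracy of the
    random-subset KNN classifiers it participates in (assumed scoring rule)."""
    rng = random.Random(seed)
    p = len(X[0])
    acc_sum, count = [0.0] * p, [0] * p
    for _ in range(n_models):
        feats = rng.sample(range(p), m)  # random feature subset for this base KNN
        correct = sum(
            knn_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], X[i], feats, k) == y[i]
            for i in range(len(X))
        )
        acc = correct / len(X)
        for f in feats:  # credit every feature in the subset with this accuracy
            acc_sum[f] += acc
            count[f] += 1
    return [acc_sum[f] / count[f] if count[f] else 0.0 for f in range(p)]

# Tiny synthetic check: feature 0 separates the two classes, 1 and 2 are noise.
data_rng = random.Random(1)
X = [[i % 2 + data_rng.uniform(-0.2, 0.2),
      data_rng.uniform(0, 1),
      data_rng.uniform(0, 1)] for i in range(20)]
y = [i % 2 for i in range(20)]
scores = rknn_feature_support(X, y)
```

On this toy data the informative feature receives the highest support score, because every random subset containing it yields an accurate KNN classifier, while subsets of pure noise features hover near chance.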

Cited by 106 publications (66 citation statements). References 32 publications.
“…In Drzewiecki (2016b) nine machine learning (ML) regression algorithms were tested: Cubist (Quinlan, 1993), Random Forest (RF) (Breiman, 2001), stochastic gradient boosting of regression trees (GBM) (Friedman, 2002), k-nearest neighbors (kNN), random k-nearest neighbors (rkNN) (Li et al, 2011), Multivariate Adaptive Regression Splines (MARS) (Friedman, 1991), averaged neural networks (avNN) (Ripley, 1996), support vector machines (Smola and Schölkopf, 2004) with polynomial (SVMp) and radial (SVMr) kernels. For every study area, each of them was used to predict imperviousness for both mid 1990s and late 2000s.…”
Section: Detection of Relevant Changes (mentioning; confidence: 99%)
“…Random forests (RF) is one of the most important supervised methods for feature gene selection (16-18). During the classifying process, RF returns several measures of variable importance.…”
Section: Methods (mentioning; confidence: 99%)
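The excerpt above notes that RF returns several measures of variable importance during classification. As a hedged illustration (using scikit-learn's `RandomForestClassifier` on synthetic data, not the cited study's setup), the built-in mean-decrease-in-impurity scores can be read off after fitting:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic "small n, large p"-flavored data: with shuffle=False, the
# 3 informative features occupy columns 0-2 and the rest are noise.
X, y = make_classification(n_samples=200, n_features=10, n_informative=3,
                           n_redundant=0, shuffle=False, random_state=0)

rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
imp = rf.feature_importances_  # one mean-decrease-in-impurity score per feature
```

The importances sum to one, and the informative columns should collectively dominate the noise columns, which is the ranking signal feature-selection pipelines built on RF typically exploit.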
“…With regard to classification model selection, different algorithms have been studied for the identification of differentially expressed genes in genomic data. Classification methods such as Multilayer Perceptron (NN) [23], [24], [15], Support Vector Machines (SVM) [25], Naive Bayes (NB) [26], k-Nearest Neighbour (kNN) [27], Decision Trees (DT) [28], and RF (Random Forest) [29] have been used in recent studies. Finally, prediction assessment refers to the performance of the predictive models.…”
Section: Introduction (mentioning; confidence: 99%)