2013
DOI: 10.1016/j.patcog.2012.09.005

Stratified sampling for feature subspace selection in random forests for high dimensional data

Cited by 146 publications (73 citation statements)
References 32 publications
“…Therefore, the land cover map of North Korea, classified using spot images, was referenced to distinguish deforestation types. This sampling method was demonstrated to have high accuracy for use with RF [40]. All training points were selected across North Korea with point centers at least 500 m apart [15].…”
Section: Training Samples Collection (mentioning)
confidence: 99%
“…The sample size per class was set to a minimum of 200 points for classification [41], and the total sample size was 1660 points containing the survey points (Figure 3). This sampling method was demonstrated to have high accuracy for use with RF [40]. All training points were selected across North Korea with point centers at least 500 m apart [15].…”
Section: Training Samples Collection (mentioning)
confidence: 99%
“…[6] used random forest for both gene selection and gene classification in DNA Microarray data, benefitting from the variable importance measure that is offered as a byproduct of random forest. In recent years, many studies have explored extensions and improvement of the original random forest idea, some in a spirit similar to our present work like [23] who select the features for the subspace using weights inspired by the relationship between a given variable and the response. In the context of high dimensional response (output) space, [15] is yet another interesting adaptation of random forest aimed at attaining ever greater predictive performance.…”
Section: Problem Formulation (mentioning)
confidence: 99%
“…Some authors before us, like [23], in their recent work stratified sampling for feature subspace selection in random forests for high dimensional data, have weighted the trees comprising the random subspace ensembles. However, the manner in which they implement data-driven weights in their stratified sampling scheme is markedly different from our method.…”
Section: Problem Formulation (mentioning)
confidence: 99%
“…RFs have been shown to particularly work well in many prediction problems with a large number of predictors [1,12,17]. There can also be benefit if RFs are applied in some problems with a much lower dimension, potentially producing better predictions than other procedures.…”
Section: Improvement By Variable Augmentation In Real Data Examples (mentioning)
confidence: 99%