2004
DOI: 10.1007/978-3-540-30115-8_32

Feature Selection Filters Based on the Permutation Test

Abstract: We investigate the problem of supervised feature selection within the filtering framework. In our approach, applicable to two-class problems, the feature strength is inversely proportional to the p-value of the null hypothesis that its class-conditional densities, p(X | Y = 0) and p(X | Y = 1), are identical. To estimate the p-values, we use Fisher's permutation test combined with four simple filtering criteria in the roles of test statistics: sample mean difference, symmetric Kullback-Leibler…
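Although the abstract is truncated, the core procedure it describes is straightforward to sketch. The following is a minimal illustration, not the authors' implementation: it estimates a per-feature p-value with a label-permutation test using the sample mean difference (the first of the four criteria named) as the test statistic, then ranks features by ascending p-value. The permutation count, the add-one correction, and all function names are choices of this sketch.

```python
import numpy as np

def permutation_p_value(x, y, n_permutations=1000, seed=None):
    """Estimate the p-value of the null hypothesis that the class-conditional
    distributions p(X|Y=0) and p(X|Y=1) of a single feature are identical,
    using |mean(x | y=0) - mean(x | y=1)| as the test statistic."""
    rng = np.random.default_rng(seed)
    observed = abs(x[y == 0].mean() - x[y == 1].mean())
    hits = 0
    for _ in range(n_permutations):
        y_perm = rng.permutation(y)  # permuting the labels enforces the null
        if abs(x[y_perm == 0].mean() - x[y_perm == 1].mean()) >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_permutations + 1)

def rank_features(X, y, n_permutations=1000, seed=0):
    """Rank features strongest first: feature strength is inversely
    related to the permutation p-value, so sort p-values ascending."""
    p = np.array([permutation_p_value(X[:, j], y, n_permutations, seed)
                  for j in range(X.shape[1])])
    return np.argsort(p), p
```

The same loop accommodates any of the paper's other test statistics by swapping the expression computed on `x` and the (permuted) labels.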

Cited by 40 publications (37 citation statements). References 31 publications.
“…Feature filtering with the mutual information and the permutation test was also recently proposed [6,28,26], in a pure feature ranking approach where the permutation test is used to automatically set a threshold on the value of the mutual information.…”
Section: Combined Uses
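The thresholding idea in this excerpt can be sketched as follows. This is an illustrative sketch, not the cited papers' method: it assumes scikit-learn's mutual_info_classif as the mutual-information estimator, and it sets the cutoff at a quantile of the per-permutation maximum MI (a family-wise error choice made here, not taken from the source); function names and parameters are illustrative.

```python
import numpy as np
from sklearn.feature_selection import mutual_info_classif

def mi_permutation_filter(X, y, n_permutations=100, quantile=0.95, seed=0):
    """Select features whose mutual information (MI) with the labels
    exceeds a threshold derived from a permutation null distribution."""
    rng = np.random.default_rng(seed)
    mi = mutual_info_classif(X, y, random_state=seed)
    # Under permuted labels any feature/label dependence is destroyed,
    # so the max MI per permutation estimates a chance-level ceiling.
    null_max = np.array([
        mutual_info_classif(X, rng.permutation(y), random_state=seed).max()
        for _ in range(n_permutations)
    ])
    threshold = np.quantile(null_max, quantile)
    return np.flatnonzero(mi > threshold), threshold
```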
“…Because of these reasons, the variable importance statistical framework introduced by Breiman (2001), formalized in van der Laan (2006), and improved by Radivojac et al (2004) is what we will use in this paper.…”
Section: A Variable Importance
“…The features were reduced to an "optimal" set using forward selection [25], which consists essentially of ranking the features in order of relevance to the classification problem and evaluating the performance of the classifier using an increasing number of features. We computed the relevance of each feature using the information gain ratio [25] metric, which has been found to be very reliable in previous applications [26]. However, the features need to be discretized; Fayyad and Irani's minimum description length algorithm [27] was used for this.…”
Section: Optimal Feature Set Selection
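The rank-then-evaluate procedure described in this last excerpt can be sketched as below. This is a hedged illustration, not the cited work's code: scikit-learn's cross_val_score and a GaussianNB stand-in classifier are assumptions of the sketch, and the information gain ratio ranking is taken as a precomputed input, since scikit-learn does not provide that metric directly.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB

def evaluate_ranked_prefixes(X, y, ranking, estimator=None, cv=5):
    """Given a feature ranking (best first, e.g. by information gain
    ratio), evaluate the classifier on the top-k features for each k
    and return the prefix with the best cross-validated accuracy."""
    estimator = estimator if estimator is not None else GaussianNB()
    scores = []
    for k in range(1, len(ranking) + 1):
        subset = list(ranking[:k])  # top-k features in ranked order
        scores.append(cross_val_score(estimator, X[:, subset], y, cv=cv).mean())
    best_k = int(np.argmax(scores)) + 1
    return list(ranking[:best_k]), scores
```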