2011
DOI: 10.1007/978-1-4419-7046-6_10
Feature Selection in Gene Expression Data Using Principal Component Analysis and Rough Set Theory

Abstract: In many fields such as data mining, machine learning, pattern recognition and signal processing, data sets containing huge number of features are often involved. Feature selection is an essential data preprocessing technique for such high-dimensional data classification tasks. Traditional dimensionality reduction approach falls into two categories: Feature Extraction (FE) and Feature Selection (FS). Principal component analysis is an unsupervised linear FE method for projecting high-dimensional data into a low…


Cited by 28 publications (13 citation statements)
References 7 publications
“…First, the proposed model by Dash et al. [20] and Mishra et al. [21] was evaluated on a synthetic dataset with 15 data objects having 10 attributes. Second, we evaluated the MSR approach of Cheng & Church [2] and the pattern-based approach [1] to find coherent patterns from the synthetic data set as well as the Yeast data set.…”
Section: Experimental Evaluation and Results Analysis
confidence: 99%
“…Principal Component Analysis (PCA) [15,16,20,21] is an unsupervised linear feature reduction method that projects high-dimensional data into a new low-dimensional representation describing as much of the variance in the data as possible with minimum loss of information. PCA is mathematically defined as an orthogonal linear transformation that transforms the data to a new coordinate system such that the greatest variance by any projection of the data comes to lie on the first coordinate (called the first principal component), the second greatest variance on the second coordinate, and so on.…”
Section: PCA for Dimensionality Reduction
confidence: 99%
“…More than 20% of the attribute alleles were discarded when these algorithms were applied to the original dataset. Each attribute weighting system uses a specific pattern to define the most important features by feature selection [37,39]. Thus, the results may differ [40], as has been highlighted in previous studies [13]–[17].…”
Section: Discussion
confidence: 98%
“…They have reported that PCA can decrease the computational time significantly, which can further be used for real-time purposes. Before applying PCA, the extracted features are normalized using the z-score to avoid scaling effects, ensuring that a feature with a larger domain does not dominate features with smaller domains [64,65,66]. A feature value X of a feature F is normalized to X′ using Equation (3):

X′ = (X − μ(F)) / σ(F)

where X is the feature value, μ(F) is the arithmetic mean of all values of feature F, σ(F) is the standard deviation of all values of feature F, and X′ is the normalized feature value.…”
Section: Methods
confidence: 99%
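The z-score normalization in Equation (3) is straightforward to apply column-wise; a minimal sketch (illustrative names, with a divide-by-zero guard added for constant features):

```python
import numpy as np

def z_score_normalize(X):
    """Normalize each feature (column) of X to zero mean and unit
    variance: X' = (X - mu(F)) / sigma(F), per Equation (3)."""
    mu = X.mean(axis=0)        # mu(F): mean of each feature
    sigma = X.std(axis=0)      # sigma(F): std of each feature
    sigma[sigma == 0] = 1.0    # guard: constant features stay at zero
    return (X - mu) / sigma

# Two features with very different domains end up on the same scale.
X = np.array([[1.0, 100.0],
              [2.0, 200.0],
              [3.0, 300.0]])
Z = z_score_normalize(X)
print(Z.round(3))
```

After normalization both columns have mean 0 and standard deviation 1, so neither dominates the covariance structure that PCA analyzes.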