2014
DOI: 10.5120/16456-2390

Survey on Feature Selection for Data Reduction

Abstract: Advances in storage capabilities and data collection have led to an information overload, and databases are growing in dimension not only in rows but also in columns. Data reduction (DR) plays a vital role as a data preprocessing technique in the area of knowledge discovery from huge collections of data. Feature selection (FS) is one of the well-known data reduction techniques; it deals with the reduction of attributes from the original data without affecting the main information content. Base…
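To make the column-wise reduction concrete, here is a minimal sketch assuming scikit-learn; the dataset and the mutual-information selector are illustrative stand-ins, not the specific methods the survey covers.

```python
# Minimal illustration: column-wise data reduction via feature selection.
# Assumes scikit-learn; the dataset and the chosen selector are illustrative.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)
print(X.shape)  # (569, 30): 30 original attributes (columns)

# Keep the 10 attributes carrying the most mutual information with the label;
# the rows (samples) are untouched -- only the column dimension shrinks.
selector = SelectKBest(mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)
print(X_reduced.shape)  # (569, 10)
```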

Cited by 8 publications (7 citation statements)
References 56 publications
“…757 may act as a way to reduce the dimensionality and ease the computation of the KNN model, but it does not influence the performance of the model. This result is supported by previous studies, which claimed that there is a high tendency for the complexity of the computation to be reduced without affecting the performance of a classification model [23][24]. Hence, this study found that the accuracy of the KNN model remains the same when feature selection is applied to a normally distributed data set.…”
Section: Results (supporting)
confidence: 87%
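The excerpt's claim can be checked with a short sketch, assuming scikit-learn; the dataset, the ANOVA filter, and k = 5 are illustrative choices, not the cited study's setup.

```python
# Sketch of the comparison described above: KNN accuracy with and without
# feature selection. Dataset and parameters are illustrative assumptions.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# Baseline: KNN on all 30 features.
knn_all = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))
acc_all = cross_val_score(knn_all, X, y, cv=5).mean()

# Reduced: an ANOVA filter keeps 10 features before the same KNN.
knn_fs = make_pipeline(StandardScaler(),
                       SelectKBest(f_classif, k=10),
                       KNeighborsClassifier(n_neighbors=5))
acc_fs = cross_val_score(knn_fs, X, y, cv=5).mean()

# If the claim holds, the two scores are close while KNN's distance
# computations now run over a third of the columns.
print(f"all features: {acc_all:.3f}, after selection: {acc_fs:.3f}")
```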
“…Nevertheless, it is clear from this work that the use of PCA with varimax-rotated factor loadings contributed to a higher identification performance than the traditional PCA methodology. Even though the classification reliability of PCA varimax and of the choice of all features is shown to be poor, it is important to note that around half of the information is discarded and that the processing time used for non-trivial real-time classification decreases dramatically [17][18][19]. Ultimately, the k-NN method gives a 3.5% error for the tested system when features are chosen using PCA with a varimax rotation, the highest classification level.…”
Section: Fig 42 Temporal Window Extraction for Feeding Process (mentioning)
confidence: 99%
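A hedged sketch of the pipeline this excerpt describes: PCA loadings rotated with a standard Kaiser varimax iteration, then k-NN on the rotated scores. The dataset, component count, and k are illustrative assumptions, not the cited work's configuration.

```python
# Varimax rotation of PCA loadings followed by k-NN classification.
# The varimax routine follows the standard Kaiser iteration.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

def varimax(loadings, gamma=1.0, max_iter=100, tol=1e-6):
    """Return a k x k rotation maximising the varimax criterion for a p x k loading matrix."""
    p, k = loadings.shape
    R = np.eye(k)
    var = 0.0
    for _ in range(max_iter):
        L = loadings @ R
        u, s, vt = np.linalg.svd(
            loadings.T @ (L**3 - (gamma / p) * L @ np.diag((L**2).sum(axis=0)))
        )
        R = u @ vt
        var_new = s.sum()
        if var_new < var * (1 + tol):  # stop when the criterion plateaus
            break
        var = var_new
    return R

X, y = load_breast_cancer(return_X_y=True)
X = StandardScaler().fit_transform(X)

pca = PCA(n_components=10).fit(X)   # keep roughly a third of the columns
R = varimax(pca.components_.T)      # rotate the loadings
scores = X @ pca.components_.T @ R  # project onto the rotated axes

acc = cross_val_score(KNeighborsClassifier(n_neighbors=5), scores, y, cv=5).mean()
print(f"k-NN accuracy on varimax-rotated PCA scores: {acc:.3f}")
```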
“…Despite the fact that feature subset selection methods have been well known for more than four decades, a huge number of new feature selection surveys were published during the last few years ([5], [6], [7], [8], [9], [10], [11], [12]). This supports the well-known and still increasing importance of good feature selection for the design of prediction and classification systems.…”
Section: Introduction (mentioning)
confidence: 99%
“…This supports the well-known and still increasing importance of good feature selection for the design of prediction and classification systems. For wrapper approaches, the reviews mostly adopt the following claims: (1) wrappers often reach better results than filters, because their feature selection criteria are more natural representatives of the prediction quality and are tailored to the particular predictor [5], [11]; (2) wrappers are often much slower than filters, because of their time-consuming feature selection criteria [6], [5]; (3) wrappers exhibit a higher risk of over-fitting [11]. This paper proves significant benefits of two particular wrapper approaches in terms of high dimensionality reduction and, for most datasets, negligible performance degradation.…”
Section: Introduction (mentioning)
confidence: 99%
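The wrapper-versus-filter contrast in claims (1) and (2) can be sketched as follows, assuming scikit-learn; SequentialFeatureSelector stands in for a generic wrapper and an ANOVA SelectKBest for a filter, and the estimator and dataset are illustrative.

```python
# Wrapper vs. filter feature selection: the filter scores features once,
# the wrapper repeatedly refits the predictor itself -- hence the speed gap.
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, SequentialFeatureSelector, f_classif
from sklearn.neighbors import KNeighborsClassifier

X, y = load_breast_cancer(return_X_y=True)
knn = KNeighborsClassifier(n_neighbors=5)

# Filter: scores each feature once, independently of the predictor -- fast.
filter_mask = SelectKBest(f_classif, k=10).fit(X, y).get_support()

# Wrapper: greedily adds features, refitting and cross-validating the KNN
# at every step -- the criterion is the predictor's own accuracy.
wrapper_mask = SequentialFeatureSelector(
    knn, n_features_to_select=10, direction="forward", cv=5
).fit(X, y).get_support()

print("filter picked: ", filter_mask.nonzero()[0])
print("wrapper picked:", wrapper_mask.nonzero()[0])
```

Because the wrapper's criterion is the predictor's own cross-validated score, the two selectors generally pick different subsets, which illustrates claim (1); the repeated refitting is what makes the wrapper slower, as in claim (2).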