2011
DOI: 10.1002/cem.1358

One stop shopping: feature selection, classification and prediction in a single step

Abstract: We report on the application of a genetic algorithm (GA) for pattern recognition that uses both supervised and transverse learning to mine spectroscopic and proteomic data. The pattern recognition GA selects features that optimize the separation of the classes in a plot of the two or three largest principal components of the data. For training sets with small amounts of labeled data (i.e. data points tagged with a class label) and large amounts of unlabeled data (i.e. data points that are not tagged with a cla…

Cited by 24 publications
(19 citation statements)
References 32 publications
“… [34] This function employs the Hopkins statistic to assess sample clustering [35–37]. By coupling the Hopkins statistic to PCKaNN, features are selected to optimize clustering in the PC plot using all of the data points (both the training set and the blind samples via the Hopkins statistic) while simultaneously seeking to identify features that create class separation using only the labeled data points (training set samples via PCKaNN).…”
Section: Methods (mentioning)
confidence: 99%
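
For readers unfamiliar with the Hopkins statistic mentioned in the statement above, the following is a minimal sketch of one common formulation, not the fitness function used in the cited work. The function name `hopkins_statistic`, the sample size `m`, the bounding-box sampling window, and the use of distances raised to the power of the dimension are all assumptions made for illustration. Values near 0.5 suggest spatially random data, while values approaching 1 suggest clustered data.

```python
import numpy as np

def hopkins_statistic(X, m=None, seed=0):
    """Hopkins statistic of the rows of X (illustrative formulation only)."""
    X = np.asarray(X, dtype=float)
    n, d = X.shape
    if m is None:
        m = max(1, n // 10)  # assumed default: sample about 10% of the points
    rng = np.random.default_rng(seed)

    # m real points drawn without replacement from the data
    idx = rng.choice(n, size=m, replace=False)

    # m artificial points drawn uniformly from the bounding box of the data
    lo, hi = X.min(axis=0), X.max(axis=0)
    U = rng.uniform(lo, hi, size=(m, d))

    # distance from each artificial point to its nearest real point
    du = np.linalg.norm(U[:, None, :] - X[None, :, :], axis=2).min(axis=1)

    # distance from each sampled real point to its nearest *other* real point
    dw_all = np.linalg.norm(X[idx][:, None, :] - X[None, :, :], axis=2)
    dw_all[np.arange(m), idx] = np.inf  # exclude the point itself
    dw = dw_all.min(axis=1)

    u, w = (du ** d).sum(), (dw ** d).sum()
    return u / (u + w)
```

In a semi-supervised setting such as the one described, a statistic of this kind could be evaluated on the PC scores of all samples, labeled and unlabeled alike, since it does not use class labels.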
“…Over time, the pattern recognition GA learns its optimal parameters in a manner similar to a neural network. Further details about the fitness function of the pattern recognition GA used in this study can be found elsewhere [16][17][18][19][20][21].…”
Section: Pattern Recognition Methodology (mentioning)
confidence: 99%
“…The pattern recognition GA identifies a set of spectral features that optimize the separation of the classes in a plot of the two or three largest principal components (PCs) of the data. Because PCs maximize variance, the bulk of the information encoded by the selected features is about differences between the classes in the data set.…”
Section: Methods (mentioning)
confidence: 99%
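
The idea in the statement above, scoring a candidate feature subset by how well the classes separate in a plot of the largest PCs, can be sketched as follows. This is a hedged illustration only: `pc_scores` and `nn_separation` are hypothetical helper names, and the nearest-neighbor agreement score is a simple stand-in, not the PCKaNN fitness function of the pattern recognition GA. A GA would search over the `features` index list; the search itself is omitted here.

```python
import numpy as np

def pc_scores(X, features, n_components=2):
    """Project the selected columns of X onto their largest principal components."""
    Xs = np.asarray(X, dtype=float)[:, features]
    Xc = Xs - Xs.mean(axis=0)  # mean-center before PCA
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]  # sample scores

def nn_separation(scores, labels):
    """Fraction of samples whose nearest neighbor in the PC plot shares their class.

    Stand-in separation score for illustration (assumption, not PCKaNN).
    """
    labels = np.asarray(labels)
    D = np.linalg.norm(scores[:, None, :] - scores[None, :, :], axis=2)
    np.fill_diagonal(D, np.inf)  # a point is not its own neighbor
    nearest = D.argmin(axis=1)
    return float((labels[nearest] == labels).mean())

# Usage with synthetic data: columns 0 and 1 carry class information,
# columns 2 to 5 are high-variance noise.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
informative = y[:, None] * 5.0 + rng.normal(0, 0.1, (40, 2))
noise = rng.normal(0, 10.0, (40, 4))
X = np.hstack([informative, noise])

good = nn_separation(pc_scores(X, [0, 1]), y)
poor = nn_separation(pc_scores(X, [2, 3, 4, 5]), y)
```

Because the PCs capture the directions of largest variance among the selected columns only, a subset dominated by class-related features yields a PC plot in which the classes separate, whereas a subset of noisy features does not.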