2014
DOI: 10.1073/pnas.1220873111

Knowledge discovery by accuracy maximization

Abstract: Here we describe KODAMA (knowledge discovery by accuracy maximization), an unsupervised and semisupervised learning algorithm that performs feature extraction from noisy and high-dimensional data. Unlike other data mining methods, the peculiarity of KODAMA is that it is driven by an integrated procedure of cross-validation of the results. The discovery of a local manifold's topology is led by a classifier through a Monte Carlo procedure of maximization of cross-validated predictive accuracy. Briefly, our approa…
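
The idea in the abstract, a classifier guiding a Monte Carlo search that maximizes cross-validated accuracy, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated, so the choice of a 1-nearest-neighbour classifier, leave-one-out cross-validation, the proposal rule, and the function names `loo_1nn_accuracy` and `maximize_cv_accuracy` are all assumptions made for illustration.

```python
import random


def nearest_neighbour(points, i):
    """Index of the point closest to points[i] (squared Euclidean distance)."""
    best_j, best_d = None, float("inf")
    for j, q in enumerate(points):
        if j == i:
            continue
        d = sum((a - b) ** 2 for a, b in zip(points[i], q))
        if d < best_d:
            best_j, best_d = j, d
    return best_j


def loo_1nn_accuracy(points, labels):
    """Leave-one-out accuracy of a 1-NN classifier for a given labelling."""
    hits = sum(labels[nearest_neighbour(points, i)] == labels[i]
               for i in range(len(points)))
    return hits / len(points)


def maximize_cv_accuracy(points, n_iter=200, seed=0):
    """Monte Carlo search over labellings (illustrative assumption only).

    Every sample starts in its own class. Each step proposes that one
    random sample adopts the label predicted for it by its nearest
    neighbour; the proposal is kept only if the cross-validated accuracy
    does not decrease.
    """
    rng = random.Random(seed)
    labels = list(range(len(points)))
    best = loo_1nn_accuracy(points, labels)
    for _ in range(n_iter):
        i = rng.randrange(len(points))
        candidate = labels[:]
        candidate[i] = labels[nearest_neighbour(points, i)]
        acc = loo_1nn_accuracy(points, candidate)
        if acc >= best:
            labels, best = candidate, acc
    return labels, best


if __name__ == "__main__":
    # Two well-separated toy groups (hypothetical data).
    data = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
    print(maximize_cv_accuracy(data, n_iter=100, seed=1))
```

Because labels only ever propagate between nearest neighbours, the resulting labelling reflects local structure in the data, which is the intuition behind using cross-validated accuracy as the quantity to maximize.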

Cited by 51 publications (36 citation statements)
References 41 publications
“…Recently, Cacciatore et al proposed a new method-knowledge discovery by accuracy maximization [34], which is used for feature extraction from noised and high-dimensional data. The peculiarity of this algorithm is that it can find the topology of the local manifold of the data through maximization of predictive accuracy with a classifier.…”
Section: B Band Selection With Clustering Based On Classifier
mentioning
confidence: 99%
“…The peculiarity of this algorithm is that it can find the topology of the local manifold of the data through maximization of predictive accuracy with a classifier. Inspired by this, we use the maximization of cross-validated accuracy of [34] to cluster similar bands, which can automatically determine the number of clusters according to the data distribution.…”
Section: B Band Selection With Clustering Based On Classifier
mentioning
confidence: 99%
“…We denote it by LLE(X). However, in practice, the data matrix such as the genetic data studied in [34,5] often has missing values. A common approach is to impute all the missing values.…”
Section: Our Formulation and Relevant Research Suppose
mentioning
confidence: 99%
“…To compare the efficiency of the audio recognition process with standard chemometrics analysis, the corresponding eight NMR spectra were analyzed, after Fourier transformation and pre-processing, using the clustering algorithm k-means, which is one of the most widely used and best performing clustering algorithms (Cacciatore et al, 2014; MacQueen, 1967). The clustering obtained by k-means algorithm achieved an ARI score of 0.16, which is larger than the score for non-musicians, but interestingly, much smaller than those obtained by musicians.…”
Section: Clustering Test
mentioning
confidence: 99%
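
The ARI (adjusted Rand index) quoted in the statement above compares a clustering against reference labels, correcting for agreement expected by chance. A minimal sketch of the standard pair-counting formula follows; the function name `adjusted_rand_index` and the example labels are hypothetical and are not taken from the cited study.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index from pair counts (assumes at least 2 samples)."""
    n = len(labels_true)
    # Pairs of samples that share a cluster in both partitions.
    sum_ij = sum(comb(c, 2)
                 for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs of samples that share a cluster within each partition alone.
    sum_a = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_a * sum_b / comb(n, 2)  # chance-level agreement
    max_index = (sum_a + sum_b) / 2        # best achievable agreement
    if max_index == expected:
        # Degenerate case, e.g. both partitions put everything in one cluster.
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


if __name__ == "__main__":
    # Hypothetical labels for eight samples in two reference groups.
    reference = [0, 0, 0, 0, 1, 1, 1, 1]
    clustering = [0, 0, 1, 1, 1, 1, 0, 1]
    print(adjusted_rand_index(reference, clustering))
```

A value of 1 means the two partitions agree exactly (up to relabelling), values near 0 indicate chance-level agreement, and negative values indicate less agreement than expected by chance.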
“…Its individuality is given by the interplay between different concentration levels and by covariance patterns of different molecules. Disentangling and extracting the constituent building blocks of the metabolic fingerprint from high-dimensional data requires the application of multivariate statistical tools and/or machine learning algorithms (Cacciatore et al, 2014; Jansen et al, 2005; Saccenti et al, 2013; Trygg et al, 2007; Weckwerth et al, 2005), the latter often requiring a large number of samples in order to build and train predictive models (Stockwell et al, 2002).…”
Section: Introduction
mentioning
confidence: 99%