2012
DOI: 10.5120/6236-8332

More work on K-Means Clustering Algorithm: The Dimensionality Problem

Abstract: The K-means clustering algorithm is an old algorithm that has been intensely researched owing to its simplicity of implementation. However, its performance has also been criticized, in particular for requiring the value of K a priori. It is evident from previous research that providing the number of clusters a priori does not in any way assist in the production of good-quality clusters. The objective of this paper is to investigate the usefulness of K-means clustering in the clustering of high…

citations
Cited by 9 publications
(4 citation statements)
references
References 18 publications
(21 reference statements)
“…The number of atoms: It is also possible to calculate the coordinates for all the atoms present in the molecule: coords <- do.call('rbind', lapply(atoms, get.point2d)) coords R can compute a set of molecular descriptors, grouped into 5 different categories: dc <- get.desc.categories() dc [1] "hybrid" "constitutional" "topological" [4] "electronic" "geometrical" Category 2 (constitutional), important in QSAR, contains 15 descriptors, which are listed below: [14].…”
Section: Computation Of the Molecular Descriptors (Physicochemical Pr…
Citation type: mentioning
confidence: 99%
“…It measures, for each point M_i, the mean distance to each cluster, and the mean distance to the other points in its cluster. Silhouette values range between −1 and 1 [4]. A Silhouette coefficient with a value near +1 indicates that the point is far from its neighbouring cluster and very close to the cluster to which it is assigned.…”
Section: Statistics For K-means Clustering
Citation type: mentioning
confidence: 99%
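
The silhouette statistic described in the statement above can be sketched in a few lines of Python. This is a minimal illustration using the standard definition s(i) = (b(i) − a(i)) / max(a(i), b(i)) with Euclidean distance; the function name silhouette_values and the toy data are assumptions made for this example and do not come from the cited papers.

```python
import math


def silhouette_values(points, labels):
    """Return the silhouette value s(i) of every point.

    a(i): mean distance from point i to the other points of its own cluster.
    b(i): smallest mean distance from point i to the points of another cluster.
    s(i) = (b(i) - a(i)) / max(a(i), b(i)); singleton clusters get s(i) = 0.
    """
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("the silhouette needs at least two clusters")

    values = []
    for i, p in enumerate(points):
        own = clusters[labels[i]]
        if len(own) == 1:
            values.append(0.0)
            continue
        a = sum(math.dist(p, points[j]) for j in own if j != i) / (len(own) - 1)
        b = min(
            sum(math.dist(p, points[j]) for j in members) / len(members)
            for lab, members in clusters.items()
            if lab != labels[i]
        )
        denom = max(a, b)
        values.append((b - a) / denom if denom > 0 else 0.0)
    return values


# Hypothetical toy data: two tight pairs of points, ten units apart.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good_scores = silhouette_values(points, [0, 0, 1, 1])  # each value is about 0.90
bad_scores = silhouette_values(points, [0, 1, 0, 1])   # every value is negative
```

With the sensible labelling every point scores close to +1; pairing each point with a distant one instead drives the values below zero, which is the signal the statement describes.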
“…(2012) [23] They represent a method for checking the usefulness of k-means on biological data. They introduce preprocessor schemes which automatically initialize a reasonable value of k to k-mean algorithm.…”
Section: Area Of Contribution
Citation type: mentioning
confidence: 99%
“…The algorithm cannot deal with the high-dimensional data effectively. Literatures [16, 17] failed to solve the fusion problem of K-means and dimension reduction; Napoleon and Pavalakodi proposed PCA-Km algorithm [18], which applied PCA on original data set and obtained a reduced data set containing possibly uncorrelated variables; Ding and He proposed a coherent framework to adaptively select the most discriminative subspace [19].…”
Section: Related Work
Citation type: mentioning
confidence: 99%
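
The general idea of reducing the data with PCA before running K-means, mentioned in the statement above, can be illustrated with a small standard-library sketch. It is not the PCA-Km algorithm of [18], whose details are not given here: it keeps only the first principal component (approximated by power iteration) and then runs plain Lloyd-style K-means on the resulting one-dimensional scores. All names (first_component_scores, kmeans_1d) and the synthetic data are assumptions made for this example.

```python
import math
import random


def first_component_scores(rows, iters=200, seed=0):
    """Project centred rows onto the first principal component.

    The component is approximated by power iteration on X^T X,
    where X is the mean-centred data matrix.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[k] for r in rows) / n for k in range(d)]
    centred = [[r[k] - means[k] for k in range(d)] for r in rows]
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        proj = [sum(c[k] * v[k] for k in range(d)) for c in centred]
        w = [sum(proj[i] * centred[i][k] for i in range(n)) for k in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(c[k] * v[k] for k in range(d)) for c in centred]


def kmeans_1d(values, k, iters=100, seed=0):
    """Lloyd-style K-means on scalar values; returns (labels, centres)."""
    rng = random.Random(seed)
    centres = rng.sample(values, k)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: abs(x - centres[c])) for x in values]
        updated = []
        for c in range(k):
            members = [x for x, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centres[c])
        if updated == centres:
            break
        centres = updated
    return labels, centres


# Synthetic 5-dimensional data: two groups that differ only in the
# first two coordinates (an assumption chosen to keep the example small).
data_rng = random.Random(1)
group_a = [[data_rng.gauss(0.0, 0.5) for _ in range(5)] for _ in range(20)]
group_b = [[data_rng.gauss(0.0, 0.5) + (8.0 if k < 2 else 0.0) for k in range(5)]
           for _ in range(20)]

scores = first_component_scores(group_a + group_b)
labels, centres = kmeans_1d(scores, k=2)
```

Keeping a single component is enough here only because the synthetic groups are separated along one direction; real high-dimensional data would normally need several components and a multi-dimensional K-means.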