2023
DOI: 10.1007/s13369-023-07741-9
Understanding the Interplay Between Metrics, Normalization Forms, and Data distribution in K-Means Clustering: A Comparative Simulation Study

citations
Cited by 10 publications
(7 citation statements)
references
References 34 publications
“…Similar studies were performed regarding the simple-effect of the normalization by number of researches such as [27] and [28], and other were dedicated to the simple effect of the clusters' shapes such as Kłopotek et al 2020 which proved kmeans clusters should be hyper(ball)-shaped ones to converge to the global optimum [25], similar statement was proposed by Qiu 2010 [29]. The results of Kłopotek et al 2020 [25] are in total concordance with those of El Khattabi et al 2022 [26] since it was found that normal (Gaussian) standardization are well adapted to Gaussian data-shapes, named in the EL Khattabi's paper as Likely-Gaussian datasets, and in Kłopotek's paper as hyper(ball)-shaped data. Similarly, Hennig 2022 studied nine clustering methods by means of several cluster validation indexes [30], The author measured various individual aspects of the data sets such the scales of data, the clusters separation criterion, and the datasets shapes as mainly the closeness to spatial Gaussian distribution, and so forth.…”
Section: Introduction (supporting)
confidence: 65%
“…In a previous work, the authors of the present paper experimentally proved the importance of data preparation in terms of normalization, and the importance of the data dispersion which was qualified as space data shape, then, these two characteristics were combined with different kmeans metrics for a series of datasets. The findings clearly showed the tri-fold interplay between the latter parameters but also the important sensitivity of these latter on the clustering results [26]. Similar studies were performed regarding the simple-effect of the normalization by number of researches such as [27] and [28], and other were dedicated to the simple effect of the clusters' shapes such as Kłopotek et al 2020 which proved kmeans clusters should be hyper(ball)-shaped ones to converge to the global optimum [25], similar statement was proposed by Qiu 2010 [29].…”
Section: Introduction (mentioning)
confidence: 88%