2022
DOI: 10.3233/faia220335

Bootstrap-CURE Clustering: An Investigation of Impact of Shrinking on Clustering Performance

Abstract: Hierarchical clustering is one of the most popular techniques in unsupervised segmentation. However, because it relies on constructing a pairwise distance matrix, its complexity is quadratic, so it is rarely used on very large data sets. CURE clustering tackles this challenge by first running hierarchical clustering on a smaller sample, then taking a set of representative points from each resulting cluster to estimate the cluster's shape. A KNN process with thos…
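The pipeline outlined in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function names (`representatives`, `cure_sketch`), the Ward linkage, the farthest-point choice of representatives, the shrink factor `alpha`, and the 1-nearest-representative assignment are all assumptions made for the example, since the abstract is truncated before it describes the KNN step.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster


def representatives(points, n_rep, alpha):
    """Pick up to n_rep well-scattered points and shrink them toward the centroid."""
    centroid = points.mean(axis=0)
    # Farthest-point heuristic (assumed): start from the point farthest from the centroid.
    chosen = [int(np.argmax(np.linalg.norm(points - centroid, axis=1)))]
    while len(chosen) < min(n_rep, len(points)):
        # Distance from every point to its nearest already-chosen representative.
        d = np.min(
            np.linalg.norm(points[:, None, :] - points[chosen][None, :, :], axis=2),
            axis=1,
        )
        chosen.append(int(np.argmax(d)))
    reps = points[chosen]
    # Shrinking: alpha = 0 keeps the scattered points, alpha = 1 collapses them to the centroid.
    return reps + alpha * (centroid - reps)


def cure_sketch(X, n_clusters, sample_size, n_rep=5, alpha=0.3, seed=0):
    """Cluster a random sample hierarchically, then label all points by nearest representative."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
    sample = X[idx]
    # Hierarchical clustering on the sample only (Ward linkage is an assumption).
    tree = linkage(sample, method="ward")
    sample_labels = fcluster(tree, t=n_clusters, criterion="maxclust")
    reps, rep_labels = [], []
    for lab in np.unique(sample_labels):
        r = representatives(sample[sample_labels == lab], n_rep, alpha)
        reps.append(r)
        rep_labels.extend([lab] * len(r))
    reps = np.vstack(reps)
    rep_labels = np.asarray(rep_labels)
    # Assumed assignment step: each point takes the label of its nearest representative.
    d = np.linalg.norm(X[:, None, :] - reps[None, :, :], axis=2)
    return rep_labels[np.argmin(d, axis=1)]
```

Only the sample goes through the quadratic hierarchical step; the full data set is touched once, in the final assignment. Varying `alpha` between 0 and 1 is one way to probe how much shrinking changes the result.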

Citations: cited by 1 publication (1 citation statement)
References: 11 publications (19 reference statements)
“…As the current study has an exploratory design, we first conducted a hierarchical cluster analysis based on the Z scores for all scales using the total sample, using the Ward's method with Euclidean distance. Ward's method was suggested to be more appropriate for various types of data structures compared to other hierarchical algorithms [45], and the Euclidean distance, a commonly used distance measure, is known to be more suitable for numerical variables [46,47]. Overall, the hierarchical cluster analysis is used to disclose the naturally occurring subgroups in the sample that are homogenous with regards to highly similar observations they contain, yet significantly different from each other [48].…”
Section: Discussion (citation type: mentioning)
confidence: 99%
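The procedure described in the citation statement (hierarchical clustering of Z scores with Ward's method and Euclidean distance) might look like the sketch below. The function name `ward_on_zscores`, the use of the sample standard deviation (`ddof=1`), and the choice to cut the tree into a fixed number of clusters are assumptions for illustration; the citing study does not state these details here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import zscore


def ward_on_zscores(scores, n_clusters):
    """Standardize each scale to Z scores, then cluster with Ward's method."""
    # Column-wise Z scores; a constant column would produce NaN and should be dropped first.
    z = zscore(scores, axis=0, ddof=1)
    # Ward linkage in SciPy is defined for Euclidean distance.
    tree = linkage(z, method="ward", metric="euclidean")
    # Cut the dendrogram into at most n_clusters flat clusters (labels start at 1).
    return fcluster(tree, t=n_clusters, criterion="maxclust")
```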