2019
DOI: 10.1007/s41019-019-0091-y

Estimating the Optimal Number of Clusters k in a Dataset Using Data Depth

Abstract: This paper proposes a new method, called depth difference (DeD), for estimating the optimal number of clusters (k) in a dataset based on data depth. The DeD method estimates the k parameter before actual clustering is constructed. We define the depth within clusters, depth between clusters, and depth difference to finalize the optimal value of k, which is an input value for the clustering algorithm. The experimental comparison with the leading state-of-the-art alternatives demonstrates that the proposed DeD met…
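
As a rough illustration of what "data depth" means here, the following is a minimal sketch of one standard depth function, the Mahalanobis depth, restricted to one-dimensional data. The choice of depth function, the use of population variance, and the names mahalanobis_depth_1d and depth_median are assumptions made for illustration; the truncated abstract does not say which depth function DeD uses or how the within-cluster and between-cluster depths are combined, so none of that is shown.

```python
from statistics import mean, pvariance


def mahalanobis_depth_1d(x, sample):
    """Mahalanobis depth of x with respect to a 1-D sample.

    depth(x) = 1 / (1 + (x - mu)^2 / var), so points near the sample
    mean have depth close to 1 and outlying points have depth close to 0.
    Population variance is an assumption made for this sketch.
    """
    mu = mean(sample)
    var = pvariance(sample, mu)
    if var == 0:
        # Degenerate sample: every point coincides with the mean.
        return 1.0 if x == mu else 0.0
    return 1.0 / (1.0 + (x - mu) ** 2 / var)


def depth_median(sample):
    """Return the sample point with the largest depth (the deepest point)."""
    return max(sample, key=lambda x: mahalanobis_depth_1d(x, sample))


if __name__ == "__main__":
    # Hypothetical toy data with two visibly separated groups.
    data = [1.0, 1.2, 0.9, 1.1, 5.0, 5.2, 4.9, 5.1]
    for x in data:
        print(x, round(mahalanobis_depth_1d(x, data), 3))
    print("deepest point:", depth_median(data))
```

A depth-based estimator of k would build on quantities like these, for example by comparing depths computed inside candidate groups with depths computed over the whole dataset, but the exact definitions belong to the paper and are not reproduced here.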

Cited by 72 publications (28 citation statements)
References 16 publications
“…After collecting and transforming data, we used the method of selecting a number of clusters [26] to get the appropriate number of clusters K = 5. The obtained experiment on EFM results is illustrated in Table 4 with five clusters such as: C1, C2, C3, C4 and C5 below: From the clustering results in Table 4, we could see that the blended learning method was chosen by the majority of students in all different levels of training.…”
Section: Experiments On EFM Results (mentioning)
confidence: 99%
“…Determining the optimal number of clusters in a data set is a fundamental problem in partitioning clustering, such as K-means clustering, which requires the user to define the number of clusters K to be generated. The possible number of clusters is rather arbitrary and is determined by the method used for measuring similarities and the parameters used for partitioning [1]. There are many clustering algorithms used to group similar objects in many domains such as the Medical domain, Education domain, Governance domain, etc.…”
Section: Introduction (mentioning)
confidence: 99%
“…Another challenge to be addressed in future works is to improve the way the optimal number of superpixels is extracted. In this research, we selected the number of superpixels to minimize the intra-cluster distance, but more sophisticated methods can be used in the future to reduce the number of superpixels while retaining the most important information from each image [36]. In this research, we overestimated the number of superpixel centroids in order to ensure proper extraction of spectral features from the HS image.…”
Section: Discussion (mentioning)
confidence: 99%