2020
DOI: 10.1093/comnet/cnaa012

A density-based statistical analysis of graph clustering algorithm performance

Abstract: We introduce graph clustering quality measures based on comparisons of global, intra- and inter-cluster densities, an accompanying statistical significance test and a step-by-step routine for clustering quality assessment. Our work is centred on the idea that well-clustered graphs will display a mean intra-cluster density that is higher than global density and mean inter-cluster density. We do not rely on any generative model for the null model graph. Our measures are shown to meet the axioms of a good cluster…
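To make the density comparison in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes a simple undirected, unweighted graph and uses the usual edge-density definitions (edges present divided by edges possible); the function name `densities` and its argument layout are hypothetical, and the abstract's significance test is not reproduced here.

```python
from itertools import combinations


def densities(edges, clusters):
    """Return (global, mean intra-cluster, mean inter-cluster) edge densities.

    edges: iterable of (u, v) pairs of a simple undirected graph.
    clusters: dict mapping every vertex to a hashable cluster label.
    Assumed definitions: density = edges present / edges possible.
    """
    # Drop self-loops and treat (u, v) and (v, u) as the same edge.
    edge_set = {frozenset(e) for e in edges if e[0] != e[1]}

    members = {}
    for vertex, label in clusters.items():
        members.setdefault(label, set()).add(vertex)

    n = len(clusters)
    global_density = len(edge_set) / (n * (n - 1) / 2)

    # Count edges inside each cluster and between each pair of clusters.
    intra = {label: 0 for label in members}
    inter = {frozenset(pair): 0 for pair in combinations(members, 2)}
    for e in edge_set:
        u, v = tuple(e)
        cu, cv = clusters[u], clusters[v]
        if cu == cv:
            intra[cu] += 1
        else:
            inter[frozenset((cu, cv))] += 1

    # Singleton clusters have no possible internal edge and are skipped.
    intra_d = [intra[k] / (len(m) * (len(m) - 1) / 2)
               for k, m in members.items() if len(m) > 1]
    inter_d = [inter[frozenset((a, b))] / (len(members[a]) * len(members[b]))
               for a, b in combinations(members, 2)]

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return global_density, mean(intra_d), mean(inter_d)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
clusters = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
g, k_intra, k_inter = densities(edges, clusters)
```

On this toy graph the mean intra-cluster density exceeds both the global density and the mean inter-cluster density, which is the pattern the abstract associates with a well-clustered graph.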

Cited by 14 publications (22 citation statements)
References 47 publications
“…In a second step, we determined the relatedness of Level 1 responses by calculating a weighted Jaccard similarity between them (Ioffe, 2010), which has been found to perform well relative to other similarity measures in clustering approaches (e.g., Huang et al, 2008; Saad & Kamarudin, 2013; Strehl et al, 2000). In a third step, we represented the similarity matrix of Level 1 responses as a weighted network and extracted the components of risk using the Louvain modularity algorithm (Blondel et al, 2008), which has been found to compare favorably to other modularity and clustering algorithms (e.g., Emmons et al, 2016; Miasnikof et al, 2020; Pradana et al, 2020; Williams et al, 2019). One attractive feature of modularity detection algorithms, such as the Louvain algorithm, is that they also identify an optimal number of clusters.…”
Section: The Semantic Network Of Risk (mentioning)
confidence: 99%
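The weighted Jaccard similarity mentioned in this statement is commonly defined as the sum of element-wise minima divided by the sum of element-wise maxima. The sketch below assumes that definition and assumes non-negative weights stored in dictionaries; the citing paper does not spell out its formula in the quoted text, and the function name and example weights are hypothetical.

```python
def weighted_jaccard(x, y):
    """Weighted Jaccard similarity of two non-negative weight mappings.

    Assumed definition: sum_k min(x_k, y_k) / sum_k max(x_k, y_k),
    with missing keys treated as weight 0.
    """
    keys = set(x) | set(y)
    numerator = sum(min(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    denominator = sum(max(x.get(k, 0.0), y.get(k, 0.0)) for k in keys)
    # Two empty (all-zero) mappings are treated as having similarity 0.
    return numerator / denominator if denominator > 0 else 0.0


# Hypothetical word-count profiles of two responses.
a = {"loss": 2, "danger": 1}
b = {"loss": 1, "gamble": 1}
similarity = weighted_jaccard(a, b)  # minima sum to 1, maxima sum to 4
```

Computing this value for every pair of responses yields the kind of similarity matrix that the statement then treats as a weighted network.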
“…We then examine the relationship between mean Jaccard [10], Otsuka-Ochiai [17] and Burt's distances [2], on one hand, and intra-cluster density [14,13,15,16] within each cluster, on the other. Because these distances are pairwise measures, we compare their mean value for a given cluster to the cluster's internal density.…”
Section: Distance Measurements Under Study (mentioning)
confidence: 99%
“…At the cluster level, this distance takes the form of subsets of densely connected vertices. The link between clustering and density has been discussed in depth, recently [14,15,13,16]. In this article, our ultimate goal is to transform a graph's adjacency matrix into a |V| × |V| similarity or distance matrix D = [d_ij], where the distance between each pair of vertices is given by the element d_ij.…”
Section: Introduction (mentioning)
confidence: 99%

Graph Distances and Clustering

Miasnikof, Shestopaloff, Pitsoulis et al. 2020. Preprint. Self cite.
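As an illustration of turning an adjacency matrix into a |V| × |V| distance matrix D = [d_ij], here is a minimal sketch using the Jaccard distance between vertex neighbourhoods, one of the distances named in the statements above. It assumes a 0/1 symmetric adjacency matrix and closed neighbourhoods (each vertex counts as its own neighbour); the citing preprint may define neighbourhoods differently, and the function name is hypothetical.

```python
def jaccard_distance_matrix(adj):
    """Build D = [d_ij] from a 0/1 adjacency matrix given as a list of lists.

    d_ij = 1 - |N(i) & N(j)| / |N(i) | N(j)|, where N(i) is the closed
    neighbourhood of vertex i (an assumption made for this sketch).
    """
    n = len(adj)
    # Closed neighbourhoods keep every union non-empty, so no division by zero.
    nbrs = [{j for j in range(n) if adj[i][j]} | {i} for i in range(n)]

    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = len(nbrs[i] & nbrs[j])
            total = len(nbrs[i] | nbrs[j])
            D[i][j] = D[j][i] = 1.0 - shared / total
    return D


# Path graph 0 - 1 - 2: the end vertices are the farthest apart.
adj = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
D = jaccard_distance_matrix(adj)
```

Averaging the entries of D over the vertex pairs of one cluster gives a mean pairwise distance that can be set against that cluster's internal density.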
“…Our approach is more flexible and allows us to circumvent modularity's many shortcomings. These shortcomings have been well-documented in the literature (e.g., [11,1,29,30,31]). Furthermore, we choose to formulate our problem as a QUBO problem, in order to overcome computational intractability and benefit from new hardware developments.…”
Section: Introduction (mentioning)
confidence: 97%
“…Vertices that share more connections are defined as closer, more similar, to each other than to the ones with which they share fewer connections. Successful clustering results in vertices grouped into densely connected induced subgraphs (e.g., [29,30,31]). Figure 1 shows an example of a successful and an unsuccessful clustering.…”
Section: Introduction (mentioning)
confidence: 99%