Abstract-In this paper, we first discuss the definition of modularity (Q) used as a metric for community quality and then review the modularity maximization approaches used for community detection in the last decade. Then, we discuss two opposite yet coexisting problems of modularity optimization: in some cases, it tends to favor small communities over large ones, while in others, it favors large communities over small ones (the so-called resolution limit problem). Next, we overview several community quality metrics proposed to solve the resolution limit problem and discuss Modularity Density (Q_ds), which simultaneously avoids the two problems of modularity. Finally, we introduce two novel fine-tuned community detection algorithms that iteratively attempt to improve the community quality measurements by splitting and merging the given network community structure. The first of them, referred to as Fine-tuned Q, is based on modularity (Q), while the second one is based on Modularity Density (Q_ds) and is denoted as Fine-tuned Q_ds. Then, we compare the greedy algorithm of modularity maximization (denoted as Greedy Q), Fine-tuned Q, and Fine-tuned Q_ds on four real networks, and also on the classical clique network and the LFR benchmark networks, each of which is instantiated with a wide range of parameters. The results indicate that Fine-tuned Q_ds is the most effective of the three algorithms discussed. Moreover, we show that Fine-tuned Q_ds can be applied to the communities detected by other algorithms to significantly improve their results.
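For readers unfamiliar with the metric, the modularity (Q) mentioned above can be illustrated with a minimal sketch. It assumes the commonly used definition for an undirected, unweighted graph without self-loops, Q = sum over communities c of [ m_c / m - (d_c / 2m)^2 ], where m is the number of edges, m_c the number of edges inside c, and d_c the total degree of the nodes in c. The function name `modularity` and the input layout are illustrative choices, not taken from the paper.

```python
from collections import defaultdict


def modularity(edges, membership):
    """Modularity Q of a disjoint partition.

    edges: list of (u, v) pairs of an undirected, unweighted graph
           (assumed to contain no self-loops or duplicate edges).
    membership: dict mapping each node to its single community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph without edges")

    internal = defaultdict(int)    # m_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: summed degree of the nodes in c

    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "all" for node in range(6)}

q_split = modularity(edges, two_groups)
q_merged = modularity(edges, one_group)
```

In this toy graph, splitting at the bridge scores higher than placing all nodes in one community, which always yields Q = 0 under this definition.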
Biological functions are carried out by groups of interacting molecules, cells or tissues, known as communities. Membership in these communities may overlap when biological components are involved in multiple functions. However, traditional clustering methods detect non-overlapping communities. These detected communities may also be unstable and difficult to replicate, because traditional methods are sensitive to noise and parameter settings. These aspects of traditional clustering methods limit our ability to detect biological communities, and therefore our ability to understand biological functions. To address these limitations and detect robust overlapping biological communities, we propose an unorthodox clustering method called SpeakEasy which identifies communities using top-down and bottom-up approaches simultaneously. Specifically, nodes join communities based on their local connections, as well as global information about the network structure. This method can quantify the stability of each community, automatically identify the number of communities, and quickly cluster networks with hundreds of thousands of nodes. SpeakEasy shows top performance on synthetic clustering benchmarks and accurately identifies meaningful biological communities in a range of datasets, including: gene microarrays, protein interactions, sorted cell populations, electrophysiology and fMRI brain imaging.
Abstract-Modularity is widely used to effectively measure the strength of the disjoint community structure found by community detection algorithms. Although several overlapping extensions of modularity have been proposed to measure the quality of overlapping community structure, there is a lack of systematic comparison of the different extensions. To fill this gap, we overview overlapping extensions of modularity to select the best. In addition, we extend the Modularity Density metric to enable its usage for overlapping communities. The experimental results on four real networks using overlapping extensions of modularity, overlapping modularity density, and six other community quality metrics show that the best results are obtained when the product of the belonging coefficients of two nodes is used as the belonging function. Moreover, our experiments indicate that overlapping modularity density is a better measure of the quality of overlapping community structure than the other metrics considered.
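The product belonging function mentioned above can be illustrated with a small sketch. It assumes one common form of overlapping modularity for undirected, unweighted graphs, Q_ov = (1/2m) * sum over c and node pairs (i, j) of [A_ij - k_i k_j / 2m] * f(a_ic, a_jc), with f(a_ic, a_jc) = a_ic * a_jc; the abstract does not give the formula, so this form, the function name `overlapping_modularity`, and the input layout are assumptions made for illustration.

```python
from collections import defaultdict


def overlapping_modularity(edges, belonging):
    """Overlapping modularity with the product belonging function.

    edges: list of (u, v) pairs of an undirected, unweighted graph
           (assumed to contain no self-loops or duplicate edges).
    belonging: dict node -> {community: coefficient}; the coefficients of
               each node are assumed to be non-negative and to sum to 1.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("undefined for a graph without edges")

    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1

    communities = {c for coeffs in belonging.values() for c in coeffs}
    q = 0.0
    for c in communities:
        # Sum over edges of a_uc * a_vc (each edge appears twice in A_ij).
        edge_term = sum(belonging[u].get(c, 0.0) * belonging[v].get(c, 0.0)
                        for u, v in edges)
        # Sum over nodes of k_i * a_ic; its square gives the null-model term.
        weighted_degree = sum(degree[n] * belonging[n].get(c, 0.0)
                              for n in degree)
        q += (2.0 * edge_term - weighted_degree ** 2 / (2.0 * m)) / (2.0 * m)
    return q


# Two triangles joined by the bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]

crisp = {0: {"a": 1.0}, 1: {"a": 1.0}, 2: {"a": 1.0},
         3: {"b": 1.0}, 4: {"b": 1.0}, 5: {"b": 1.0}}
# Hypothetical soft assignment: node 2 belongs partly to both communities.
soft = dict(crisp)
soft[2] = {"a": 0.7, "b": 0.3}

q_crisp = overlapping_modularity(edges, crisp)
q_soft = overlapping_modularity(edges, soft)
```

With crisp (0 or 1) coefficients this form reduces to ordinary modularity, which is a convenient sanity check on any implementation.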
Abstract-Social networks consist of various communities that host members sharing common characteristics. Often some members of one community are also members of other communities. Such shared membership leads to overlapping communities. Detecting such overlapping communities is a challenging and computationally intensive problem. In this paper, we investigate the usability of high performance computing in the area of social networks and community detection. We present highly scalable variants of a community detection algorithm called the Speaker-listener Label Propagation Algorithm (SLPA). We show that, despite irregular data dependencies in the computation, parallel computing paradigms can significantly speed up the computationally expensive detection of overlapping communities in social networks. We show experimentally how various parallel computing architectures can be utilized to analyze large social network data on both shared memory machines and distributed memory machines, such as IBM Blue Gene.
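As background, the sequential speaker-listener idea behind SLPA can be sketched as follows. The abstract does not describe the algorithm's steps, so this is a simplified single-threaded illustration based on the general published description of SLPA (label memories, random speaking, most-popular listening, threshold post-processing); it is not the parallel variant the paper presents, and the function name `slpa`, its parameters, and the default values are illustrative assumptions.

```python
import random
from collections import Counter


def slpa(adjacency, iterations=20, threshold=0.1, seed=0):
    """Simplified sequential speaker-listener label propagation.

    adjacency: dict node -> list of neighbor nodes (undirected graph);
               node labels are assumed to be mutually comparable (e.g. ints).
    Returns a dict label -> set of nodes; a node may appear under several
    labels, which is how overlapping communities are expressed.
    """
    rng = random.Random(seed)
    # Every node starts with a memory that holds only its own label.
    memory = {node: [node] for node in adjacency}
    nodes = list(adjacency)

    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            neighbors = adjacency[listener]
            if not neighbors:
                continue
            # Each neighbor "speaks" one label drawn from its memory, so
            # frequent labels are spoken with proportionally higher chance.
            spoken = Counter(rng.choice(memory[s]) for s in neighbors)
            top = max(spoken.values())
            winners = sorted(l for l, n in spoken.items() if n == top)
            # The listener keeps the most popular label (ties broken randomly).
            memory[listener].append(rng.choice(winners))

    # Post-processing: keep labels whose relative frequency in a node's
    # memory reaches the threshold.
    communities = {}
    for node, labels in memory.items():
        total = len(labels)
        for label, count in Counter(labels).items():
            if count / total >= threshold:
                communities.setdefault(label, set()).add(node)
    return communities


# Two triangles joined by the bridge edge (2, 3).
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
         3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
result = slpa(graph, iterations=20, threshold=0.1, seed=1)
```

The per-listener loop reads the memories of neighboring nodes, which is one source of the irregular data dependencies that make parallelizing this kind of algorithm non-trivial.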
Modularity maximization is one of the state-of-the-art methods for community detection that has gained popularity in the last decade. Yet it suffers from the resolution limit problem: under certain conditions, it prefers large communities over small ones. To solve this problem, we propose to expand the meaning of the edges that are currently used to indicate the propensity of nodes for sharing the same community. In our approach, this is the role of edges with positive weights, while edges with negative weights indicate aversion to putting their end-nodes into one community. We also present a novel regression model which assigns weights to the edges of a graph according to their local topological features to enhance the accuracy of modularity maximization algorithms. We construct artificial graphs based on the parameters sampled from a given unweighted network and train the regression model on ground truth communities of these artificial graphs in a supervised fashion. The extraction of local topological edge features can be done in linear time, making this process efficient. Experimental results on real and synthetic networks show that the state-of-the-art community detection algorithms improve their performance significantly by finding communities in the weighted graphs produced by our model.