Clustering a graph, i.e., assigning its nodes to groups, is an important operation whose best-known application is the discovery of communities in social networks. Graph clustering and community detection have traditionally focused on graphs without attributes, with the notable exception of edge weights. However, these models provide only a partial representation of real social systems, which are therefore often described using node attributes, representing features of the actors, and edge attributes, representing different kinds of relationships among them. We refer to these richer models as attributed graphs. Consequently, existing graph clustering methods have recently been extended to deal with node and edge attributes. This article is a literature survey on this topic, organizing and presenting recent research results in a uniform way, characterizing the main existing clustering methods and highlighting their conceptual differences. We also cover the important topic of clustering evaluation and identify current open problems.
Discovering community structure in complex networks is a mature field, and a tremendous number of community detection methods have been introduced in the literature. Nevertheless, it is still very challenging for practitioners to determine which method is suitable for gaining insight into the structure of the networks they study. Many recent efforts have been devoted to investigating various quality scores of the community structure, but the problem of distinguishing between different types of communities is still open. In this paper, we propose an extensive comparative empirical study to investigate what types of communities many state-of-the-art and well-known community detection methods produce. Specifically, we provide comprehensive analyses of computation time and community size distribution, a comparative evaluation of methods according to their optimization schemes, and a comparison of their partitioning strategies through validation metrics. We run our analyses on a very large corpus of hundreds of networks from five different network categories and propose ways to classify community detection methods, helping a potential user to navigate the complex landscape of community detection.
Social network analysis has become a major subject in recent times, and it also brings several challenges to the computer science field. One aspect of social network analysis is the community detection problem, which is seen as a graph clustering problem. However, social networks are more than graphs: they carry a considerable amount of information derived from their social aspect, such as profile information, content sharing and annotations, among others. Most community detection algorithms use only the structure of the network, i.e., the graph. In this paper we propose a new method which uses this semantic information along with the network structure in the community detection process. To do so, our method combines an algorithm for optimizing modularity with an entropy-based data clustering algorithm, which tries to find a partition with low entropy while taking modularity into account.
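The two quantities that this last abstract trades off can be illustrated with a minimal sketch. It is not the authors' algorithm: it only evaluates a given partition, using the standard Newman modularity for an undirected, unweighted graph and, as an assumption, a size-weighted Shannon entropy over a single categorical node attribute. The function names `modularity` and `attribute_entropy`, the toy graph, and the attribute values are all hypothetical.

```python
from collections import Counter, defaultdict
from math import log2


def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs; community: dict mapping node -> community id.
    """
    m = len(edges)
    internal = defaultdict(int)    # number of edges inside each community
    degree_sum = defaultdict(int)  # sum of node degrees in each community
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


def attribute_entropy(attributes, community):
    """Size-weighted average Shannon entropy (bits) of one categorical
    attribute within each community. This weighting is an assumption."""
    members = defaultdict(list)
    for node, c in community.items():
        members[c].append(attributes[node])
    n = len(community)
    total = 0.0
    for values in members.values():
        counts = Counter(values)
        h = -sum((k / len(values)) * log2(k / len(values))
                 for k in counts.values())
        total += len(values) / n * h
    return total


# Toy data (hypothetical): two triangles joined by a single edge,
# with one categorical attribute per node.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
attributes = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}

by_triangle = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
mixed = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}

for name, part in [("by_triangle", by_triangle), ("mixed", mixed)]:
    print(name, modularity(edges, part), attribute_entropy(attributes, part))
```

On this toy graph, the partition that follows the two triangles has both higher modularity and lower attribute entropy than the mixed one, which is the kind of partition a combined structure-and-attribute method would be expected to prefer.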