Stochastic blockmodels have been proposed as a tool for detecting community structure in networks as well as for generating synthetic networks for use as benchmarks. Most blockmodels, however, ignore variation in vertex degree, making them unsuitable for applications to real-world networks, which typically display broad degree distributions that can significantly distort the results. Here we demonstrate how the generalization of blockmodels to incorporate this missing element leads to an improved objective function for community detection in complex networks. We also propose a heuristic algorithm for community detection using this objective function or its non-degree-corrected counterpart and show that the degree-corrected version dramatically outperforms the uncorrected one in both real-world and synthetic networks.
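The abstract does not reproduce the objective functions themselves. As a rough illustration, the sketch below assumes the profile log-likelihood forms usually associated with Poisson blockmodels: the sum over group pairs of m_rs log(m_rs / (n_r n_s)) for the uncorrected model and m_rs log(m_rs / (kappa_r kappa_s)) for the degree-corrected one. Here m_rs counts edges between groups r and s (twice the edge count when r = s), n_r is the number of vertices in group r, and kappa_r is the sum of their degrees. The function name and input format are hypothetical.

```python
from collections import defaultdict
from math import log


def blockmodel_scores(edges, groups):
    """Return (uncorrected, degree_corrected) scores for a group assignment.

    edges  : list of undirected edges (i, j) with i != j
    groups : dict mapping every vertex to a group label
    Both formulas are assumed forms, not taken from the abstract.
    """
    m = defaultdict(float)      # m[r, s]: edges between groups r and s
    kappa = defaultdict(float)  # kappa[r]: total degree of group r
    n = defaultdict(int)        # n[r]: number of vertices in group r

    for v in groups:
        n[groups[v]] += 1
    for i, j in edges:
        r, s = groups[i], groups[j]
        m[r, s] += 1
        m[s, r] += 1            # so m[r, r] is twice the within-group edge count
        kappa[r] += 1
        kappa[s] += 1

    # Only group pairs with at least one edge appear in m, so every log is finite.
    uncorrected = sum(x * log(x / (n[r] * n[s])) for (r, s), x in m.items())
    corrected = sum(x * log(x / (kappa[r] * kappa[s])) for (r, s), x in m.items())
    return uncorrected, corrected
```

A heuristic search would evaluate one of these scores for many candidate assignments and keep the highest-scoring one; the search itself is not sketched here.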
A fundamental problem in the analysis of network data is the detection of network communities, groups of densely interconnected nodes, which may be overlapping or disjoint. Here we describe a method for finding overlapping communities based on a principled statistical approach using generative network models. We show how the method can be implemented using a fast, closed-form expectation-maximization algorithm that allows us to analyze networks of millions of nodes in reasonable running times. We test the method both on real-world networks and on synthetic benchmarks and find that it gives results competitive with previous methods. We also show that the same approach can be used to extract nonoverlapping community divisions via a relaxation method, and demonstrate that the algorithm is competitively fast and accurate for the nonoverlapping problem.
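The abstract does not give the update equations, so the following is only a sketch of what a closed-form expectation-maximization step can look like. It assumes a Poisson model in which the expected number of edges of "color" z between vertices i and j is theta[i][z] * theta[j][z]; the model, the function name, and the parameters K, iters, and seed are all assumptions made for illustration.

```python
import random
from math import sqrt


def em_overlapping(edges, n, K, iters=100, seed=0):
    """EM sketch for an assumed Poisson link-community model.

    edges : list of undirected edges (i, j) over vertices 0..n-1
    K     : number of communities (assumed known in advance)
    Returns theta, where theta[i][z] measures how strongly i belongs to z.
    """
    rng = random.Random(seed)
    theta = [[rng.random() + 0.1 for _ in range(K)] for _ in range(n)]
    for _ in range(iters):
        k = [[0.0] * K for _ in range(n)]
        for i, j in edges:
            # E-step: share of edge (i, j) attributed to each community
            w = [theta[i][z] * theta[j][z] for z in range(K)]
            total = sum(w)
            if total == 0.0:
                continue
            for z in range(K):
                q = w[z] / total
                k[i][z] += q
                k[j][z] += q
        # M-step: closed-form update of theta from the expected edge counts
        kappa = [sum(k[i][z] for i in range(n)) for z in range(K)]
        theta = [[k[i][z] / sqrt(kappa[z]) if kappa[z] > 0 else 0.0
                  for z in range(K)] for i in range(n)]
    return theta
```

Each iteration costs time proportional to the number of edges times K, which is consistent with the abstract's claim that very large networks are tractable, although this pure-Python sketch is not optimized for that scale.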
In most models of the spread of disease over contact networks it is assumed that the probabilities per unit time of disease transmission and recovery from disease are constant, implying exponential distributions of the time intervals for transmission and recovery. Time intervals for real diseases, however, have distributions that in most cases are far from exponential, which leads to disagreements, both qualitative and quantitative, with the models. In this paper, we study a generalized version of the susceptible-infected-recovered model of epidemic disease that allows for arbitrary distributions of transmission and recovery times. Standard differential equation approaches cannot be used for this generalized model, but we show that the problem can be reformulated as a time-dependent message passing calculation on the appropriate contact network. The calculation is exact on trees (i.e., loopless networks) or locally treelike networks (such as random graphs) in the large system size limit. On non-tree-like networks we show that the calculation gives a rigorous bound on the size of disease outbreaks. We demonstrate the method with applications to two specific models and the results compare favorably with numerical simulations.
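To make the role of the time distributions concrete, the sketch below computes one simple quantity under stated assumptions: the probability that an infected node transmits to a given neighbor before it recovers. It assumes transmission and recovery times are independent, takes the transmission-time density s(tau) and the recovery survival function R(tau) as inputs, and integrates s(tau) R(tau) numerically. The function name and the midpoint-rule integration are illustrative choices and are not the message passing calculation described in the abstract.

```python
from math import exp


def transmissibility(s_pdf, r_survival, t_max, steps=100_000):
    """Midpoint-rule estimate of the integral of s(tau) * R(tau) on [0, t_max].

    s_pdf      : density of the time to transmission along one edge
    r_survival : probability that the node is still infected at time tau
    t_max      : upper cutoff, assumed large enough that the tail is negligible
    """
    h = t_max / steps
    total = 0.0
    for k in range(steps):
        tau = (k + 0.5) * h
        total += s_pdf(tau) * r_survival(tau)
    return total * h


# Constant rates (exponential distributions) with beta = 2, gamma = 1
exponential_case = transmissibility(lambda t: 2.0 * exp(-2.0 * t),
                                    lambda t: exp(-t), t_max=40.0)

# Same transmission rate, but every node recovers after exactly one time unit
fixed_recovery_case = transmissibility(lambda t: 2.0 * exp(-2.0 * t),
                                       lambda t: 1.0 if t < 1.0 else 0.0,
                                       t_max=40.0)
```

With constant rates the integral reduces to beta / (beta + gamma); with a fixed recovery time it becomes 1 - exp(-beta), so the two cases give different transmission probabilities even though the mean recovery time is the same.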
We study percolation on networks, which is used as a model of the resilience of networked systems such as the Internet to attack or failure and as a simple model of the spread of disease over human contact networks. We reformulate percolation as a message passing process and demonstrate how the resulting equations can be used to calculate, among other things, the size of the percolating cluster and the average cluster size. The calculations are exact for sparse networks when the number of short loops in the network is small, but even on networks with many short loops we find them to be highly accurate when compared with direct numerical simulations. By considering the fixed points of the message passing process, we also show that the percolation threshold on a network with few loops is given by the inverse of the leading eigenvalue of the so-called non-backtracking matrix.
Percolation, the random occupation of sites or bonds on a lattice or network with independent probability p, is one of the best-studied processes in statistical physics. It is used as a model of porous media [1,2], granular and composite materials [3-6], resistor networks [7], forest fires [8], and many other systems of scientific interest. In this paper we study the bond (or edge) percolation process on general networks or graphs, which is used to model the spread of disease [9,10] and network robustness [11-13] in social and technological networks, among other things. Although percolation has been studied extensively on simple model networks such as random graphs [11,12,14,15], there are few analytic results for real-world networks, whose structure is typically more complicated. We show that percolation properties of networks can be calculated using a message passing technique, leading to a range of new results.
In particular, we derive equations for the size of the percolating cluster and the average size of non-percolating clusters, which can be solved rapidly by numerical iteration given the structure of a network and the value of p. By expanding the message passing equations about the critical point we also derive an expression for the position of the percolation threshold, showing that the critical value of p is given by the inverse of the leading eigenvalue of the so-called non-backtracking matrix [16,17], an edge-based matrix representation of network structure that has found recent use in studies of community detection and centrality in networks [17,18]. The quantities we calculate are averages over all possible realizations of the randomness inherent in the percolation process, rather than over a single realization, obviating the need for a separate average over realizations as is typically required in direct numerical simulations.
We focus in particular on sparse networks, those for which only a small fraction of possible edges are present, which includes most real-world networks. Our results are exact for large, sparse networks that contain a vanishing density of short loops, but even for networks that do contain loops, ...
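The excerpt does not show the message passing equations, so the sketch below assumes a standard form for bond percolation: u[(i, j)], the probability that node i is not joined to the percolating cluster through neighbor j, satisfies u[(i, j)] = 1 - p + p * (product of u[(j, k)] over neighbors k of j other than i). The threshold estimate follows the statement above that the critical p is the inverse of the leading eigenvalue of the non-backtracking matrix B, found here by power iteration on B + I. All function names, the starting values, and the iteration counts are illustrative assumptions, and the threshold routine assumes the network contains at least one cycle.

```python
from math import sqrt


def neighbours(edges):
    """Neighbor sets built from a list of undirected edges (i, j)."""
    nbrs = {}
    for i, j in edges:
        nbrs.setdefault(i, set()).add(j)
        nbrs.setdefault(j, set()).add(i)
    return nbrs


def giant_cluster_fraction(edges, p, iters=500):
    """Iterate the assumed message passing equations; return the mean
    probability that a node belongs to the percolating cluster."""
    nbrs = neighbours(edges)
    u = {(i, j): 0.5 for i in nbrs for j in nbrs[i]}
    for _ in range(iters):
        new = {}
        for (i, j) in u:
            prod = 1.0
            for k in nbrs[j]:
                if k != i:
                    prod *= u[(j, k)]
            new[(i, j)] = 1.0 - p + p * prod
        u = new
    total = 0.0
    for i in nbrs:
        prod = 1.0
        for j in nbrs[i]:
            prod *= u[(i, j)]
        total += 1.0 - prod
    return total / len(nbrs)


def percolation_threshold(edges, iters=1000):
    """Estimate 1 / lambda, with lambda the leading eigenvalue of the
    non-backtracking matrix B, by power iteration on B + I."""
    nbrs = neighbours(edges)
    size = sum(len(s) for s in nbrs.values())   # number of directed edges
    v = {(i, j): 1.0 / sqrt(size) for i in nbrs for j in nbrs[i]}
    lam = 0.0
    for _ in range(iters):
        # w = (B + I) v, where B links directed edge i->j to j->k for k != i
        w = {(i, j): v[(i, j)] + sum(v[(j, k)] for k in nbrs[j] if k != i)
             for (i, j) in v}
        norm = sqrt(sum(x * x for x in w.values()))
        lam = norm - 1.0                        # v has unit norm, so norm estimates lambda + 1
        v = {e: x / norm for e, x in w.items()}
    return 1.0 / lam
```

The shift by the identity is a design choice to avoid oscillation on bipartite networks, where plain power iteration on B may not settle. The sketch stores messages in dictionaries for readability; a sparse-matrix implementation would be preferable for large networks.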
The discovery of community structure is a common challenge in the analysis of network data. Many methods have been proposed for finding community structure, but few have been proposed for determining whether the structure found is statistically significant or whether, conversely, it could have arisen purely as a result of chance. In this paper we show that the significance of community structure can be effectively quantified by measuring its robustness to small perturbations in network structure. We propose a suitable method for perturbing networks and a measure of the resulting change in community structure and use them to assess the significance of community structure in a variety of networks, both real and computer generated.
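The abstract does not name the measure used to compare community structure before and after perturbation. One common choice for comparing two partitions of the same vertex set is the variation of information, and the sketch below uses it purely as an assumption; the perturbation procedure itself is not reproduced here.

```python
from collections import Counter
from math import log


def variation_of_information(part_a, part_b):
    """Variation of information (in nats) between two partitions.

    part_a, part_b : sequences of group labels, one entry per vertex,
                     listed in the same vertex order.
    Returns H(A|B) + H(B|A), which is zero when the partitions coincide.
    """
    n = len(part_a)
    if n != len(part_b):
        raise ValueError("partitions must cover the same vertices")
    count_a = Counter(part_a)
    count_b = Counter(part_b)
    joint = Counter(zip(part_a, part_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        vi -= p_ab * (log(p_ab / (count_a[a] / n)) + log(p_ab / (count_b[b] / n)))
    return vi
```

In a robustness test of the kind described, one would detect communities in the original network and in perturbed copies, then track how this distance grows with the perturbation strength, comparing against the same curve for a suitably randomized network.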