Community detection is a fundamental problem in network analysis, with applications in many diverse areas. The stochastic block model is a common tool for model-based community detection, and asymptotic tools for checking consistency of community detection under the block model have been recently developed. However, the block model is limited by its assumption that all nodes within a community are stochastically equivalent, and provides a poor fit to networks with hubs or highly varying node degrees within communities, which are common in practice. The degree-corrected stochastic block model was proposed to address this shortcoming and allows variation in node degrees within a community while preserving the overall block community structure. In this paper we establish general theory for checking consistency of community detection under the degree-corrected stochastic block model and compare several community detection criteria under both the standard and the degree-corrected models. We show which criteria are consistent under which models and constraints, as well as compare their relative performance in practice. We find that methods based on the degree-corrected block model, which includes the standard block model as a special case, are consistent under a wider class of models and that modularity-type methods require parameter constraints for consistency, whereas likelihood-based methods do not. On the other hand, in practice, the degree correction involves estimating many more parameters, and empirically we find it is only worth doing if the node degrees within communities are indeed highly variable. We illustrate the methods on simulated networks and on a network of political blogs.

This note corrects an error in two related proofs of consistency of community detection: under stochastic block models by Bickel and Chen [Proc. Natl. Acad. Sci. USA 106 (2009) 21068-21073] and under the degree-corrected stochastic block model by Zhao, Levina and Zhu [Ann. Statist. 40 (2012) 2266-2292].
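To make the difference between the two models concrete, the following is a minimal sketch of sampling a network from a degree-corrected block model. It assumes independent Bernoulli edges with probability theta[i] * theta[j] * P[c_i][c_j], capped at 1. The function name `sample_dcsbm`, its parameters, and the cap are illustrative assumptions, not taken from the papers above.

```python
import random


def sample_dcsbm(labels, theta, P, seed=0):
    """Sample a symmetric 0/1 adjacency matrix without self-loops.

    labels[i] : community index of node i
    theta[i]  : degree parameter of node i (hypothetical name)
    P[a][b]   : base connection probability between communities a and b

    Assumption: edges are independent Bernoulli draws with probability
    min(1, theta[i] * theta[j] * P[labels[i]][labels[j]]).
    """
    rng = random.Random(seed)
    n = len(labels)
    A = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            p = min(1.0, theta[i] * theta[j] * P[labels[i]][labels[j]])
            if rng.random() < p:
                # Undirected network: fill both triangles.
                A[i][j] = A[j][i] = 1
    return A


if __name__ == "__main__":
    labels = [0, 0, 0, 1, 1, 1]
    theta = [1.5, 1.0, 0.5, 1.5, 1.0, 0.5]  # one hub-like node per community
    P = [[0.6, 0.1], [0.1, 0.6]]
    for row in sample_dcsbm(labels, theta, P, seed=42):
        print(row)
```

Setting every entry of `theta` to 1 reduces the sketch to the standard block model, which matches the statement that the degree-corrected model includes the standard one as a special case.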
Analysis of networks and in particular discovering communities within networks has been a focus of recent work in several fields and has diverse applications. Most community detection methods focus on partitioning the entire network into communities, with the expectation of many ties within communities and few ties between. However, many networks contain nodes that do not fit in with any of the communities, and forcing every node into a community can distort results. Here we propose a new framework that extracts one community at a time, allowing for arbitrary structure in the remainder of the network, which can include weakly connected nodes. The main idea is that the strength of a community should depend on ties between its members and ties to the outside world, but not on ties between nonmembers. The proposed extraction criterion has a natural probabilistic interpretation in a wide class of models and performs well on simulated and real networks. For the case of the block model, we establish asymptotic consistency of estimated node labels and propose a hypothesis test for determining the number of communities.

Understanding and modeling network structures have been a focus of attention in a number of diverse fields, including physics, biology, computer science, statistics, and social sciences. Applications of network analysis include friendship and social networks, marketing and recommender systems, the World Wide Web, disease models, and food webs, among others. A fundamental problem in the study of networks is community detection (see refs. 1-3 for comprehensive recent reviews). We focus here on undirected networks N = (V, E), where V is the set of nodes and E is the set of edges, possibly weighted. The community detection problem is typically formulated as finding the partition V = V_1 ∪ … ∪ V_K, which gives the "best" communities in some suitable sense. The node sets V_1, …, V_K are usually taken to be disjoint, although there is some recent work on detecting overlapping communities (4, 5).

The extensive physics and computer science literature on networks typically thinks of communities as tightly knit groups with many connections between the group members and relatively few connections between groups. Thus detection methods focus on maximizing links within communities while minimizing links between communities. This can be achieved either implicitly through an algorithmic approach (6) or explicitly by optimizing a criterion that measures the quality of a proposed partition over all possible partitions. These criteria include ratio cuts (7), normalized cuts (8), spectral clustering (9), and modularity (10); see ref. 3 for a review. All of these are symmetric criteria, in the sense that all potential communities play the same role. There are many examples of networks where such a requirement makes sense, for example, the college football games network (11), and yet some commonly studied networks clearly do not fit this framework. One such example is when there are nodes without strong connections to any commun...
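As one concrete instance of a criterion that scores a proposed partition, the sketch below computes the standard Newman-Girvan modularity, Q = (1/2m) Σ_ij (A_ij − d_i d_j / 2m) δ(c_i, c_j), for an undirected network given as an adjacency matrix. The function name `modularity` and the list-of-lists representation are illustrative choices, and the brute-force double loop is meant for small examples only.

```python
def modularity(A, labels):
    """Newman-Girvan modularity of a partition of an undirected network.

    A      : symmetric adjacency matrix as a list of lists (weights allowed)
    labels : labels[i] is the community of node i
    """
    n = len(A)
    degree = [sum(row) for row in A]
    two_m = sum(degree)  # twice the total edge weight
    if two_m == 0:
        return 0.0  # convention chosen here for an empty network
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:
                # Observed weight minus the expectation under random
                # rewiring that preserves node degrees.
                q += A[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


if __name__ == "__main__":
    # Two disjoint edges: 0-1 and 2-3.
    A = [
        [0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
    ]
    print(modularity(A, [0, 0, 1, 1]))  # each edge is its own community
    print(modularity(A, [0, 0, 0, 0]))  # everything in one community
```

The criterion is symmetric in the sense used above: relabeling the communities leaves the score unchanged, and every node must be assigned to some community.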
Although using anammox communities for efficient wastewater treatment has attracted much attention, pure anammox bacteria are difficult to obtain, and the potential roles of symbiotic bacteria in anammox performance are still elusive. Here, we combined long-term reactor operation, genome-centered metagenomics, community functional structure, and metabolic pathway reconstruction to reveal multiple potential cross-feedings during anammox reactor start-up, based on the 37 recovered metagenome-assembled genomes (MAGs). We found that Armatimonadetes and Proteobacteria could contribute the secondary metabolites molybdopterin cofactor and folate to anammox bacteria, benefiting their activity and growth. Chloroflexi-affiliated bacteria encoded the function of biosynthesizing exopolysaccharides for anammox consortium aggregation, based on the partial nucleotide sugars produced by anammox bacteria. Chlorobi-affiliated bacteria had the ability to degrade extracellular proteins produced by anammox bacteria to amino acids, affecting consortium aggregation. Additionally, the Chloroflexi-affiliated bacteria harbored genes for a nitrite loop and could have a dual role in anammox performance during reactor start-up. Cross-feeding in the anammox community adds a different dimension for understanding microbial interactions and emphasizes the importance of symbiotic bacteria in the anammox process for wastewater treatment.