Ryan Gibson scite author profile

We introduce the Convex Hull of Admissible Modularity Partitions (CHAMP) algorithm to prune and prioritize different network community structures identified across multiple runs of possibly various computational heuristics. Given a set of partitions, CHAMP identifies the domain of modularity optimization for each partition—i.e., the parameter-space domain where it has the largest modularity relative to the input set—discarding partitions with empty domains to obtain the subset of partitions that are “admissible” candidate community structures that remain potentially optimal over indicated parameter domains. Importantly, CHAMP can be used for multi-dimensional parameter spaces, such as those for multilayer networks where one includes a resolution parameter and interlayer coupling. Using the results from CHAMP, a user can more appropriately select robust community structures by observing the sizes of domains of optimization and the pairwise comparisons between partitions in the admissible subset. We demonstrate the utility of CHAMP with several example networks. In these examples, CHAMP focuses attention onto pruned subsets of admissible partitions that are 20-to-1785 times smaller than the sets of unique partitions obtained by community detection heuristics that were input into CHAMP.
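
To make the idea of a "domain of modularity optimization" concrete, here is a minimal one-dimensional sketch, not the authors' implementation. It assumes single-layer modularity with one resolution parameter, so each partition's modularity is a line Q(gamma) = a - gamma * p, and a partition's domain is where its line is the highest of the set. The function name `champ_1d`, the coefficient pairs, and the gamma range are all hypothetical.

```python
def champ_1d(coeffs, gamma_min=0.0, gamma_max=2.0):
    """Return {partition index: (lo, hi)} for partitions that are best somewhere.

    coeffs[i] = (a_i, p_i) so that Q_i(gamma) = a_i - gamma * p_i (assumed form).
    """
    def q(i, gamma):
        a, p = coeffs[i]
        return a - gamma * p

    # Best partition at gamma_min; ties go to the smaller p, which wins as gamma grows.
    current = max(range(len(coeffs)), key=lambda i: (q(i, gamma_min), -coeffs[i][1]))
    lo = gamma_min
    domains = {}
    while True:
        a_c, p_c = coeffs[current]
        best = None
        for j, (a_j, p_j) in enumerate(coeffs):
            if p_j < p_c:  # only a flatter line can overtake the current one
                cross = (a_c - a_j) / (p_c - p_j)
                key = (cross, p_j)
                if cross < gamma_max and (best is None or key < best[0]):
                    best = (key, j)
        if best is None:
            domains[current] = (lo, gamma_max)
            return domains
        cross = best[0][0]
        domains[current] = (lo, cross)
        current, lo = best[1], cross


# Hypothetical (a, p) coefficients for four input partitions.
coeffs = [(0.9, 0.8), (0.6, 0.3), (0.5, 0.5), (0.2, 0.05)]
domains = champ_1d(coeffs)
for index, (lo, hi) in sorted(domains.items()):
    print(f"partition {index}: gamma in [{lo:.3f}, {hi:.3f})")
```

In this toy input, partition 2 never has the largest modularity on the interval, so it receives no domain and is pruned. The multi-dimensional case described in the abstract replaces lines with planes and needs a halfspace intersection rather than this simple sweep.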

Concurrency and reachability in treelike temporal networks

Lee, Emmons, Gibson, et al. 2019. Phys. Rev. E

Network properties govern the rate and extent of various spreading processes, from simple contagions to complex cascades. Recently, the analysis of spreading processes has been extended from static networks to temporal networks, where nodes and links appear and disappear. We focus on the effects of "accessibility", whether there is a temporally consistent path from one node to another, and "reachability", the density of the corresponding "accessibility graph" representation of the temporal network. The level of reachability thus inherently limits the possible extent of any spreading process on the temporal network. We study reachability in terms of the overall levels of temporal concurrency between edges and the structural cohesion of the network agglomerating over all edges. We use simulation results and develop heterogeneous mean field model predictions for random networks to better quantify how the properties of the underlying temporal network regulate reachability.
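
As a rough illustration of accessibility and reachability, the sketch below makes simplifying assumptions that the abstract does not state: contacts are instantaneous and undirected, given as hypothetical `(u, v, t)` triples, and a temporally consistent path must use strictly increasing contact times. Reachability is then taken as the fraction of ordered node pairs joined by such a path, which is the density of the accessibility graph.

```python
def accessible_pairs(nodes, contacts):
    """Ordered pairs (source, target) joined by a time-respecting path."""
    ordered = sorted(contacts, key=lambda c: c[2])
    pairs = set()
    for source in nodes:
        # Earliest time at which something starting at `source` can be at each node.
        arrival = {n: float("inf") for n in nodes}
        arrival[source] = float("-inf")
        for u, v, t in ordered:
            # Evaluate both directions before updating, so one contact is used once.
            u_ready = arrival[u] < t
            v_ready = arrival[v] < t
            if u_ready:
                arrival[v] = min(arrival[v], t)
            if v_ready:
                arrival[u] = min(arrival[u], t)
        pairs.update(
            (source, n) for n in nodes
            if n != source and arrival[n] < float("inf")
        )
    return pairs


def reachability(nodes, contacts):
    """Density of the accessibility graph over ordered pairs of distinct nodes."""
    n = len(nodes)
    return len(accessible_pairs(nodes, contacts)) / (n * (n - 1))


# Hypothetical three-node chain: 0-1 meet at time 1, then 1-2 meet at time 2.
nodes = [0, 1, 2]
contacts = [(0, 1, 1), (1, 2, 2)]
pairs = accessible_pairs(nodes, contacts)
density = reachability(nodes, contacts)
print(sorted(pairs), density)
```

Here node 2 cannot reach node 0, because the only contact leading onward from node 1 to node 0 happened before node 2 met node 1. The paper itself treats edges with temporal extent and concurrency, which this sketch does not model.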

FastPG: Fast clustering of millions of single cells

Bodenheimer, Halappanavar, Jefferys, et al. 2020. Preprint

Current single-cell experiments can produce datasets with millions of cells. Unsupervised clustering can be used to identify cell populations in single-cell analysis but often leads to interminable computation time at this scale. This problem has previously been mitigated by subsampling cells, which greatly reduces accuracy. We built on the graph-based algorithm PhenoGraph and developed FastPG, which has the same cell assignment accuracy but is on average 27x faster in our tests. FastPG also outperforms two other fast clustering methods, FlowSOM and PARC.

Mass cytometry measures protein abundances at the single-cell level. This technology can measure 100,000-200,000 cells per sample and up to around 40 protein markers simultaneously [1]. Unsupervised clustering is a common task in single-cell data analysis, with the goal of identifying known and unknown cell types from protein markers in an unbiased way. Clustering cells across multiple samples can be optimal for some experiments to gain a deeper understanding of how a particular cell population changes in a disease state. Most clustering algorithms cannot efficiently handle a very large number of collected cells. These algorithms are severely restricted by computational time and available system memory. Common approaches to mitigate these issues include subsampling and meta-clustering [2-4]. The disadvantage of both is loss of information and lower potential to discover rare cell types [4].

A commonly used and robust method for mass cytometry data is PhenoGraph [5], which uses a graph-based approach to identify clusters, or cell types, without bias. Here we present FastPG (Figure 1), a modified version of PhenoGraph developed to maximize computational efficiency with no loss of cell assignment accuracy.

PhenoGraph is composed of three main steps: (1) creating a k-nearest neighbor (kNN) network of single cells using a distance measurement calculated from their protein marker abundances, (2) adding weights to the network by calculating the Jaccard index, and (3) partitioning cells into coherent cell populations using the Louvain algorithm [6]. We modified PhenoGraph as follows. First, we replaced the kNN step with a fast kNN approximation, Hierarchical Navigable Small World (HNSW), which scales logarithmically due to the hierarchical structure of the search space [7]. Next, we parallelized the Jaccard index step for multithreaded execution. Lastly, we replaced the Louvain algorithm with a fast parallelized version, Grappolo [8].

To test for cell assignment accuracy, we benchmarked FastPG against PhenoGraph, PARC [4], and FlowSOM [2]. We compared FastPG to PhenoGraph to ensure our method achieved similar accuracy, and to PARC and FlowSOM because of the published speed of these algorithms. Briefly, PARC's first step employs HNSW, the same kNN approximation as FastPG. PARC next uses a graph-pruning method followed by Leiden [9] for community detection.
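
The first two steps of the pipeline described above can be sketched in a few lines. This is an illustration only, not FastPG or PhenoGraph code: an exact brute-force kNN search stands in for HNSW, neighbor sets are assumed to exclude the cell itself, the community detection step is omitted, and the function names and toy marker values are hypothetical.

```python
import math


def knn(points, k):
    """Exact k-nearest neighbors by brute force (a stand-in for approximate HNSW)."""
    neighbors = []
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        neighbors.append(set(others[:k]))
    return neighbors


def jaccard_weights(neighbors):
    """Weight each kNN edge by the Jaccard index of the two neighbor sets."""
    weights = {}
    for i, nbrs in enumerate(neighbors):
        for j in nbrs:
            shared = len(nbrs & neighbors[j])
            total = len(nbrs | neighbors[j])
            weights[(min(i, j), max(i, j))] = shared / total
    return weights


# Hypothetical two-marker measurements forming two well-separated groups of cells.
cells = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
weights = jaccard_weights(knn(cells, k=2))
for edge, weight in sorted(weights.items()):
    print(edge, round(weight, 3))
```

The resulting weighted graph has edges only within each group, so a community detection method such as Louvain would be run on it as the third step. The brute-force search is quadratic in the number of cells, which is exactly the cost that an approximate index is meant to avoid at the scale of millions of cells.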

Finite-state parameter space maps for pruning partitions in modularity-based community detection

Gibson, Mucha. 2022. Sci Rep

Partitioning networks into communities of densely connected nodes is an important tool used widely across different applications, with numerous methods and software packages available for community detection. Modularity-based methods require parameters to be selected (or assume defaults) to control the resolution and, in multilayer networks, interlayer coupling. Meanwhile, most useful algorithms are heuristics yielding different near-optimal results upon repeated runs (even at the same parameters). To address these difficulties, we combine recent developments into a simple-to-use framework for pruning a set of partitions to a subset that are self-consistent by an equivalence with the objective function for inference of a degree-corrected planted partition stochastic block model (SBM). Importantly, this combined framework reduces some of the problems associated with the stochasticity that is inherent in the use of heuristics for optimizing modularity. In our examples, the pruning typically highlights only a small number of partitions that are fixed points of the corresponding map on the set of somewhere-optimal partitions in the parameter space. We also derive resolution parameter upper bounds for fitting a constrained SBM of K blocks and demonstrate that these bounds hold in practice, further guiding parameter space regions to consider. With publicly available code (http://github.com/ragibson/ModularityPruning), our pruning procedure provides a new baseline for using modularity-based community detection in practice.
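
The fixed-point idea can be illustrated with a small sketch under stated assumptions; it is not the ModularityPruning API. It takes as given a domain of optimality for each somewhere-optimal partition and a resolution estimate for each partition (in the paper these estimates come from the SBM equivalence, which is not reproduced here). The map sends a partition to whichever partition is optimal at its estimated resolution, and a partition is kept when it maps to itself. All names and numbers below are hypothetical.

```python
def find_fixed_points(domains, gamma_estimates):
    """Return the partition map and the partitions that map to themselves.

    domains: {name: (lo, hi)} resolution interval where each partition is optimal.
    gamma_estimates: {name: gamma} resolution estimated from each partition (assumed given).
    """
    def optimal_at(gamma):
        for name, (lo, hi) in domains.items():
            if lo <= gamma < hi:
                return name
        return None  # the estimate falls outside every known domain

    mapping = {name: optimal_at(gamma) for name, gamma in gamma_estimates.items()}
    fixed = [name for name, image in mapping.items() if image == name]
    return mapping, fixed


# Hypothetical domains and resolution estimates for three partitions.
domains = {"A": (0.0, 0.6), "B": (0.6, 1.6), "C": (1.6, 2.0)}
gamma_estimates = {"A": 0.9, "B": 1.1, "C": 1.4}
mapping, fixed = find_fixed_points(domains, gamma_estimates)
print(mapping, fixed)
```

In this toy input only partition B is self-consistent: its estimated resolution lies inside its own domain, while A and C both point to B.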

Ryan Gibson

Post-Processing Partitions to Identify Domains of Modularity Optimization
