The connectivity of the human brain is fundamental to understanding the principles of cognitive function and the mechanisms by which it can go awry. To that end, tools for estimating human brain networks are required for single-subject, group-level, and cross-study analyses. We have developed an open-source, cloud-enabled, turn-key pipeline that operates on (groups of) raw diffusion and structural magnetic resonance imaging data, estimating brain networks (connectomes) across 24 different spatial scales, with quality assurance visualizations at each stage of processing. Running a harmonized analysis on 10 different datasets comprising 2,295 subjects and 2,861 scans reveals that the connectomes across datasets are similar on coarse scales, but quantitatively different on fine scales. Our framework therefore illustrates that while general principles of human brain organization may be preserved across experiments, obtaining reliable p-values and clinical biomarkers from connectomics will require further harmonization efforts.
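To make the final step of such a pipeline concrete, the sketch below (illustrative only, not the pipeline's actual code) builds an ROI-level connectome by counting streamline endpoints that fall in each pair of regions of a parcellation; the function name and the use of numpy/nibabel are assumptions. Repeating this over parcellations of increasing resolution yields connectomes at multiple spatial scales.

    # Illustrative sketch only: count streamline endpoint pairs per ROI.
    # Assumes streamlines are already in the voxel space of the label image.
    import numpy as np
    import nibabel as nib

    def streamlines_to_connectome(streamlines, label_img_path):
        labels = nib.load(label_img_path).get_fdata().astype(int)
        n_rois = labels.max()
        A = np.zeros((n_rois, n_rois))
        for sl in streamlines:                                # sl: (n_points, 3) voxel coordinates
            u = labels[tuple(np.round(sl[0]).astype(int))]    # ROI at first endpoint
            v = labels[tuple(np.round(sl[-1]).astype(int))]   # ROI at last endpoint
            if u > 0 and v > 0 and u != v:                    # skip background and self-loops
                A[u - 1, v - 1] += 1
                A[v - 1, u - 1] += 1
        return A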
Currently, connectomes (e.g., functional or structural brain graphs) can be estimated in humans at $\approx 1~mm^3$ scale using a combination of diffusion-weighted magnetic resonance imaging, functional magnetic resonance imaging, and structural magnetic resonance imaging scans. This manuscript summarizes a novel, scalable implementation of open-source algorithms to rapidly estimate magnetic resonance connectomes, using both anatomical regions of interest (ROIs) and voxel-sized vertices. To assess the reliability of our pipeline, we develop a novel nonparametric non-Euclidean reliability metric. Here we provide an overview of the methods used, demonstrate our implementation, and discuss available user extensions. We conclude with results showing the efficacy and reliability of the pipeline over the previous state of the art. (Published as part of the 2013 IEEE GlobalSIP conference.)
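The metric itself is not reproduced here, but the following sketch shows the general shape of a nonparametric reliability statistic that operates on any (possibly non-Euclidean) pairwise distance matrix between scans: it measures how often a scan lies closer to another scan of the same subject than to scans of other subjects. The function name and exact formulation are illustrative assumptions, not the paper's definition.

    # Hedged sketch of a nonparametric reliability statistic on a distance matrix.
    import numpy as np

    def reliability(dist, subjects):
        """dist: (n, n) pairwise distances between scans; subjects: length-n labels."""
        dist, subjects = np.asarray(dist), np.asarray(subjects)
        hits = total = 0
        for i in range(len(subjects)):
            same = subjects == subjects[i]
            same[i] = False                          # exclude the scan itself
            diff = subjects != subjects[i]
            for j in np.where(same)[0]:              # each repeated scan of the same subject
                hits += np.sum(dist[i, diff] > dist[i, j])
                total += diff.sum()
        return hits / total                          # 1.0 indicates perfect reliability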
Sparse matrix multiplication is traditionally performed in memory and scales to large matrices using the distributed memory of multiple nodes. In contrast, we scale sparse matrix multiplication beyond memory capacity by implementing sparse matrix dense matrix multiplication (SpMM) in a semi-external memory (SEM) fashion; i.e., we keep the sparse matrix on commodity SSDs and dense matrices in memory. Our SEM-SpMM incorporates many in-memory optimizations for large power-law graphs. It outperforms the in-memory implementations of Trilinos and Intel MKL and scales to billion-node graphs, far beyond the limitations of memory. Furthermore, on a single large parallel machine, our SEM-SpMM operates as fast as the distributed implementations of Trilinos using five times as much processing power. We also run our implementation in memory (IM-SpMM) to quantify the overhead of keeping data on SSDs. SEM-SpMM achieves almost 100% of the performance of IM-SpMM on graphs when the dense matrix has more than four columns; it achieves at least 65% of the performance of IM-SpMM on all inputs. We apply our SpMM to three important data analysis tasks (PageRank, eigensolving, and non-negative matrix factorization) and show that our SEM implementations significantly advance the state of the art.
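As a rough, much-simplified illustration of the semi-external-memory idea (not the paper's implementation, which includes many further optimizations), the sketch below memory-maps a CSR sparse matrix from SSD-resident files and multiplies it one block of rows at a time against a dense matrix held in RAM; the file layout, block size, and function name are assumptions.

    # Simplified SEM-style SpMM sketch: sparse matrix memory-mapped from SSD,
    # dense matrix kept in RAM, rows processed in blocks.
    import numpy as np
    from scipy.sparse import csr_matrix

    def sem_spmm(indptr_path, indices_path, data_path, n_rows, dense, block=100_000):
        indptr  = np.memmap(indptr_path,  dtype=np.int64,   mode='r')
        indices = np.memmap(indices_path, dtype=np.int64,   mode='r')
        data    = np.memmap(data_path,    dtype=np.float64, mode='r')
        out = np.zeros((n_rows, dense.shape[1]))
        for start in range(0, n_rows, block):
            stop = min(start + block, n_rows)
            lo, hi = indptr[start], indptr[stop]
            blk = csr_matrix((np.asarray(data[lo:hi]),            # nonzeros for this row block
                              np.asarray(indices[lo:hi]),
                              np.asarray(indptr[start:stop + 1]) - lo),
                             shape=(stop - start, dense.shape[0]))
            out[start:stop] = blk @ dense                         # dense factor never leaves RAM
        return out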
Graphs are quickly emerging as a leading abstraction for the representation of data. One important application domain originates from an emerging discipline called "connectomics". Connectomics studies the brain as a graph; vertices correspond to neurons (or collections thereof) and edges correspond to structural or functional connections between them. To explore the variability of connectomes, addressing both basic science questions regarding the structure of the brain and medical health questions about psychiatry and neurology, one can study the topological properties of these brain-graphs. We define multivariate glocal graph invariants: these are features of the graph that capture various local and global topological properties of the graphs. We show that the collection of features can collectively be computed via a combination of daisy-chaining, sparse matrix representation and computations, and efficient approximations. Our custom open-source Python package serves as a back-end to a Web-service that we have created to enable researchers to upload graphs and download the corresponding invariants in a number of different formats. Moreover, we built this package to support distributed processing on multicore machines. This is therefore an enabling technology for network science, lowering the barrier of entry by providing tools to biologists and analysts who otherwise lack these capabilities. As a demonstration, we run our code on 120 brain-graphs, each with approximately 16M vertices and up to 90M edges.
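For intuition, here is a small, hedged sketch of computing a few invariants of this flavor on a sparse undirected graph: the degree sequence, per-vertex triangle counts, and a scan statistic as local features, and the top-k adjacency eigenvalues as a global feature. It uses scipy rather than the package's actual API, and the function name is an assumption.

    # Hedged sketch: a few local and global invariants of a sparse undirected graph.
    import numpy as np
    from scipy.sparse import csr_matrix
    from scipy.sparse.linalg import eigsh

    def glocal_invariants(A, k=10):
        A = csr_matrix(A, dtype=float)
        degree = np.asarray(A.sum(axis=1)).ravel()                  # local: degree sequence
        common = A @ A                                              # counts of length-two paths
        triangles = np.asarray(common.multiply(A).sum(axis=1)).ravel() / 2.0  # local: triangles per vertex
        scan1 = degree + triangles                                  # local: edges in each closed neighborhood
        top_eigs = eigsh(A, k=k, return_eigenvectors=False)        # global: top-k eigenvalues
        return {"degree": degree, "triangles": triangles, "scan1": scan1, "eigenvalues": top_eigs}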
Machine learning has an emerging critical role in high-performance computing to modulate simulations, extract knowledge from massive data, and replace numerical models with efficient approximations. Decision forests are a critical tool because they provide insight into model operation that is essential to interpreting learned results. While decision forests are trivially parallelizable, the traversals of tree data structures incur many random memory accesses and are very slow. We present memory packing techniques that reorganize learned forests to minimize cache misses during classification. The resulting layout is hierarchical. At low levels, we pack the nodes of multiple trees into contiguous memory blocks so that each memory access fetches data for multiple trees. At higher levels, we use leaf cardinality to identify the most popular paths through a tree and collocate those paths in cache lines. We extend this layout with out-of-order execution and cache-line prefetching to increase memory throughput. Together, these optimizations increase the performance of classification in ensembles by a factor of four over an optimized C++ implementation and a factor of 50 over a popular R-language implementation.
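The hierarchical layout and prefetching are beyond a short example, but the basic node-packing idea can be sketched as follows: a learned tree is flattened into contiguous per-field arrays so that traversal walks flat memory instead of chasing pointers. The class and field names are assumptions; real implementations additionally interleave nodes from multiple trees and reorder hot paths by leaf cardinality.

    # Minimal sketch: a decision tree flattened into contiguous arrays.
    import numpy as np

    class PackedTree:
        def __init__(self, feature, threshold, left, right, value):
            self.feature   = np.asarray(feature,   dtype=np.int32)    # split feature; -1 marks a leaf
            self.threshold = np.asarray(threshold, dtype=np.float32)
            self.left      = np.asarray(left,      dtype=np.int32)    # child node indices
            self.right     = np.asarray(right,     dtype=np.int32)
            self.value     = np.asarray(value,     dtype=np.float32)  # prediction stored at leaves

        def predict_one(self, x):
            i = 0                                      # start at the root
            while self.feature[i] >= 0:                # descend until a leaf is reached
                if x[self.feature[i]] <= self.threshold[i]:
                    i = self.left[i]
                else:
                    i = self.right[i]
            return self.value[i]

    # Example: a single split on feature 2 at threshold 0.5.
    stump = PackedTree([2, -1, -1], [0.5, 0, 0], [1, -1, -1], [2, -1, -1], [0, 1.0, 0.0])
    print(stump.predict_one(np.array([0.1, 0.9, 0.3])))   # prints 1.0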