Dysregulation of gene expression in Alzheimer’s disease (AD) remains elusive, especially at the cell type level. Gene regulatory network, a key molecular mechanism linking transcription factors (TFs) and regulatory elements to govern gene expression, can change across cell types in the human brain and thus serve as a model for studying gene dysregulation in AD. However, AD-induced regulatory changes across brain cell types remains uncharted. To address this, we integrated single-cell multi-omics datasets to predict the gene regulatory networks of four major cell types, excitatory and inhibitory neurons, microglia and oligodendrocytes, in control and AD brains. Importantly, we analyzed and compared the structural and topological features of networks across cell types and examined changes in AD. Our analysis shows that hub TFs are largely common across cell types and AD-related changes are relatively more prominent in some cell types (e.g., microglia). The regulatory logics of enriched network motifs (e.g., feed-forward loops) further uncover cell type-specific TF-TF cooperativities in gene regulation. The cell type networks are also highly modular and several network modules with cell-type-specific expression changes in AD pathology are enriched with AD-risk genes. The further disease-module-drug association analysis suggests cell-type candidate drugs and their potential target genes. Finally, our network-based machine learning analysis systematically prioritized cell type risk genes likely involved in AD. Our strategy is validated using an independent dataset which showed that top ranked genes can predict clinical phenotypes (e.g., cognitive impairment) of AD with reasonable accuracy. Overall, this single-cell network biology analysis provides a comprehensive map linking genes, regulatory networks, cell types and drug targets and reveals cell-type gene dysregulation in AD.
Genotype-phenotype association is found in many biological systems, such as brain-related diseases and behavioral traits. Despite the recent improvement in the prediction of phenotypes from genotypes, they can be further improved and explainability of these predictions remains challenging, primarily due to complex underlying molecular and cellular mechanisms. Emerging multimodal data enables studying such mechanisms at different scales from genotype to phenotypes involving intermediate phenotypes like gene expression. However, due to the black-box nature of many machine learning techniques, it is challenging to integrate these multi-modalities and interpret the biological insights in prediction, especially when some modality is missing. Biological knowledge has recently been incorporated into machine learning modeling to help understand the reasoning behind the choices made by these models. To this end, we developed DeepGAMI, an interpretable deep learning model to improve genotype-phenotype prediction from multimodal data. DeepGAMI uses prior biological knowledge to define the neural network architecture. Notably, it embeds an auxiliary-learning layer for cross-modal imputation while training the model from multimodal data. Using this pre-trained layer, we can impute latent features of additional modalities and thus enable predicting phenotypes from a single modality only. Finally, the model uses integrated gradient to prioritize multimodal features and links for phenotypes. We applied DeepGAMI to multiple emerging multimodal datasets: (1) population-level genotype and bulk-tissue gene expression data for predicting schizophrenia, (2) population-level genotype and gene expression data for predicting clinical phenotypes in Alzheimer's Disease, (3) gene expression and electrophysiological data of single neuronal cells in the mouse visual cortex, and (4) cell-type gene expression and genotype data for predicting schizophrenia. We found that DeepGAMI outperforms existing state-of-the-art methods and provides a profound understanding of gene regulatory mechanisms from genotype to phenotype, especially at cellular resolution.
Biological mechanisms are complex spanning multiple facets, each providing a unique view of the underlying mechanism. Recent single-cell technologies such as scRNA-seq and scATAC-seq have facilitated parallel probing into each of these facets, providing multimodal measurements of individual cells. An integrative analysis of these single-cell modalities can therefore provide a comprehensive understanding of specific cellular and molecular mechanisms. Despite the technological advances, simultaneous profiling of multiple modalities of single cells continues to be challenging as opposed to single modality measurements, and therefore, it may not always be feasible to conduct such profiling experiments (e.g., missing modalities). Furthermore, even with the available multimodalities, data integration remains elusive since modalities may not always have paired samples, leaving partial to no correspondence information. To address those challenges, we developed Cross-Modality Optimal Transport (CMOT), a computational approach to infer missing modalities of single cells based on optimal transport (OT). CMOT first aligns a group of cells (source) within available multi-modal data onto a common latent space. Then, it applies optimal transport to map the cells with a single modality (target) to the aligned cells in source from the same modality by minimizing their cost of transportation using Wasserstein distance. This distance can be regularized by prior knowledge (e.g., cell types) or induced cells clusters to improve mapping, and entropy of transport to speed up OT computations. Once transported, CMOT uses K-Nearest-Neighbors to infer the missing modality for the cells in target from another modality of mapped cells in source. Furthermore, the alignment of CMOT works for partially corresponding information from multi-modalities, i.e., allowing a fraction of unmatched cells in source. We evaluated CMOT on several emerging multi-modal datasets, e.g., gene and protein expression and chromatin accessibility in developing brain, cancers, and immunology. We found that not only does CMOT outperform existing state-of-art methods, but its inferred gene expression is biologically interpretable such as classifying cell types and cancer types.
Dysregulation of gene expression in Alzheimer's disease (AD) remains elusive, especially at the cell type level. Gene regulatory network, a key molecular mechanism linking transcription factors (TFs) and regulatory elements to govern target gene expression, can change across cell types in the human brain and thus serve as a model for studying gene dysregulation in AD. However, it is still challenging to understand how cell type networks work abnormally under AD. To address this, we integrated single-cell multi-omics data and predicted the gene regulatory networks in AD and control for four major cell types, excitatory and inhibitory neurons, microglia and oligodendrocytes. Importantly, we applied network biology approaches to analyze the changes of network characteristics across these cell types, and between AD and control. For instance, many hub TFs target different genes between AD and control (rewiring). Also, these networks show strong hierarchical structures in which top TFs (master regulators) are largely common across cell types, whereas different TFs operate at the middle levels in some cell types (e.g., microglia). The regulatory logics of enriched network motifs (e.g., feed-forward loops) further uncover cell-type-specific TF-TF cooperativities in gene regulation. The cell type networks are highly modular. Several network modules with cell-type-specific expression changes in AD pathology are enriched with AD-risk genes and putative targets of approved and pending AD drugs, suggesting possible cell-type genomic medicine in AD. Finally, using the cell type gene regulatory networks, we developed machine learning models to classify and prioritize additional AD genes. We found that top prioritized genes predict clinical phenotypes (e.g., cognitive impairment). Overall, this single-cell network biology analysis provides a comprehensive map linking genes, regulatory networks, cell types and drug targets and reveals mechanisms on cell-type gene dyregulation in AD.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2025 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.