Existing computational methods that use single-cell RNA-sequencing (scRNA-seq) for cell fate prediction do not model how cells evolve stochastically and in physical time, nor can they predict how differentiation trajectories are altered by proposed interventions. We introduce PRESCIENT (Potential eneRgy undErlying Single Cell gradIENTs), a generative modeling framework that learns an underlying differentiation landscape from time-series scRNA-seq data. We validate PRESCIENT on an experimental lineage tracing dataset, where we show that PRESCIENT is able to predict the fate biases of progenitor cells in hematopoiesis when accounting for cell proliferation, improving upon the best-performing existing method. We demonstrate how PRESCIENT can simulate trajectories for perturbed cells, recovering the expected effects of known modulators of cell fate in hematopoiesis and pancreatic β cell differentiation. PRESCIENT is able to accommodate complex perturbations of multiple genes, at different time points and from different starting cell populations, and is available at https://github.com/gifford-lab/prescient.
Summary
Existing computational methods that use single-cell RNA-sequencing for cell fate prediction either summarize observations of cell states and their couplings without modeling the underlying differentiation process, or are limited in their capacity to model complex differentiation landscapes. Thus, contemporary methods cannot predict how cells evolve stochastically and in physical time from an arbitrary starting expression state, nor can they model the cell fate consequences of gene expression perturbations. We introduce PRESCIENT (Potential eneRgy undErlying Single Cell gradIENTs), a generative modeling framework that learns an underlying differentiation landscape from single-cell time-series gene expression data. Our generative model framework provides insight into the process of differentiation and can simulate differentiation trajectories for arbitrary gene expression progenitor states. We validate our method on a recently published experimental lineage tracing dataset that provides observed trajectories. We show that this model is able to predict the fate biases of progenitor cells in neutrophil/macrophage lineages when accounting for cell proliferation, improving upon the best-performing existing method. We also show how a model can predict trajectories for cells not found in the model’s training set, including cells in which genes or sets of genes have been perturbed. PRESCIENT is able to accommodate complex perturbations of multiple genes, at different time points and from different starting cell populations. PRESCIENT models are able to recover the expected effects of known modulators of cell fate in hematopoiesis and pancreatic β cell differentiation.
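The idea of simulating stochastic trajectories on a potential landscape can be illustrated with a minimal sketch. It is not the PRESCIENT code or its API. It assumes that cell state follows a diffusion whose drift is the negative gradient of a potential, and it replaces the learned landscape with a hand-written one-dimensional double-well potential whose two minima stand in for two cell fates. All names here (`potential`, `drift`, `simulate`, `fate_bias`) and all parameter values are hypothetical, and cell proliferation is ignored.

```python
import math
import random


def potential(x):
    """Toy double-well potential with minima at x = -1 and x = +1.

    Hypothetical stand-in for a learned differentiation landscape.
    """
    return (x * x - 1.0) ** 2


def drift(x):
    """Negative gradient of the toy potential: -dU/dx = -4x(x^2 - 1)."""
    return -4.0 * x * (x * x - 1.0)


def simulate(x0, n_steps=500, dt=0.01, sigma=0.5, rng=None):
    """Euler-Maruyama integration of dx = -U'(x) dt + sigma dW.

    Returns the full path, including the starting state x0.
    """
    rng = rng or random.Random(0)
    x = x0
    path = [x]
    for _ in range(n_steps):
        noise = sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        x = x + drift(x) * dt + noise
        path.append(x)
    return path


def fate_bias(x0, n_cells=200, seed=0, **kwargs):
    """Fraction of simulated cells that end in the right-hand well (x > 0)."""
    rng = random.Random(seed)
    ends = [simulate(x0, rng=rng, **kwargs)[-1] for _ in range(n_cells)]
    return sum(e > 0 for e in ends) / n_cells


# An "unperturbed" progenitor starts on the barrier between the two wells;
# a "perturbed" progenitor has its starting state shifted toward one fate.
unperturbed_bias = fate_bias(0.0)
perturbed_bias = fate_bias(0.5)
```

In this toy setting, a perturbation is represented by shifting the starting state before simulation, and its effect is read out as a change in the fraction of simulated cells reaching each well. A real model would operate on a multi-dimensional expression representation rather than a single coordinate.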
Background
A large fraction of human and mouse autosomal genes are subject to random monoallelic expression (MAE), an epigenetic mechanism characterized by allele-specific gene expression that varies between clonal cell lineages. MAE is highly cell-type specific and mapping it in a large number of cell and tissue types can provide insight into its biological function. Its detection, however, remains challenging.
Results
We previously reported that a sequence-independent chromatin signature identifies, with high sensitivity and specificity, genes subject to MAE in multiple tissue types using readily available ChIP-seq data. Here we present an implementation of this method as a user-friendly, open-source software pipeline for monoallelic gene inference from chromatin (MaGIC). The source code for the MaGIC pipeline and the Shiny app is available at https://github.com/gimelbrantlab/magic.
Conclusion
The pipeline can be used by researchers to map monoallelic expression in a variety of cell types using existing models and to train new models with additional sets of chromatin marks.
Electronic supplementary material
The online version of this article (10.1186/s12859-019-2679-7) contains supplementary material, which is available to authorized users.