Background: The cell cycle is a highly conserved, continuous process that controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle, both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle at high resolution from single-cell RNA-seq data. Results: Here, we present tricycle, an R/Bioconductor package, to address this challenge by leveraging key features of the biology of the cell cycle, the mathematical properties of principal component analysis of periodic functions, and the use of transfer learning. We estimate a cell-cycle embedding using a fixed reference dataset and project new data into this reference embedding, an approach that overcomes key limitations of learning a dataset-dependent embedding. Tricycle then predicts a cell-specific position in the cell cycle based on the data projection. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Using internal controls that are available for any dataset, we show that tricycle predictions generalize to datasets with multiple cell types, across tissues, species, and even sequencing assays. Conclusions: Tricycle generalizes across datasets and is highly scalable and applicable to atlas-level single-cell RNA-seq data.
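The projection-then-position idea described in this abstract can be sketched roughly as follows. This is a minimal illustration in Python, not the tricycle package's R interface: the four-gene reference, the loadings, and the function names (`project`, `cycle_position`, `simulate_cell`) are all hypothetical, and reading the position as the polar angle of the two projected coordinates is an assumption made for the sake of the example.

```python
import math


def project(expr, loadings):
    """Project one cell's centered expression vector onto two fixed reference axes."""
    pc1 = sum(x * w[0] for x, w in zip(expr, loadings))
    pc2 = sum(x * w[1] for x, w in zip(expr, loadings))
    return pc1, pc2


def cycle_position(expr, loadings):
    """Return the polar angle of the projected point, wrapped into [0, 2*pi)."""
    pc1, pc2 = project(expr, loadings)
    return math.atan2(pc2, pc1) % (2 * math.pi)


# Hypothetical reference (an assumption, not real data): four genes whose
# expression peaks at evenly spaced phases of the cycle.
PEAK_PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
REFERENCE_LOADINGS = [(math.cos(p), math.sin(p)) for p in PEAK_PHASES]


def simulate_cell(theta):
    """Toy centered expression: each gene is a cosine peaking at its own phase."""
    return [math.cos(theta - p) for p in PEAK_PHASES]


if __name__ == "__main__":
    for true_theta in (1.0, 2.5, 4.0):
        estimate = cycle_position(simulate_cell(true_theta), REFERENCE_LOADINGS)
        print(f"true position {true_theta:.2f}, estimated {estimate:.2f}")
```

Because the loadings are fixed in advance, each new cell is placed in the same coordinate system regardless of which dataset it comes from, which is the property the abstract attributes to projecting into a reference embedding rather than learning a dataset-dependent one.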
The cell cycle is a highly conserved, continuous process that controls faithful replication and division of cells. Single-cell technologies have enabled increasingly precise measurements of the cell cycle, both as a biological process of interest and as a possible confounding factor. Despite its importance and conservation, there is no universally applicable approach to infer position in the cell cycle at high resolution from single-cell RNA-seq data. Here, we present tricycle, an R/Bioconductor package, to address this challenge by leveraging key features of the biology of the cell cycle, the mathematical properties of principal component analysis of periodic functions, and the ubiquitous applicability of transfer learning. We show that tricycle can predict any cell’s position in the cell cycle regardless of the cell type, species of origin, and even sequencing assay. The accuracy of tricycle compares favorably to gold-standard experimental assays, which generally require specialized measurements in specifically constructed in vitro systems. Unlike gold-standard assays, tricycle is easily applicable to any single-cell RNA-seq dataset. Tricycle is highly scalable, universally accurate, and eminently pertinent for atlas-level data.
The enteric nervous system (ENS), a collection of neurons contained in the wall of the gut, is of fundamental importance to gastrointestinal and systemic health. According to the prevailing paradigm, the ENS arises from progenitor cells migrating from the embryonic neural crest and remains largely unchanged thereafter. Here, we show that the composition of the maturing ENS changes with time, with a decline in neural crest-derived neurons and their replacement by mesoderm-derived neurons. Single-cell transcriptomics and immunochemical approaches establish a distinct expression profile of mesoderm-derived neurons. The dynamic balance between the proportions of neurons from these two lineages in the postnatal gut depends on the availability of their respective trophic signals, GDNF-RET and HGF-MET. With increasing age, the mesoderm-derived neurons become the dominant form of neurons in the ENS, a change associated with significant functional effects on intestinal motility. Normal intestinal function in the adult gastrointestinal tract therefore appears to require an optimal balance between these two distinct lineages within the ENS.
Latent space techniques have emerged as powerful tools to identify genes and gene sets responsible for cell-type- and species-specific differences in single-cell data. Transfer learning methods can compare learned latent spaces across biological systems. However, the robustness that comes from leveraging information across multiple genes in transfer learning is often attained at the expense of gene-wise precision. Thus, methods are needed to identify genes that are important within a particular latent space and that differ significantly between contexts. To address this challenge, we have developed a new framework, scProject, and a new metric, projectionDrivers, to quantitatively examine latent space usage across single-cell experimental systems while concurrently extracting the genes driving the differential usage of the latent space between defined contrasts. Here, we demonstrate the efficacy, utility, and scalability of scProject with projectionDrivers and provide experimental validation for predicted species-specific differences between the developing mouse and human retina.