We develop CellSIUS (Cell Subtype Identification from Upregulated gene Sets) to fill a methodology gap in rare cell population identification for scRNA-seq data. CellSIUS outperforms existing algorithms in specificity and selectivity for identifying rare cell types and their transcriptomic signatures in synthetic and complex biological data. Characterization of a human pluripotent cell differentiation protocol recapitulating deep-layer corticogenesis using CellSIUS reveals unrecognized complexity in human stem cell-derived cellular populations. CellSIUS enables identification of novel rare cell populations and their signature genes, providing the means to study those populations in vitro in light of their role in health and disease. Electronic supplementary material: The online version of this article (10.1186/s13059-019-1739-7) contains supplementary material, which is available to authorized users.
Highlights
• A universal and scalable genetic platform in hPSCs for general use across all lineages
• Robust knockout efficiencies translate into high-performance screening at genome scale
• Stem cell-specific components of TP53 and OCT4 genetic networks in hPSCs are identified
• Validation of PMAIP1 and PAWR function in sensitivity to DNA damage or dissociation
Human pluripotent stem cells (hPSCs) generate a wide variety of disease-relevant cells that can be used to improve the translation of preclinical research. Despite the potential of hPSCs, their use for genetic screening has been limited because of technical challenges. We developed a renewable Cas9/sgRNA-hPSC library where loss-of-function mutations can be induced at will. Our inducible-mutant hPSC library can be used for an unlimited number of genome-wide screens. We screened for novel genes involved in three of the fundamental properties of hPSCs: their ability to self-renew/survive, their capacity to differentiate into somatic cells, and their inability to survive as single-cell clones. We identified a plethora of novel genes with unidentified roles in hPSCs. These results are available as a resource for the community to increase the understanding of both human development and genetics. In the future, our stem cell library approach will be a powerful tool to identify disease-modifying genes.
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. All rights reserved. No reuse allowed without permission.
Comprehensive benchmarking of computational methods for single-cell RNA sequencing (scRNA-seq) analysis is scarce. Using a modular workflow and a large dataset with known cell composition, we benchmarked feature selection and clustering methodologies for scRNA-seq data. Results highlighted a methodology gap for rare cell population identification for which we developed CellSIUS (Cell Subtype Identification from Upregulated gene Sets). CellSIUS outperformed existing approaches, enabled the identification of rare cell populations and, in contrast to other methods, simultaneously revealed transcriptomic signatures indicative of the rare cells' function. We exemplified the use of our workflow and CellSIUS for the characterization of a human pluripotent cell 3D spheroid differentiation protocol recapitulating deep-layer corticogenesis in vitro. Results revealed lineage bifurcation between Cajal-Retzius cells and layer V/VI neurons as well as rare cell populations that differ by migratory, metabolic, or cell cycle status, including a choroid plexus neuroepithelial subgroup, revealing previously unrecognized complexity in human stem cell-derived cellular populations.
Keywords: Single-cell RNA sequencing, data analysis, rare cell types, clustering, software, benchmarking, human pluripotent stem cells, cortical development, choroid plexus, lineage mapping.
On the full dataset, most methods resulted in a perfect assignment (Figure 2F), with only two of the stochastic methods, pcaReduce and mclust, yielding an average ARI of 0.90 and 0.92, respectively. In contrast, on subset 1, where cell type proportions were no longer equal, k-means-based methods and mclust failed to identify the different cell types correctly and resulted in average ARIs of 0.85 (SC3), 0.78 (pcaReduce) and 0.69 (mclust) (Figure 1G). On subset 2, all methods failed to
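The excerpt above scores clustering results by ARI (adjusted Rand index) against known cell identities. As a rough illustration of what that score measures, the following is a minimal sketch in plain Python; it is not the authors' benchmarking code, and the function name `adjusted_rand_index` and the toy labels are assumptions made purely for this example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(truth, pred):
    """Adjusted Rand index between two labelings of the same cells.

    Hypothetical helper for illustration only; uses the standard
    pair-counting formula over the contingency table.
    """
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    total_pairs = comb(len(truth), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: nothing to disagree on

    # Pairs of cells grouped together in both labelings, in truth, in pred.
    same_both = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    same_truth = sum(comb(c, 2) for c in Counter(truth).values())
    same_pred = sum(comb(c, 2) for c in Counter(pred).values())

    expected = same_truth * same_pred / total_pairs  # chance agreement
    maximum = (same_truth + same_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one cluster
    return (same_both - expected) / (maximum - expected)


# Toy data (made up): 8 cells of an abundant type and 2 of a rare type.
truth = ["A"] * 8 + ["B"] * 2

perfect = [1] * 8 + [0] * 2          # same partition, different label names
merged = [0] * 10                    # rare cells absorbed into the big cluster
oversplit = [0] * 4 + [1] * 4 + [2] * 2  # abundant type split in two

ari_perfect = adjusted_rand_index(truth, perfect)
ari_merged = adjusted_rand_index(truth, merged)
ari_oversplit = adjusted_rand_index(truth, oversplit)
```

In this toy setting, a clustering that absorbs the rare cells into the abundant cluster scores no better than chance, while one that isolates the rare cells but splits the abundant type scores somewhere in between. That is one way to see why unequal cell type proportions can lower ARI even when most cells are grouped sensibly.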