Many approaches have been developed to overcome technical noise in single-cell (and single-nucleus) RNA sequencing (scRNAseq). As researchers dig deeper into data--looking for rare cell types, subtleties of cell states, and details of gene regulatory networks--there is a growing need for algorithms with controllable accuracy and a minimum of ad hoc parameters and thresholds. Impeding this goal is the fact that an appropriate null distribution for scRNAseq cannot simply be extracted from data when ground truth about biological variation is unknown (i.e., most of the time). Here we approach this problem analytically, based on the assumption that scRNAseq data reflect only cell heterogeneity (what we seek to characterize), transcriptional noise (temporal fluctuations randomly distributed across cells), and sampling error (i.e., Poisson noise). We then analyze scRNAseq data without normalization--a step that can skew distributions, particularly for sparse data--and calculate p-values associated with key statistics. We develop an improved method for the selection of features for cell clustering and the identification of gene-gene correlations, both positive and negative. Using simulated data, we show that this method, which we call BigSur (Basic Informatics and Gene Statistics from Unnormalized Reads), accurately captures even weak yet significant correlation structures in scRNAseq data. Applying BigSur to data from a clonal human melanoma cell line, we identify tens of thousands of correlations that, when clustered without supervision into gene communities, both align with cellular components and biological processes and point toward potentially novel cell biological relationships.
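To make the role of sampling error concrete, the following is a minimal sketch and not the BigSur implementation. It assumes the simplest possible null, in which raw counts are purely Poisson with a mean equal to a per-cell depth factor times a per-gene expression level, and it omits the transcriptional-noise term entirely. The function name `modified_fano`, the simulated depth factors, and all parameter values are hypothetical choices made for illustration. Under this null the mean squared Pearson residual of unnormalized counts should sit near 1 for every gene, while a gene that differs between two cell groups should rise well above 1.

```python
import numpy as np


def modified_fano(counts, depth_factors):
    """Mean squared Pearson residual per gene under a pure-Poisson null.

    counts: (cells, genes) array of raw, unnormalized counts.
    depth_factors: (cells,) relative sequencing depth of each cell.
    Assumes every gene has at least one count (no zero-total genes).
    """
    counts = np.asarray(counts, dtype=float)
    depth = np.asarray(depth_factors, dtype=float)
    # Per-gene expression estimate that accounts for unequal depth.
    gene_mean = counts.sum(axis=0) / depth.sum()
    # Expected count for each cell and gene if there were no heterogeneity.
    expected = np.outer(depth, gene_mean)
    # For a Poisson variable the variance equals the mean.
    resid_sq = (counts - expected) ** 2 / expected
    return resid_sq.sum(axis=0) / (counts.shape[0] - 1)


rng = np.random.default_rng(0)
n_cells, n_genes = 5000, 200

# Hypothetical depth factors and gene expression levels.
depth = rng.lognormal(mean=0.0, sigma=0.3, size=n_cells)
depth /= depth.mean()
gene_mean = rng.uniform(0.1, 5.0, size=n_genes)
gene_mean[0] = 2.0  # fixed so the heterogeneous case below is easy to see

# Homogeneous population: sampling noise only.
rate = np.outer(depth, gene_mean)
null_counts = rng.poisson(rate)

# Heterogeneous population: every other cell expresses gene 0 threefold higher.
het_rate = rate.copy()
het_rate[::2, 0] *= 3.0
het_counts = rng.poisson(het_rate)

fano_null = modified_fano(null_counts, depth)
fano_het = modified_fano(het_counts, depth)

print("mean statistic under the Poisson null:", fano_null.mean())
print("gene 0 in the homogeneous population:", fano_null[0])
print("gene 0 in the heterogeneous population:", fano_het[0])
```

The statistic is computed directly on raw counts, with depth entering only through the expected value, so no normalized expression matrix is ever formed. A fuller treatment in the spirit of the abstract would also need a variance term for transcriptional noise and an analytical p-value for each gene, both of which this sketch leaves out.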