Single-cell RNA sequencing (scRNA-Seq) allows researchers to profile transcriptional activity in individual cells. However, the complexity of these data and the variability in study design and data generation require sophisticated computational tools and informed analytical decisions. Here, we present the Single Cell Toolkit (SCTK), an scRNA-Seq analysis package that lets users analyze their data through either a command-line workflow or an interactive graphical user interface (GUI) written in R/Shiny.
Main Text
Single-cell RNA sequencing (scRNA-Seq) techniques allow researchers to explore the transcriptional landscape of a sample at the resolution of the individual cell. In the context of cancer, scRNA-Seq can resolve the subclonal structure of a tumor sample, which improves our ability to identify the cell-specific mechanisms that drive tumor growth, and it can characterize the different cellular populations within the tumor microenvironment, such as immune cells [1,2]. However, filtering, normalization, clustering, and differential expression analysis of scRNA-Seq data require different parameters and algorithms than bulk RNA-seq, because of the low amount of starting material and the technical bias introduced by common scRNA-Seq library preparation techniques [3]. Tools for normalization and analysis of scRNA-Seq data exist to overcome these technical biases, but they are not integrated with one another, and they require command-line processing of samples and knowledge of the many options available for each tool. This makes them difficult to use, especially for scientists without training in bioinformatics [4-9]. Even more advanced users need to explore scRNA-Seq results interactively during processing to make dataset-specific decisions that can affect downstream analysis.
Here, we present the Single Cell Toolkit (SCTK), an R/Shiny [10] based package for both command-line and interactive scRNA-Seq processing. Users can upload count and annotation data, then interactively explore the data and perform analyses. Data and results can be saved in a single object for downstream command-line analysis or for reloading into the GUI in a later session. With the SCTK, it is possible to perform a full analysis workflow, from uploading raw data to downloading processed results. While other tools can perform specific scRNA-Seq analysis steps, the SCTK is the first fully interactive scRNA-Seq analysis workflow available within the R language (Table 1).
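To make the filtering and normalization steps mentioned above concrete, the following is a minimal, language-agnostic sketch written in plain Python. It is not the SCTK's implementation or API (the SCTK is an R package). The function names, the thresholds, and the choice of log2 counts-per-million normalization are assumptions made purely for illustration, and the input is assumed to be a non-empty genes-by-cells matrix of raw counts.

```python
import math


def summarize_cells(counts):
    """Return (total counts per cell, number of detected genes per cell).

    `counts` is a genes-by-cells matrix given as a list of rows
    (hypothetical input layout, assumed non-empty).
    """
    n_cells = len(counts[0])
    totals = [sum(row[j] for row in counts) for j in range(n_cells)]
    detected = [sum(1 for row in counts if row[j] > 0) for j in range(n_cells)]
    return totals, detected


def filter_cells(counts, min_total, min_genes):
    """Keep cells with at least `min_total` counts and `min_genes` detected genes.

    Returns the filtered matrix and the indices of the retained cells.
    """
    totals, detected = summarize_cells(counts)
    keep = [
        j for j in range(len(totals))
        if totals[j] >= min_total and detected[j] >= min_genes
    ]
    filtered = [[row[j] for j in keep] for row in counts]
    return filtered, keep


def log_cpm(counts):
    """Scale each cell to counts per million, then apply log2(x + 1)."""
    totals, _ = summarize_cells(counts)
    return [
        [
            math.log2(row[j] / totals[j] * 1e6 + 1) if totals[j] > 0 else 0.0
            for j in range(len(totals))
        ]
        for row in counts
    ]


# Toy example: 4 genes (rows) by 3 cells (columns); the second cell is nearly empty.
counts = [
    [10, 0, 3],
    [5, 0, 0],
    [0, 1, 7],
    [20, 0, 10],
]
filtered, kept_cells = filter_cells(counts, min_total=10, min_genes=2)
normalized = log_cpm(filtered)
print("cells kept:", kept_cells)
```

In this toy matrix the second cell has a single count in a single gene, so it is the kind of low-quality cell that a filtering step would typically remove before normalization. Appropriate thresholds depend on the dataset, which is why interactive exploration during processing is useful.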
The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted bioRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
The SCTK is organized into several analysis modules (Fig. 1). These modules include data summary and filtering, dimensionality reduction, clustering, batch correction, differential expression, pathway activity analysis, and power calculations to evaluate the tradeoff between sample size, cell number, and sequencing depth. All analysis modules can be...