2019
DOI: 10.1101/673285
Preprint

Modular and efficient pre-processing of single-cell RNA-seq

Abstract: Analysis of single-cell RNA-seq data begins with the pre-processing of reads to generate count matrices. We investigate algorithm choices for the challenges of pre-processing, and describe a workflow that balances efficiency and accuracy. Our workflow is based on the kallisto and bustools programs, and is near-optimal in speed and memory. The workflow is modular, and we demonstrate its flexibility by showing how it can be used for RNA velocity analyses.

Cited by 102 publications (97 citation statements)
References 46 publications
“…Alignment and UMI counting tools can affect the resulting expression matrix for gene expression modalities in scRNA-seq [9]. Furthermore, different tools have marked differences in their runtime and computation requirements [9,10]. To determine if this is also the case for UMI counting methods for ADT, we compared the runtime and assignment of sequencing reads to a unique CITE-seq antibody, cell barcode and UMI combination using three methods: CITE-seq-Count, Cell Ranger (running in featureOnly mode; 10X Genomics) and Kallisto-bustools (using the KITE pipeline) [10,11].…”
Section: Adt Counting Methods Have Minor Influence On the Resulting C
Citation type: mentioning
confidence: 99%
“…Furthermore, different tools have marked differences in their runtime and computation requirements [9,10]. To determine if this is also the case for UMI counting methods for ADT, we compared the runtime and assignment of sequencing reads to a unique CITE-seq antibody, cell barcode and UMI combination using three methods: CITE-seq-Count, Cell Ranger (running in featureOnly mode; 10X Genomics) and Kallisto-bustools (using the KITE pipeline) [10,11]. To make the results more broadly applicable, in addition to the 52 antibody ADT library (ADT), we also included the 6 antibody cell hashing library used to demultiplex the samples (HTO) as well as three publicly available 17 antibody panel datasets where raw data is available (from the 10X Genomics website).…”
Section: Adt Counting Methods Have Minor Influence On the Resulting C
Citation type: mentioning
confidence: 99%
“…The SMART-Seq data was processed using kallisto with the `kallisto pseudo` command [23]. The 10x Genomics v3 data was pre-processed with kallisto and bustools [38]. Gene count matrices were made by using the --genecounts flag and TCC matrices were made by omitting it.…”
Section: Pre-processing Single-cell Rna-seq Data
Citation type: mentioning
confidence: 99%
“…The kallisto bustools workflow [15][16][17] was used to obtain UMI [18] gene count matrices at different sampled read depths to mimic datasets sequenced at varying depths. From these subsamples, subsets of cells were sampled to emulate the sequencing of fewer cells (Figure 1).…”
Section: Results
Citation type: mentioning
confidence: 99%