2020
DOI: 10.1186/s13059-020-02136-7

pipeComp, a general framework for the evaluation of computational pipelines, reveals performant single cell RNA-seq preprocessing tools

Abstract: We present pipeComp (https://github.com/plger/pipeComp), a flexible R framework for pipeline comparison handling interactions between analysis steps and relying on multi-level evaluation metrics. We apply it to the benchmark of single-cell RNA-sequencing analysis pipelines using simulated and real datasets with known cell identities, covering common methods of filtering, doublet detection, normalization, feature selection, denoising, dimensionality reduction, and clustering. pipeComp can easily integrate any o…

Cited by 89 publications (110 citation statements)
References 74 publications (129 reference statements)
“…Furthermore, Pagoda2 and Vision had a potent risk to report error when using log-transform method to normalize expression data for calculating PAS. An earlier study mentioned that sctransform outperformed other normalization tools in scRNA-seq analysis [47]. Therefore, we examined the performance of these PAS tools based on sctransform-normalized data without gene filtering in this study.…”
Section: Results; citation type: mentioning
confidence: 99%
“…To evaluate ability of these tools on meaningfully extracting transcriptional heterogeneity, we assessed that the cell-type-specific obtained from PAS should be retained in dimensional reductional space and could assign cells into cell populations through unsupervised clustering or supervised classification [20], [24], [26], [32], [33], [34], [35], [36], [37], [38], [39], [40]. Therefore, the accuracy of PAS transformation methods were assessed by three methods of dimensional reduction, clustering and cell type annotation.…”
Section: Methods; citation type: mentioning
confidence: 99%
“…In brief, we first decontaminated the cells from ambient RNA using SoupX 1.4.5 (Young and Behjati, 2020), using cell clusters obtained from the “fastcluster” function of scDblFinder 1.4.0. We then removed doublet identified by scDblFinder (Germain et al, 2020), and additionally removed cells (using scater 1.18.0 (McCarthy et al, 2017)) which had more than 9% mitochondrial reads or departed from the median by more than 3 median absolute deviations on either of the following cell QC metrics: log(UMI counts), log(features detected), or percent of UMI in the top 50 features (filtering out higher only). We next normalized the data using sctransform 0.3.1 (Hafemeister and Satija, 2019), and ran PCA and clustering at various resolutions using Seurat 3.2.2 (Stuart et al, 2019).…”
Section: Methods; citation type: mentioning
confidence: 99%
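
The cell-filtering rule quoted in the statement above (more than 9% mitochondrial reads, or more than 3 median absolute deviations from the median on selected QC metrics) can be illustrated with a minimal NumPy sketch. This is not the citing authors' code, which used scater in R. The function names `mad_outliers` and `qc_keep_mask`, the argument names, and the 1.4826 scaling constant applied to the MAD are assumptions made for illustration only.

```python
import numpy as np


def mad_outliers(values, nmads=3.0, direction="both"):
    """Flag values lying more than `nmads` MADs from the median.

    The MAD is scaled by 1.4826 (an assumption of this sketch) so that it is
    comparable to a standard deviation for normally distributed data.
    """
    values = np.asarray(values, dtype=float)
    med = np.median(values)
    mad = 1.4826 * np.median(np.abs(values - med))
    lower, upper = med - nmads * mad, med + nmads * mad
    if direction == "higher":
        return values > upper
    if direction == "lower":
        return values < lower
    return (values < lower) | (values > upper)


def qc_keep_mask(umi_counts, n_features, pct_top50, pct_mito,
                 max_mito=9.0, nmads=3.0):
    """Return a boolean mask of cells that pass all QC filters.

    Hypothetical helper: inputs are per-cell arrays of total UMI counts,
    number of detected features, percent of UMIs in the top 50 features,
    and percent of mitochondrial reads.
    """
    umi_counts = np.asarray(umi_counts, dtype=float)
    n_features = np.asarray(n_features, dtype=float)
    pct_top50 = np.asarray(pct_top50, dtype=float)
    pct_mito = np.asarray(pct_mito, dtype=float)

    drop = (
        (pct_mito > max_mito)                                  # fixed mitochondrial cutoff
        | mad_outliers(np.log(umi_counts), nmads)              # both tails, log scale
        | mad_outliers(np.log(n_features), nmads)              # both tails, log scale
        | mad_outliers(pct_top50, nmads, direction="higher")   # upper tail only
    )
    return ~drop


if __name__ == "__main__":
    # Toy data: cell 4 has high mitochondrial content, cell 6 has very few UMIs.
    keep = qc_keep_mask(
        umi_counts=[1000, 1100, 900, 1050, 950, 1000, 10],
        n_features=[500, 520, 480, 510, 490, 500, 495],
        pct_top50=[20, 21, 19, 20, 22, 20, 21],
        pct_mito=[2, 3, 1, 2, 15, 2, 2],
    )
    print(keep)
```

Taking logs of the count-based metrics before applying the MAD rule matches the metrics named in the statement. Only the top-50 percentage is filtered in one direction, as the statement specifies "filtering out higher only".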
“…Principal components were identified with variable genes using the RunPCA() function. The PCs for clustering were identified using the maxLikGlobalDimEst function from the intrinsicDimension [32] package which is described as best practice [33]. These 10 PCs were used to identify clusters and with the FindClusters() function at a resolution of 0.2.…”
Section: Methods Details; citation type: mentioning
confidence: 99%