2021
DOI: 10.1101/2021.08.04.453579
Preprint

A comparison of data integration methods for single-cell RNA sequencing of cancer samples

Abstract: Tumours are routinely profiled with single-cell RNA sequencing (scRNA-seq) to characterize their diverse cellular ecosystems of malignant, immune, and stromal cell types. When combining data from multiple samples or studies, batch-specific technical variation can confound biological signals. However, scRNA-seq batch integration methods are often not designed for, or benchmarked on, datasets containing cancer cells. Here, we compare 5 data integration tools applied to 171,206 cells from 5 tumour scRNA-seq datas…

citations
Cited by 9 publications
(11 citation statements)
references
References 41 publications
“…Furthermore, while single-cell RNA-Seq comparisons of engineered versus in vivo cells can distinguish hybrid identity from population heterogeneity (Han et al, 2020;Shapiro et al, 2013;Tang et al, 2011), the methods typically used to integrate scRNAseq data from multiple sources (i.e. when comparing CFE products to a previously published single cell atlas) tend to over-merge cells with distinct cell states (Babcock et al 2021) (Richards et al, 2021). These issues make it challenging to fairly compare CFE products across protocols and studies.…”
Section: Introduction
confidence: 99%
“…To integrate snRNA-seq data across all of our samples, we used reciprocal principal component analysis (RPCA), as implemented in Seurat 51,128. First, we identified 2000 highly variable features (genes) across all of the samples to use as integration features using the ‘SelectIntegrationFeatures()’ function, which we passed as anchor features (‘anchor.features’) to the ‘PrepSCTIntegration()’ function.…”
Section: Methods
confidence: 99%
“…Given this difficulty, we sought to determine if different levels of imbalance between epithelial normal and epithelial tumor compartments can influence the accuracy of PDAC tumor tissue integration and subsequent classification of tumor cells. As scRNA-seq data from tumor tissue is often integrated across multiple biopsy sites, patients, and cohorts 45, the ability to reliably quantify tumor cells is imperative to the biological validity of subsequent downstream analyses. We preprocessed and annotated tumor cells in the PDAC samples through integration, unsupervised clustering, and marker gene-based annotation (Online Methods).…”
Section: Perturbation Analysis In PDAC Samples Reveals Tumor Compartm...
confidence: 99%
“…Given this difficulty, we sought to determine if different levels of imbalance between epithelial normal and epithelial tumor compartments can influence the accuracy of PDAC tumor tissue integration and subsequent classification of tumor cells. As scRNA-seq data from tumor tissue is often integrated across multiple biopsy sites, patients, and cohorts [42],…”
Section: Perturbation Analysis In PDAC Samples Reveals Tumor Compartm...
confidence: 99%