2021
DOI: 10.1186/s13059-021-02367-2

scDesign2: a transparent simulator that generates high-fidelity single-cell gene expression count data with gene correlations captured

Abstract: A pressing challenge in single-cell transcriptomics is to benchmark experimental protocols and computational methods. A solution is to use computational simulators, but existing simulators cannot simultaneously achieve three goals: preserving genes, capturing gene correlations, and generating any number of cells with varying sequencing depths. To fill this gap, we propose scDesign2, a transparent simulator that achieves all three goals and generates high-fidelity synthetic data for multiple single-cell gene ex…

Cited by 74 publications (76 citation statements)
References 146 publications (211 reference statements)
“…Based on the specific markers used by the authors, we classified cell clusters in eight immune compartments and tumor cells of each patient. We trained a scDesign2 [18] model for each cell type, specifically eight immune and 14 malignant models, one for each sample. The synthetic scRNA-seq matrices were randomly generated by choosing the following parameters: the number of total cells (between 300 and 1000), the tumor purity (between 5 and 100%), the number of cells for each immune cell type, and the scDesign2 [18] malignant model from one of the 14 samples.…”
Section: Results
mentioning
confidence: 99%
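The parameter choices listed in the statement above can be illustrated with a minimal sketch. It only draws a hypothetical simulation design (cell counts and a model index) and does not call scDesign2 or generate expression values. The function name `sample_mixture_design`, the uniform distributions, and the random-weight split of immune cells across types are assumptions for illustration; the quoted statement gives only the ranges, not how the citing authors sampled within them.

```python
import random

N_IMMUNE_TYPES = 8       # immune compartments, as stated in the quote
N_MALIGNANT_MODELS = 14  # one malignant model per sample, as stated in the quote


def sample_mixture_design(rng):
    """Draw one hypothetical design: cell counts only, no expression data."""
    total_cells = rng.randint(300, 1000)  # inclusive bounds from the quote
    purity = rng.uniform(0.05, 1.0)       # assumption: purity drawn uniformly
    n_tumor = round(total_cells * purity)
    n_immune = total_cells - n_tumor

    # Assumption: immune cells are split across types with random weights.
    weights = [rng.random() for _ in range(N_IMMUNE_TYPES)]
    scale = sum(weights)
    immune_counts = [int(n_immune * w / scale) for w in weights]
    # Give the rounding remainder to the first type so the counts add up exactly.
    immune_counts[0] += n_immune - sum(immune_counts)

    return {
        "total_cells": total_cells,
        "tumor_purity": purity,
        "n_tumor_cells": n_tumor,
        "immune_counts": immune_counts,
        "malignant_model": rng.randrange(N_MALIGNANT_MODELS),
    }


design = sample_mixture_design(random.Random(0))
print(design["total_cells"], design["n_tumor_cells"], design["malignant_model"])
```

Each design could then be handed to whatever fitted per-cell-type simulator is in use, which would generate the requested number of cells for each type.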
“…To perform a quantitative evaluation of the segmentation results, we generated a synthetic dataset modeling two realistic scenarios: Scenario I, with just clonal alterations and all malignant cells share the same alterations; Scenario II, where there are some clonal alterations shared by all cells and also two populations of malignant cells having subclone-specific alterations. For both scenarios, we generated synthetic matrices with different levels of magnitude of the synthetic copy number alterations, starting from matrices previously obtained using scDesign2 [18]. We considered only normal diploid cells and randomly alter genomic regions generating synthetic aneuploid cells.…”
Section: Results
mentioning
confidence: 99%
“…Ideally, one should plan a single-cell study from the outset, with a clear statement of goals and a plan for adequately powering downstream statistical analysis. Employing a power calculation or study simulation tool such as powsimR (Vieth, Ziegenhain, Parekh, Enard, & Hellmann, 2017), scPower (Schmid et al, 2021), scDesign2 (Sun, Song, Li, & Li, 2021), and POWSC can be a useful way to predict the feasibility of the stated goals, (e.g., detection of rare cell types, differential expression testing, and eQTL analysis) and allocate resources to additional biological replicates, higher cell counts, deeper sequencing, or the like, as needed. Bear in mind that such calculations rely heavily on assumptions about the sources and extent of technical noise, which can vary dramatically across different types of samples.…”
Section: Statistical Power Analysis
mentioning
confidence: 99%
“…Following the publication of the original paper [1], the authors identified two errors in the notation and formulas in the Methods section.…”
mentioning
confidence: 99%