2022
DOI: 10.1093/bioinformatics/btac499

SimBu: bias-aware simulation of bulk RNA-seq data with variable cell-type composition

Abstract: Motivation: As complex tissues are typically composed of various cell types, deconvolution tools have been developed to computationally infer their cellular composition from bulk RNA sequencing (RNA-seq) data. To comprehensively assess deconvolution performance, gold-standard datasets are indispensable. Gold-standard, experimental techniques like flow cytometry or immunohistochemistry are resource-intensive and cannot be systematically applied to the numerous cell types and tissues profiled wi…

Cited by 14 publications (15 citation statements)
References 43 publications
“…We excluded sample 2428 due to an insufficient number of cells. We used SimBu [59] to create four datasets of 50 pseudo-bulk samples each, spanning a range of potential scenarios that a deconvolution method should be able to accurately characterize (Figure 5A). Each pseudo-bulk sample consisted of count data from 2000 single cells, sampled according to the scenario parameters.…”
Section: Deconvolution Methods Have Cell Type Bias In Real and Pseudo...
Citation type: mentioning
confidence: 99%
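The pseudo-bulk construction described in the citing passage above (50 samples per dataset, 2000 single cells per sample, drawn according to scenario parameters) can be illustrated with a minimal sketch. This is not SimBu's API or its algorithm: SimBu is an R package and additionally models mRNA-content bias, which is omitted here. The function name `simulate_pseudobulk`, the toy count vectors, and the 70/30 target fractions are all hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def simulate_pseudobulk(cells, fractions, n_cells=2000, rng=None):
    """Sum the counts of n_cells single cells drawn by target cell-type fractions.

    cells:     dict mapping cell type -> list of per-cell count vectors
    fractions: dict mapping cell type -> target fraction (should sum to 1)
    Returns (bulk_counts, realized_fractions).
    """
    rng = rng or random.Random()
    types = list(fractions)
    # Draw one cell-type label per simulated cell, weighted by the target fractions.
    labels = rng.choices(types, weights=[fractions[t] for t in types], k=n_cells)
    n_genes = len(next(iter(cells.values()))[0])
    bulk = [0] * n_genes
    for cell_type in labels:
        # Sample a cell of that type with replacement and add its counts.
        profile = rng.choice(cells[cell_type])
        for gene, count in enumerate(profile):
            bulk[gene] += count
    drawn = Counter(labels)
    realized = {t: drawn[t] / n_cells for t in types}
    return bulk, realized


# Toy single-cell data (hypothetical): two cell types, three genes.
toy_cells = {
    "T cell": [[5, 0, 1], [4, 1, 0]],
    "B cell": [[0, 6, 1], [1, 5, 2]],
}
# One hypothetical scenario: 70% T cells, 30% B cells.
scenario = {"T cell": 0.7, "B cell": 0.3}

# One dataset of 50 pseudo-bulk samples, each built from 2000 cells.
samples = [
    simulate_pseudobulk(toy_cells, scenario, n_cells=2000, rng=random.Random(seed))
    for seed in range(50)
]
```

Each entry of `samples` pairs a pseudo-bulk count vector with the cell-type fractions actually realized in that draw, which is the ground truth a deconvolution method would later be scored against.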
“…We have demonstrated the utility of this type of inference in biomedical applications by reanalyzing published data of pediatric ependymal tumors and have discovered a previously unreported implication of microglial neurodegenerative gene expression programs in the mesenchymal transformation of these tumors. We expect that the development of advanced methods for simulating realistic bulk gene expression data from single-cell RNA-seq data [54] will improve the predictive power of ConDecon. In addition, we have shown that the approach of ConDecon can be adapted to other omics data modalities, such as spatial transcriptomics and chromatin accessibility data, for which there is currently a scarcity of deconvolution approaches.…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…One main challenge has been limited availability of “ground truth” or “gold/silver standard” cell type proportions against which deconvolution methods can be benchmarked. In the absence of cell type proportion standards, it has been common to pseudobulk sc/snRNA-seq data or use mixture simulations to generate pseudobulk RNA-seq data, use the same sc/snRNA-seq data as the reference, and compare the deconvolution results against the cell type proportions observed in the sc/snRNA-seq data [18–20, 23, 24]. However, sc/snRNA-seq library preparation protocols have filtering steps that can introduce biases in the estimated cell type proportions [21], limiting their use as “ground truth” references.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
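The benchmarking step mentioned in the passage above, comparing deconvolution estimates against the cell type proportions observed in the single-cell data, is commonly summarized with an error and a correlation measure. The sketch below assumes root-mean-square error and Pearson correlation as the metrics; the citing paper does not specify which it uses, and the proportion values are made-up toy numbers, not results from any study.

```python
import math


def rmse(true_props, est_props):
    """Root-mean-square error between true and estimated proportions."""
    squared = [(t - e) ** 2 for t, e in zip(true_props, est_props)]
    return math.sqrt(sum(squared) / len(squared))


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Toy values (hypothetical): proportions for three cell types in one sample.
true_props = [0.5, 0.3, 0.2]   # proportions observed in the single-cell data
est_props = [0.45, 0.35, 0.2]  # proportions returned by a deconvolution method

error = rmse(true_props, est_props)
correlation = pearson(true_props, est_props)
```

As the passage notes, proportions observed in sc/snRNA-seq data can themselves be biased by library-preparation filtering, so a low error against them does not by itself establish accuracy against the true tissue composition.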