2020
DOI: 10.1101/2020.08.05.238949
Preprint

Bayesian estimation of cell-type-specific gene expression per bulk sample with prior derived from single-cell data

Abstract: When assessed over a large number of samples, bulk RNA sequencing provides reliable data for gene expression at the tissue level. Single-cell RNA sequencing (scRNA-seq) deepens those analyses by evaluating gene expression at the cellular level. Both data types lend insights into disease etiology. With current technologies, however, scRNA-seq data are known to be noisy. Moreover, constrained by costs, scRNA-seq data are typically generated from a relatively small number of subjects, which limits their utility f…

Cited by 10 publications (6 citation statements)
References 38 publications
“…However in practice, this is not always possible. Instead, we use the bMIND algorithm (Wang et al, 2021) to estimate CTS gene expression for each subject in the bulk data. The output of bMIND can be viewed as the average of denoised single-cell data for the subjects in the bulk data.…”
Section: Results (mentioning)
confidence: 99%
“…Knowledge of marker genes gives insights into the core set of genes whose expression is shared among all cells of a given type, and will fill critical gaps in our understanding of cell biology and possibly the cellular origins of pathologies (Kelley et al, 2018). Marker genes are used to annotate cell clusters (Kiselev et al, 2017), to study cellular composition of bulk tissues (Kelley et al, 2018; Luecken and Theis, 2019; Oldham et al, 2008; Xu et al., 2013), to estimate cell type fraction via deconvolution (Abbas et al, 2009; Avila Cobos et al, 2018; Gaujoux and Seoighe, 2012; Newman et al, 2015; Zhong et al., 2013) and to estimate CTS expression directly from bulk tissue (Wang et al, 2020, 2021).…”
Section: Introduction (mentioning)
confidence: 99%
“…However in practice, this is not always possible. Instead, we can use the bMIND algorithm (Wang et al, 2020b) to estimate CTS gene expression for each subject in the bulk data. The output of bMIND can be viewed as the average of denoised single-cell data for the subjects in the bulk data.…”
Section: Results (mentioning)
confidence: 99%
“…With the help of good marker genes, many deconvolution methods can provide accurate estimates of cell type fractions (Zhong et al, 2013; Gaujoux and Seoighe, 2013; Newman et al, 2015; Hunt et al, 2018; Newman et al, 2019). Furthermore, cell type fractions are input of methods such as MIND (Wang et al, 2020a) and bMIND (Wang et al, 2020b) to estimate CTS expression profiles from bulk tissue samples, permitting cell-type analysis for features such as eQTLs. The performance of these algorithms is highly dependent on the selection of good marker genes, hence MarkerPen can play a critical role in the analysis of CTS expression.…”
Section: Conclusion and Discussion (mentioning)
confidence: 99%
“…When multiple measures of bulk-tissue expression from the same individuals are available, population-level deconvolution methods, such as Convex Analysis of Mixtures (CAM - unsupervised) (Wang, et al, 2016) or Multimeasure Individual Deconvolution (MIND - supervised) (Wang, et al, 2020), can be readily applied but with reduced statistical power and subtype-resolution. Correspondingly, some semi-supervised methods have recently been proposed to exploit single-measure bulk data, including Tensor Composition Analysis (TCA) on DNA methylation data (Rahmani, et al, 2019), CIBERSORTx and Bayesian MIND (bMIND) on gene expression data (Newman, et al, 2019; Wang, et al, 2020). TCA works specifically on DNA methylation data, based on an assumed model similar to MIND, and requires a priori knowledge or estimate of subtype proportions, CIBERSORTx relies on subtype expression signatures derived from single-cell or bulk-sorted reference profiles and uses pseudo non-negative least squares to achieve high-resolution expression purifications leveraging grouped sample structures, and bMIND uses again information from scRNA-seq data fully, as prior information, to refine subtype expression estimates per bulk sample.…”
Section: Introduction (mentioning)
confidence: 99%