2020
DOI: 10.1371/journal.pone.0239495

Sources of variation in cell-type RNA-Seq profiles

Abstract: Cell-type specific gene expression profiles are needed for many computational methods operating on bulk RNA-Seq samples, such as deconvolution of cell-type fractions and digital cytometry. However, the gene expression profile of a cell type can vary substantially due to both technical factors and biological differences in cell state and surroundings, reducing the efficacy of such methods. Here, we investigated which factors contribute most to this variation. We evaluated different normalization methods, quanti…

Citations: cited by 20 publications (32 citation statements)
References: 51 publications (73 reference statements)

“…With the latter, unsupervized training is used with approaches including: 1) autoencoders, which learn efficient representations of the training data, typically for dimensionality reduction ( Way and Greene, 2018 ) or feature selection ( Xie et al, 2017 ), 2) generative adversarial networks, which learn to generate new data with the same statistics as the training set ( Wang Y. et al, 2020 ; Repecka et al, 2021 ), and 3) deep belief networks, which learn to probabilistically reconstruct their inputs, acting as feature detectors, and can be further trained with supervision to build efficient classifiers ( Bu et al, 2017 ). Moreover, the advent of single-cell HTS technologies such as single-cell RNA-seq will offer many novel research opportunities, including modeling of cell-type or cell-state specific enhancer or TFBS activations and chromatin changes ( Angermueller et al, 2017 ; Gustafsson et al, 2020 ; Kawaguchi et al, 2021 ).…”
Section: Discussion (mentioning)
confidence: 99%
“…Since the gene expression data sets can be collected through completely different experimental settings with the use of different experimentation plans, platforms and methodologies, there are undesired batch effects in the gene expression values. These technical variations can in some cases be as large as the biological variations between different cell types [23]. It has been shown that the ComBat algorithm [24] can effectively remove these unwanted variations from bulk RNA-Seq data [23].…”
Section: CIBERSORTx Methods (mentioning)
confidence: 99%
“…These technical variations can in some cases be as large as the biological variations between different cell types [23]. It has been shown that the ComBat algorithm [24] can effectively remove these unwanted variations from bulk RNA-Seq data [23]. Newman et al [13] introduced CIBERSORTx, which extends CIBERSORT by adding batch correction using ComBat to address the possible cross-platform variations in gene expression data sets.…”
Section: CIBERSORTx Methods (mentioning)
confidence: 99%
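
The batch effects described in the two statements above can be illustrated with a minimal sketch. It is not ComBat, which uses an empirical Bayes model to adjust both location and scale; it only shifts each batch so that its mean matches the overall mean for a single gene. The function name `center_batches` and the toy values are hypothetical and serve only as an illustration.

```python
from collections import defaultdict
from statistics import mean


def center_batches(values, batches):
    """Shift each batch so its mean equals the overall mean of one gene.

    values  -- expression values (e.g. log scale) for one gene, one per sample
    batches -- batch label of each sample, same length as values
    """
    if len(values) != len(batches):
        raise ValueError("values and batches must have the same length")
    overall = mean(values)
    # Group the values by batch label to compute per-batch means.
    by_batch = defaultdict(list)
    for value, batch in zip(values, batches):
        by_batch[batch].append(value)
    batch_mean = {batch: mean(vals) for batch, vals in by_batch.items()}
    # Remove the batch-specific offset, then restore the overall level.
    return [v - batch_mean[b] + overall for v, b in zip(values, batches)]


# Hypothetical toy data: batch "B" reads about two units higher than "A".
expr = [5.0, 5.2, 4.8, 7.0, 7.2, 6.8]
labels = ["A", "A", "A", "B", "B", "B"]
corrected = center_batches(expr, labels)
```

Note that a correction of this kind also removes real biological differences whenever batch and cell type are confounded, for example when every sample of one cell type comes from a single batch.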
“…First, high expense and technical noise (e.g., high sparsity of gene expression) limit the number of samples analyzed and quality of cell type composition estimation, leading to low power in association analysis. Second, cell type compositions measured in single cell experiments are highly dependent on the biopsy samples and do not necessarily reflect the true cell type compositions in the corresponding tissue 13 . Instead of directly calculating cell type proportions from scRNA-seq data, cell type proportions can also be inferred through deconvolution of bulk RNA-sequencing (RNA-seq) data available with larger sample sizes.…”
Section: Introduction (mentioning)
confidence: 99%
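
The deconvolution mentioned in the statement above can be sketched under a simple linear mixing assumption: a bulk profile is modeled as the cell-type signature matrix times a vector of non-negative cell-type fractions. The sketch below fits that model by non-negative least squares with projected gradient descent. This is a stand-in chosen for brevity, not the method of CIBERSORTx or of any paper cited here, and the function name `deconvolve`, the signature matrix, and the bulk values are all hypothetical.

```python
def deconvolve(signature, bulk, steps=2000):
    """Estimate cell-type fractions f >= 0 such that signature @ f ~ bulk.

    signature -- list of rows, one per gene; each row holds the expression
                 of that gene in every cell type
    bulk      -- bulk expression per gene, same length as signature
    """
    n_genes = len(signature)
    n_types = len(signature[0])
    # Step size 1 / ||S||_F^2 is no larger than 1 / L, where L is the
    # Lipschitz constant of the gradient, so the iteration is stable.
    step = 1.0 / sum(x * x for row in signature for x in row)
    fractions = [1.0 / n_types] * n_types
    for _ in range(steps):
        # Residual of the current fit, one entry per gene.
        resid = [
            sum(s * f for s, f in zip(row, fractions)) - b
            for row, b in zip(signature, bulk)
        ]
        # Gradient of 0.5 * ||S f - bulk||^2 with respect to f.
        grad = [
            sum(signature[g][k] * resid[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto f >= 0.
        fractions = [max(0.0, f - step * g) for f, g in zip(fractions, grad)]
    total = sum(fractions)
    # Rescale so the fractions sum to one (skipped if all are zero).
    return [f / total for f in fractions] if total > 0 else fractions


# Hypothetical signature: 3 genes x 2 cell types; true fractions 0.3 and 0.7.
signature = [[10.0, 1.0], [1.0, 10.0], [5.0, 5.0]]
bulk = [3.7, 7.3, 5.0]
estimated = deconvolve(signature, bulk)
```

On this noise-free toy input the estimate recovers the mixing fractions. With real data the result depends heavily on how well the signature profiles match the cell types in the sample, which is the source of variation the paper examines.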