2021
DOI: 10.1186/s40246-021-00308-5

High heterogeneity undermines generalization of differential expression results in RNA-Seq analysis

Abstract: Background: RNA sequencing (RNA-Seq) has been widely applied in oncology for monitoring transcriptome changes. However, the emerging problem that high variation of gene expression levels caused by tumor heterogeneity may affect the reproducibility of differential expression (DE) results has rarely been studied. Here, we investigated the reproducibility of DE results for any given number of biological replicates between 3 and 24 and explored why a great many differentially expressed genes (DEGs) …

Cited by 25 publications
(29 citation statements)
References 43 publications
“…However, as shown in Figure 1, when (FDR(0.05), LFC(1)) is applied, the number of DEGs detected positively correlates with n if n is below 10. Similar phenomena have been reported in previous studies using RNA‐seq read count data from mouse strains (Soneson & Delorenzi, 2013), yeast (Schurch et al., 2016), tomato plants (Lamarre et al., 2018), and human tissues (Cui et al., 2021), and cell lines (Liu et al., 2014). The results of the parallel analysis using HeLa cells (Supporting information Figures S7‐S9) are also consistent with those obtained from human tumor and normal tissue samples.…”
Section: Discussion (supporting)
confidence: 86%
“…Poor reproducibility of DEGs has been shown in several studies using different datasets from tomato plants (Lamarre et al., 2018), yeast (Schurch et al., 2016), mouse strains (Soneson & Delorenzi, 2013), and human tissues (Cui et al., 2021), and cell lines (Liu et al., 2014); however, the findings of all these studies were based on DE analyses using the common threshold value for FC (|log2FC| ≥ 1) and FDR (FDR < 0.05). The question was whether the reproducibility of DE results could be improved by increasing significance stringency.…”
Section: Introduction (mentioning)
confidence: 99%
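The common threshold mentioned in the statement above (|log2FC| ≥ 1 and FDR < 0.05) can be illustrated with a minimal sketch. It assumes the FDR values are Benjamini-Hochberg adjusted p-values, which the quoted text does not specify. The function names `benjamini_hochberg` and `call_degs` and the input values are hypothetical and are not taken from any of the cited studies.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order.

    Assumption: BH adjustment stands in for "FDR"; the quoted studies
    may have used a different procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def call_degs(genes, log2fc, pvalues, lfc_cutoff=1.0, fdr_cutoff=0.05):
    """Keep genes with |log2FC| >= lfc_cutoff and adjusted p < fdr_cutoff."""
    fdr = benjamini_hochberg(pvalues)
    return [
        gene
        for gene, lfc, q in zip(genes, log2fc, fdr)
        if abs(lfc) >= lfc_cutoff and q < fdr_cutoff
    ]


# Hypothetical toy input, for illustration only.
genes = ["A", "B", "C", "D"]
log2fc = [2.0, -1.5, 0.4, 1.2]
pvalues = [0.001, 0.01, 0.02, 0.5]
degs = call_degs(genes, log2fc, pvalues)
```

Raising `lfc_cutoff` or lowering `fdr_cutoff` corresponds to the "increasing significance stringency" that the statement asks about.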
“…Some challenges for single source RNA-seq data that significantly affect the generalization of the analysis are the varying data acquisition protocols [71], intratumor heterogeneity [72] and local mutation burden [73]. These are prominent aspects of non-small cell lung carcinomas.…”
Section: Discussion (mentioning)
confidence: 99%
“…High-throughput analyses of gene expression hold great promise for the identification of biomarkers of clinical status, with the potential of predicting outcome, response to therapy, or informing researchers about molecular mechanisms underpinning disease onset and progression and identifying therapeutic targets [1]. Nevertheless, lists of candidate genes obtained through transcriptome-based studies have proven difficult to reproduce [2][3][4][5][6], raising a note of caution regarding conclusions driven by single sets of experiments. Sample collection and processing methods, protocols, and platforms may impact on the resulting gene signatures, making them non-overlapping between studies [7].…”
Section: Introduction (mentioning)
confidence: 99%