2020
DOI: 10.1371/journal.pone.0232271

Benchmarking RNA-seq differential expression analysis methods using spike-in and simulation data

Abstract: Benchmarking RNA-seq differential expression analysis methods using spike-in and simulated RNA-seq data has often yielded inconsistent results. The spike-in data, which were generated from the same bulk RNA sample, only represent technical variability, making the test results less reliable. We compared the performance of 12 differential expression analysis methods for RNA-seq data, including recent variants in widely used software packages, using both RNA spike-in and simulation data for negative binomial (NB)…

citations
Cited by 38 publications
(32 citation statements)
references
References 44 publications
“…2D and Table ). The fold change in 1.23 is well within the benchmark that is widely adopted in transcriptomic studies [26,44,45]. Enrichment analysis of selected DEGs revealed that WBP2 is involved in multiple signaling pathways (Table 1).…”
Section: Results (mentioning)
confidence: 82%
“…10%, 30%, or 60% genes were made differentially expressed (DE) with 1.3 or larger fold changes. See our paper 20 for more detailed method to simulate read count data. The count data were then voom-transformed for moderated t-test (DE analysis) 21.…”
Section: Methods (mentioning)
confidence: 99%
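The simulation setup quoted above (a fixed fraction of genes made DE with a given fold change) can be illustrated with a minimal sketch. Everything beyond the DE fraction and the 1.3 fold change is an assumption made for illustration: the gamma-Poisson construction of negative binomial counts, the `base_mean` and `dispersion` values, the two-group design, and all function names are hypothetical and are not taken from the citing paper, whose actual procedure is described in its own reference.

```python
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) count by counting exponential arrivals in [0, 1]."""
    if lam <= 0:
        return 0
    count = 0
    t = rng.expovariate(lam)
    while t <= 1.0:
        count += 1
        t += rng.expovariate(lam)
    return count


def sample_nb(mean, dispersion, rng):
    """Draw a negative binomial count as a gamma-Poisson mixture.

    Assumed parameterization: variance = mean + dispersion * mean**2.
    """
    shape = 1.0 / dispersion
    scale = mean * dispersion  # gamma mean = shape * scale = mean
    return sample_poisson(rng.gammavariate(shape, scale), rng)


def simulate_counts(n_genes, n_per_group, de_fraction, fold_change,
                    base_mean=50.0, dispersion=0.1, seed=0):
    """Simulate a genes x samples count table for two groups.

    base_mean and dispersion are illustrative defaults, not published values.
    """
    rng = random.Random(seed)
    n_de = round(n_genes * de_fraction)
    de_genes = set(rng.sample(range(n_genes), n_de))
    counts = []
    for gene in range(n_genes):
        fc = fold_change if gene in de_genes else 1.0
        control = [sample_nb(base_mean, dispersion, rng)
                   for _ in range(n_per_group)]
        treated = [sample_nb(base_mean * fc, dispersion, rng)
                   for _ in range(n_per_group)]
        counts.append(control + treated)
    return counts, de_genes


if __name__ == "__main__":
    table, de = simulate_counts(n_genes=100, n_per_group=3,
                                de_fraction=0.10, fold_change=1.3)
    print(len(table), "genes,", len(de), "simulated as DE")
```

The returned set of DE gene indices serves as the ground truth against which a DE method's calls could be scored.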
“…For RNA-seq data, DESeq2 [53] was chosen to perform differential expression analysis, which is an R package available within the Bioconductor project [54]. This is because DESeq2 is widely used by the community, including Lindner et al that published the analysed dataset [37], and it has also been recently proved to have the best overall performance among 12 methods [55]. Since DESeq2 requires read counts as an input, while StringTie outputs coverage values for transcript abundance, these were first converted from coverage to counts for each transcript, using the formula reads_per_transcript = coverage * transcript_len/read_len with a python script (available at http://ccb.jhu.edu/software/stringtie/dl/prepDE.py).…”
Section: Differential Expression Analysis (mentioning)
confidence: 99%
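The conversion formula quoted in the statement above, reads_per_transcript = coverage * transcript_len/read_len, can be sketched in a few lines. This is an illustrative reimplementation of the formula only, not the referenced script: the function name, the input validation, and the choice to round up to a whole read are assumptions.

```python
import math


def coverage_to_read_count(coverage, transcript_len, read_len):
    """Estimate reads per transcript from average per-base coverage.

    Applies reads = coverage * transcript_len / read_len.
    Rounding up to an integer is an assumption made here so the result
    can be used as a count; the quoted formula does not specify rounding.
    """
    if read_len <= 0:
        raise ValueError("read_len must be positive")
    if coverage < 0 or transcript_len < 0:
        raise ValueError("coverage and transcript_len must be non-negative")
    return math.ceil(coverage * transcript_len / read_len)


if __name__ == "__main__":
    # Hypothetical transcript: 2000 bp at 12.5x coverage with 75 bp reads.
    print(coverage_to_read_count(12.5, 2000, 75))
```

The intuition is that coverage times transcript length gives the total number of sequenced bases aligned to the transcript, and dividing by the read length converts bases back into reads.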