2017
DOI: 10.1101/220129
Preprint

Differential gene expression analysis tools exhibit substandard performance for long non-coding RNA–sequencing data

Abstract: Background: Protein-coding RNAs (mRNA) have been the primary target of most transcriptome studies in the past, but in recent years, attention has expanded to include long non-coding RNAs (lncRNA). lncRNAs are typically expressed at low levels, and are inherently highly variable. This is a fundamental challenge for differential expression (DE) analysis. In this study, the performance of 14 popular tools for testing DE in RNA-seq data along with their normalization methods is comprehensively evaluated, with a par…


Cited by 16 publications (34 citation statements)
References 54 publications (49 reference statements)
“…This result is in line with a previous study [8] that demonstrated that the number of replicates is more important than the sequencing depth to maintain the power, particularly for moderate to highly expressed genes. It is also essential to note that the power calculation (4) does not take into account the library size variability, which may compromise the power of the test [6]. Of note, pooling seems to be an effective strategy to maintain the power and reduce the cost, especially for low and moderately expressed genes (Figure 3, Supplementary Figures S2 and S3).…”
Section: Results
Citation type: mentioning
confidence: 99%
“…The third strategy, reducing the number of replicates, is generally worse in terms of power, yet it reduces the total cost significantly. In summary, an RNA sample pooling strategy can be a good choice to optimize the power and data generation cost, especially when many of the genes are expressed at low or medium levels like long-non-coding RNAs [6] with a substantial reduction of the library and sequencing costs. Of note, for gene expression levels with a small biological variability (represented by a negative binomial dispersion φ = 0.5) and large LFCs (θ = 1), all strategies seem to be equally effective.…”
Section: Results
Citation type: mentioning
confidence: 99%
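The statement above refers to a negative binomial dispersion φ = 0.5. As a rough illustration of why low-expressed genes such as lncRNAs are harder to test, the sketch below assumes the common mean-dispersion parameterization of the negative binomial, Var = μ + φμ². Whether the citing paper uses exactly this form is an assumption, and the function names `nb_variance` and `nb_cv` are hypothetical helpers, not part of any cited tool.

```python
import math


def nb_variance(mean: float, dispersion: float) -> float:
    """Variance of a negative binomial count under the assumed
    mean-dispersion form: Var = mean + dispersion * mean**2."""
    return mean + dispersion * mean ** 2


def nb_cv(mean: float, dispersion: float) -> float:
    """Coefficient of variation (sd / mean) under the same assumption.
    Algebraically, CV**2 = 1 / mean + dispersion."""
    return math.sqrt(nb_variance(mean, dispersion)) / mean


if __name__ == "__main__":
    phi = 0.5  # dispersion value quoted in the citation statement
    # Illustrative mean counts only; these are not data from any study.
    for mu in (1, 10, 100, 1000):
        print(f"mean={mu:>5}  var={nb_variance(mu, phi):>10.1f}  cv={nb_cv(mu, phi):.3f}")
```

Under this assumption the relative noise (CV) is largest for the lowest mean counts and approaches √φ as the mean grows, which is consistent with the quoted remarks that low-abundance genes are harder to quantify and may need more replicates.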
“…Low‐abundant genes are typically more challenging for accurate quantification, which results in higher variance that negatively affects differential expression analysis. Therefore, accurate differential expression analysis of lncRNAs requires larger sample sizes compared to mRNA‐focused analyses. One attractive technology that can help increase lncRNA detection sensitivity and reproducibility is (lnc)RNA capture sequencing.…”
Section: Technical Challenges Related With lncRNA Profiling
Citation type: mentioning
confidence: 99%