2017
DOI: 10.1093/nar/gkx199

Modeling bias and variation in the stochastic processes of small RNA sequencing

Abstract: The use of RNA-seq as the preferred method for the discovery and validation of small RNA biomarkers has been hindered by high quantitative variability and biased sequence counts. In this paper we develop a statistical model for sequence counts that accounts for ligase bias and stochastic variation in sequence counts. This model implies a linear quadratic relation between the mean and variance of sequence counts. Using a large number of sequencing datasets, we demonstrate how one can use the generalized additiv…
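The mean-variance relation mentioned in the abstract can be illustrated with a minimal sketch. It assumes the relation has the form variance = a·mean + b·mean², which is one common reading of "linear quadratic" and is not spelled out in the truncated abstract. The fit uses plain least squares, which is swapped in here for illustration and is not the paper's estimation method. The function names `mean_variance` and `fit_linear_quadratic` are hypothetical.

```python
from statistics import mean, variance


def mean_variance(replicates):
    """Return (sample mean, sample variance) of one feature's counts across replicates."""
    return mean(replicates), variance(replicates)


def fit_linear_quadratic(means, variances):
    """Least-squares fit of variance = a*mean + b*mean**2, with no intercept.

    Assumed functional form (not taken from the paper). The two coefficients
    are found by solving the 2x2 normal equations directly.
    """
    # Entries of X^T X, where the design columns are x1 = mean and x2 = mean**2.
    s11 = sum(m ** 2 for m in means)
    s12 = sum(m ** 3 for m in means)
    s22 = sum(m ** 4 for m in means)
    # Entries of X^T y, where y is the observed variance.
    t1 = sum(m * v for m, v in zip(means, variances))
    t2 = sum(m ** 2 * v for m, v in zip(means, variances))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("need at least two distinct nonzero means")
    a = (t1 * s22 - t2 * s12) / det
    b = (s11 * t2 - s12 * t1) / det
    return a, b
```

In use, one would compute `mean_variance` for each sequence across replicate libraries and pass the resulting lists to `fit_linear_quadratic`. A fitted `a` near 1 with `b` above 0 would be consistent with negative-binomial-like overdispersion, though unweighted least squares is dominated by the most abundant sequences and is only a rough diagnostic.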

citations
Cited by 10 publications
(14 citation statements)
references
References 95 publications
“…While fully joint parameter inference algorithms will certainly be more accurate, they are unwieldy and computationally intensive with large scale datasets boasting a large number of features with high sparsity. A case in point is the GAMLSS methodology [58], which improved over our pipeline (Wrench normalization coupled with edgeR differential abundance analysis) in a small scale equimolar miRNA benchmarking dataset (Additional file 1: Figure S23), but could not run to completion even in the simplest of our metagenomic datasets, the mouse gut microbiome. Second, our simulation results indicate that the performance of Wrench stabilizes by 10–20 samples per group depending on sample depth and the fraction of features that change across conditions.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…Researching microRNA biomarkers in the COMPASS population which is at extremely high risk of diabetic CKD, not only addresses an obvious research disparity but may inform future studies about the utility of these biomarkers in the general population. In fact the COMPASS project has already generated significant methodological advances in the statistical techniques for the analysis of microRNA data [17]. These techniques not only address the variability [29, 30] and bias [31–33] in the short RNA sequencing measurements, but are able to estimate group differences in expression with high accuracy and precision.…”
Section: Discussion; citation type: mentioning
confidence: 99%
“…Individual library concentrations are measured using the NEBNext Library Quant Kit for Illumina (New England Biolabs, Ipswich MA) and adjusted to a final pooled concentration of 2 nM and run on NextSeq sequencer (Illumina, San Diego CA). For a detailed overview of the steps involved in the construction of microRNA libraries see the supplementary methods of our short RNA sequencing analysis companion paper [17].…”
Section: Methods; citation type: mentioning
confidence: 99%
“…Moreover, because the biases vary depending on the library preparation protocol, comparing data generated by two different sample prep methods is difficult [101,102]. Although attempts have been made to alleviate bias in sRNAseq through modifications in library preparation methods [100,103,104] or to compensate for it during data analysis [105], it remains a significant issue that will need to be addressed in the future.…”
Section: miRNA Detection; citation type: mentioning
confidence: 99%