2015
DOI: 10.1186/s13059-015-0697-y

The impact of read length on quantification of differentially expressed genes and splice junction detection

Abstract: Background: The initial next-generation sequencing technologies produced reads of 25 or 36 bp, and only from a single-end of the library sequence. Currently, it is possible to reliably produce 300 bp paired-end sequences for RNA expression analysis. While read lengths have consistently increased, people have assumed that longer reads are more informative and that paired-end reads produce better results than single-end reads. We used paired-end 101 bp reads and trimmed them to simulate different read lengths, and…

Cited by 115 publications (108 citation statements)
References 13 publications
“…And it has been demonstrated that using single‐end (SE) reads for transcriptome assembly and quantification has relatively minor effects on gene‐level DE estimates, as described previously (Gonzalez and Joly; Chhangawala et al.). Thus, we chose to use SE data for this work.…”
Section: Methods (mentioning)
confidence: 66%
“…Six barcoded libraries were created, which were pooled together and sequenced in one full lane of the flow cell on an Illumina HiSeq 2500. Single-end reads of 50 base pair (bp), which are sufficient to reliably identify mRNA expression differences relative to paired-end longer reads, 13 were obtained. TopHat 14 was used to map reads to the reference genome (Build 38 and Ensembl 78) for each sample (max mismatches 15 and maximum insertion and deletion lengths 18).…”
Section: RNA Sequencing (mentioning)
confidence: 99%
“…If a high quality and well annotated reference sequence is available, increasing read length above 50 bp is unnecessary for accurate detection of differential expression (Chhangawala et al, 2015). Similarly, sequencing of paired-end instead of single-end reads does not significantly affect detection of differential expression in these cases (Chhangawala et al, 2015).…”
Section: Library Construction and Sequencing (mentioning)
confidence: 99%
“…If a high quality and well annotated reference sequence is available, increasing read length above 50 bp is unnecessary for accurate detection of differential expression (Chhangawala et al, 2015). Similarly, sequencing of paired-end instead of single-end reads does not significantly affect detection of differential expression in these cases (Chhangawala et al, 2015). Conversely, when studying organisms with less well defined reference sequences, sequencing of longer paired-end reads increases the accuracy of splice junction detection (Chhangawala et al, 2015).…”
Section: Library Construction and Sequencing (mentioning)
confidence: 99%