2016
DOI: 10.1186/s13059-016-1118-6

Human splicing diversity and the extent of unannotated splice junctions across human RNA-seq samples on the Sequence Read Archive

Abstract: Background: Gene annotations, such as those in GENCODE, are derived primarily from alignments of spliced cDNA sequences and protein sequences. The impact of RNA-seq data on annotation has been confined to major projects like ENCODE and Illumina Body Map 2.0. Results: We aligned 21,504 Illumina-sequenced human RNA-seq samples from the Sequence Read Archive (SRA) to the human genome and compared detected exon-exon junctions with junctions in several recent gene annotations. We found 56,861 junctions (18.6%) in at lea…

Cited by 103 publications (118 citation statements)
References 43 publications
“…Extrapolating these observations to all splice types and genes suggests the existence of thousands yet unannotated exons in introns. This estimation is in accordance with a recent analysis of more than 20,000 human RNA‐Seq datasets that revealed over 55,000 junctions not present in annotations (Nellore et al., 2016). In this analysis, junctions found in at least 20 reads across all samples were termed “confidently called”.…”
Section: Discussion (supporting)
confidence: 92%
“…These are treasure troves of data waiting to be mined for new discoveries. Many are already tapping into this wealth, as exemplified by Nellore et al (2016), who examined splice junction variants in the human genome, and Chamala et al (2015) who examined conserved AS events across several species of eudicots. The method we describe here can be used to mine to identify conserved AS events within any species with an available genome sequence and deep RNA-Seq data.…”
Section: Discussion (mentioning)
confidence: 99%
“…Our results from applying Whippet indicate that high-entropy AS events occur more frequently in vertebrate transcriptomes than previously appreciated [12,31], and further provide evidence that these events are likely biologically significant since their entropy levels are frequently tissue-regulated, conserved, and the corresponding variant transcripts are highly expressed. Many of the events are reminiscent of well-studied examples of high-entropy AS in other systems, such as the myriad of splice variants generated by tandem arrays of alternative exons in the Drosophila DSCAM gene [32].…”
Section: Discussion (mentioning)
confidence: 70%