2019
DOI: 10.1186/s13059-019-1710-7

Bridging the gap between reference and real transcriptomes

Abstract: Genetic, transcriptional, and post-transcriptional variations shape the transcriptome of individual cells, rendering establishing an exhaustive set of reference RNAs a complicated matter. Current reference transcriptomes, which are based on carefully curated transcripts, are lagging behind the extensive RNA variation revealed by massively parallel sequencing. Much may be missed by ignoring this unreferenced RNA diversity. There is plentiful evidence for non-reference transcripts with important phenotypic effec…

Cited by 51 publications (39 citation statements)
References 90 publications
“…The human reference gene set of NCBI RefSeq contains 5.6 transcripts per coding locus, many more than the plant (1.7 t/l) or pig (3.0 t/l) reference sets. Yet the 113,000 transcripts of this reference set is half that found by RNA-seq assembly projects (Morillon & Gautheret 2019), with evidence of biological effects in those additional transcripts. With large effort, the human gene set is now the most accurate and complete of complex eukaryotes, implying that alternate and paralog transcripts are under-reported in most animal and plant gene sets.…”
Section: D: Value Of Accuracy For Alternates and Paralogs
Citation type: mentioning
confidence: 89%
“…Dougherty et al 2018 for cross-spliced paralogs). Long-read, single molecule sequencing methods have demonstrated value at recovering accurate transcripts, but as Morillon & Gautheret (2019) indicate, cost-effective and accurate short-read RNA data is an abundant resource for improved computational reconstruction of transcripts. The uncertainty in measurement accuracy of transcripts can be reduced with additional evidence evaluations, including replication over methods, and over biological samples.…”
Section: D: Value Of Accuracy For Alternates and Paralogs
Citation type: mentioning
confidence: 99%
“…These mechanisms occur in various physiological or pathological processes such as development, cancer and immunity [4,5,6]. To date, there is no finite list of long non-coding isoforms, making it difficult to construct a complete lncRNA catalogue due to the high number of transcripts and their tissue-specific expression [7,8]. The absence of a complete catalogue makes it difficult to establish a comprehensive lncRNA expression profile. Thus, currently, the best strategy for the study of lncRNAs consists in the prediction of transcripts from a selection of RNAseq data in a tissue-specific condition.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Consequently, it is important to understand how TEs contribute to the coding and non-coding transcriptome (You et al 2017; Ma et al 2019; Morillon and Gautheret 2019), if hPSCs are to be used in cell replacement therapy (Fort et al 2014; Schumann et al 2019). However, due to limitations in short-read based sequencing, accurate transcriptome maps have been elusive.…”
Section: Introduction
Citation type: mentioning
confidence: 99%