2013
DOI: 10.1021/pr400294c

Proteogenomic Database Construction Driven from Large Scale RNA-seq Data

Abstract: The advent of inexpensive RNA-Seq technologies and other deep sequencing technologies for RNA has the promise to radically improve genomic annotation, providing information on transcribed regions and splicing events in a variety of cellular conditions. Using MS-based proteogenomics, many of these events can be confirmed directly at the protein level. However, the integration of large amounts of redundant RNA-seq data and mass spectrometry data poses a challenging problem. Our manuscript addresses this by const…

Citations: cited by 110 publications (122 citation statements)
References: 25 publications
“…Although the current Ensembl genebuild was improved by incorporating RNA-Seq data described by Collins et al, the RNA-Seq data were not resolved at the transcript level (2). We used Cufflinks, which generates potential multiple transcripts of a gene, to identify novel transcriptional isoforms for 22,585 Ensembl genes (transcripts with class code j). Transcripts from intergenic regions (u category) indicate potential novel genes.…”
Section: Results; citation type: mentioning
confidence: 99%
“…A detailed method of splice graph creation and conversion to an MS search compatible FASTA database can be found in Ref. 22. The splice graph database and the six-frame translated genome database were searched using the MS-GFDB (version 20120106) search algorithm.…”
Section: RNA-seq Data Analysis and Generation of High-confidence Tran…; citation type: mentioning
confidence: 99%
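
The six-frame translated genome database mentioned in the statement above rests on a simple idea: a nucleotide sequence is translated in three reading frames on each strand. The following is a minimal illustrative sketch of that idea in Python, not the cited authors' pipeline. The function names (`reverse_complement`, `translate`, `six_frame`) and the frame labels are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons (e.g. containing N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame(seq):
    """Map hypothetical frame labels (+1..+3, -1..-3) to their translations."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


if __name__ == "__main__":
    for label, protein in six_frame("ATGGCCTAA").items():
        print(label, protein)
```

A real search database would additionally split each frame at stop codons (`*`) and write the resulting peptides as FASTA records; those steps are omitted here.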
“…The annotated genes remain as hypothetical sequences until validated through experiments. The experimental data might include transcriptomic data, which can improve the predicted gene models (Denoeud et al 2008; Gerstein et al 2010; Guo et al 2014; Kelkar et al 2014; Woo et al 2014; Wu et al 2014; Yu et al 2014; Linde et al 2015). More recently, proteomic data from mass spectrometry experiments have been used for validating protein-coding genes (Brunner et al 2007; Gupta et al 2008; Merrihew et al 2008; Chaerkady et al 2011; Kelkar et al 2011; Bock et al 2014; Castellana et al 2014; Kim et al 2014; Trapp et al 2014; Wilhelm et al 2014).…”
Citation type: mentioning
confidence: 99%
“…Transcriptomic and proteomic data have been used for improving annotation of genes (Bock et al 2014; Kelkar et al 2014; Woo et al 2014; Wu et al 2014). However, the potential application of these data sets in the correction of incomplete genome assemblies has not been realized thus far, and only few studies have demonstrated utility of RNA-seq data in improving incomplete genome assemblies (Xue et al 2013).…”
Citation type: mentioning
confidence: 99%