2018
DOI: 10.1038/s41467-018-03311-y

Discovery of coding regions in the human genome by integrated proteogenomics analysis workflow

Abstract: Proteogenomics enable the discovery of novel peptides (from unannotated genomic protein-coding loci) and single amino acid variant peptides (derived from single-nucleotide polymorphisms and mutations). Increasing the reliability of these identifications is crucial to ensure their usefulness for genome annotation and potential application as neoantigens in cancer immunotherapy. We here present integrated proteogenomics analysis workflow (IPAW), which combines peptide discovery, curation, and validation. IPAW in…

Cited by 131 publications (151 citation statements)
References 69 publications
“…With the advent of deep sequencing strategies in both genomics and proteomics fields, we are now discovering nORFs that have remained undiscovered or 'hidden' 1,2,5,6. These nORFs are pervasive throughout the genome and are observed in both the coding and non-coding regions 1,7. They are variously classified as small ORFs (sORFs) 8,9, which are 1-100 amino acids in length, altORFs 10, which are proteins in alternate frames to known proteins, Denovogenes 11 or Orphan genes 12, Pseudogenes 1,13, and many ncRNAs have been shown to have coding potential [14][15][16][17][18].…”
mentioning
confidence: 99%
“…FXR2 and NPLOC4 were identified with an upstream peptide, which was in‐frame with annotated start site (Figure S1, Supporting Information). Recent studies have also reported putative alternate N‐termini for these proteins. For the remaining five proteins, we identified peptides that were direct extensions from annotated protein N‐termini into the 5’‐UTR region, which suggests the existence of an alternate N‐terminus or multiple N‐termini.…”
Section: Results
mentioning
confidence: 71%
“…A few representative tissue-specific pseudogenes/lncRNAs are listed in Table 1. We detected two previously reported tissue-specific non-coding gene translation, testis-specific TATDN2P1 (TatD DNase domain containing 2 pseudogene 1, supported by two unique peptides) and placenta-specific lncRNA lnc-CACNG8-28:1 (supported by eight unique peptides) 4. In addition, some more novel tissue-specific non-coding gene encoded peptides were discovered in different tissues (supplementary table S2).…”
Section: Proteomics Detects Ubiquitous and Tissue-specific Translatio
mentioning
confidence: 76%
“…Recently, many mass spectrometry based proteomics studies have identified peptides from non-coding regions of the human genome [1][2][3][4][5]. Some peptides are identified from genomic regions in proximity to protein-coding genes, which indicates an incorrect exon boundary or a missed exon. Others are identified from currently annotated non-coding sequences, including pseudogenes, lncRNAs, protein-coding genes' untranslated regions, alternative reading frames, or the anti-sense strand.…”
Section: Introduction
mentioning
confidence: 99%