2022
DOI: 10.1093/bioinformatics/btac821
|View full text |Cite
|
Sign up to set email alerts
|

Discovering misannotated lncRNAs using deep learning training dynamics

Abstract: Motivation Recent experimental evidence has shown that some long noncoding RNAs (lncRNAs) contain small open reading frames (sORFs) that are translated into functional micropeptides, suggesting that these lncRNAs are misannotated as noncoding. Current methods to detect misannotated lncRNAs rely on ribosome-profiling (Ribo-Seq) and mass-spectrometry experiments, which are cell-type dependent and expensive. Results Here, we pro… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...
1
1
1

Citation Types

0
3
0

Year Published

2023
2023
2024
2024

Publication Types

Select...
5
1

Relationship

0
6

Authors

Journals

citations
Cited by 8 publications
(3 citation statements)
references
References 46 publications
0
3
0
Order By: Relevance
“…To deliver on this promise, the set of computational and experimental tools to find and functionally validate ncORFs and their miniprotein products requires ongoing expansion and improvement. While current identification of ncORFs mainly relies on Ribo‐seq and mass spectrometry, new computational methods (Nabi et al, 2023 ) potentiate higher‐confidence and higher‐throughput discovery. For instance, the TIS Transformer program, which uses artificial intelligence to map the human genome, was able to detect ncORFs and predict those that encode miniproteins with high performance (Clauwaert et al, 2023 ).…”
Section: Discussionmentioning
confidence: 99%
“…To deliver on this promise, the set of computational and experimental tools to find and functionally validate ncORFs and their miniprotein products requires ongoing expansion and improvement. While current identification of ncORFs mainly relies on Ribo‐seq and mass spectrometry, new computational methods (Nabi et al, 2023 ) potentiate higher‐confidence and higher‐throughput discovery. For instance, the TIS Transformer program, which uses artificial intelligence to map the human genome, was able to detect ncORFs and predict those that encode miniproteins with high performance (Clauwaert et al, 2023 ).…”
Section: Discussionmentioning
confidence: 99%
“…It does not require a prelabelled dataset consisting of positively identified but misannotated lncRNAs. In contrast, it is based on Ribo-Seq data, which are used to discern lncRNAs that have been previously misannotated [ 52 ]. Of course, although bioinformatics approaches are essential for identifying sORFs, relying on only one technique to predict the coding potential of sORFs in a specific study is insufficient, and other experimental methods should be combined as appropriate to confirm whether these sORFs can be translated into small functional proteins.…”
Section: Prediction Of Micropeptide Coding Potentialmentioning
confidence: 99%
“…Generation of a reference set incorporated along with the known ORFs would allow researchers to use publicly available RNAseq data to infer which smORFs are differentially regulated in a given disease or perturbation of interest without having to perform Ribo-seq and call smORFs in every new system of study. For those interested to investigate lncRNA coding potential, several sequence- and evolutionary conservation-based tools [ 90 , 91 ], and more recently deep learning models, have been developed [ 92 , 93 ] to identify cryptic ORFs using in silico prediction. As has recently become evident, lncRNAs tend to encode young proteins [ 19 ] and thus traditional ORF prediction methods which rely on length-biased and evolutionary conservation-biased methods would not be able to discern coding potential efficiently [ 46 ].…”
Section: Road Ahead and Challengesmentioning
confidence: 99%