2023
DOI: 10.1101/2023.09.26.559473
Preprint

DeepGO-SE: Protein function prediction as Approximate Semantic Entailment

Maxat Kulmanov,
Francisco J. Guzmán-Vega,
Paula Duek Roggli
et al.

Abstract: The Gene Ontology (GO) is one of the most successful ontologies in the biological domain. GO is a formal theory with over 100,000 axioms that describe the molecular functions, biological processes, and cellular locations of proteins in three sub-ontologies. Many methods have been developed to automatically predict protein functions. However, only few of them use the background knowledge provided in the axioms of GO for knowledge-enhanced machine learning, or adjust and evaluate the model for the differences be…

Cited by 4 publications (3 citation statements)
References 51 publications
“…A variety of compared methods are summarized in Table 4, from simplest naïve methods to newest state-of-the-art method DeepGO-SE (43). AI-based methods mainly rely on protein features or protein sequences as inputs, supplemented with information like PPI or semantic sometimes.…”
Section: Competing Methods (mentioning)
confidence: 99%
“…Microbial samples are complex and contain many uncharacterized proteins. Previously, we developed DeepGO-SE [33], a method for protein function prediction using protein sequence embeddings generated by ESM2 [9] and approximate semantic entailment. We showed that DeepGO-SE can be applied to uncharacterized proteins; however, since it is trained on all experimentally annotated proteins from UniProt-KB/Swissprot database, many of the functions it predicts are not relevant to microbiomes and exist only in eukaryotic genomes.…”
Section: DeepGOMeta (mentioning)
confidence: 99%
“…This is to ensure that our model is robust and effective in predicting the functions of these newly discovered proteins. We did this by comparing DeepGOMeta predictions on the newly annotated proteins with other state-of-the-art methods that predict functions based on protein language model embeddings and transformer-based deep learning models, including TALE [17], SPROF-GO [18] and DeepGO-SE [33]. We found that DeepGOMeta outperforms the DeepGO-SE method in all three sub-ontology evaluations and performs better than all the compared methods in the BPO and CCO evaluations in terms of Fmax and Smin.…”
Section: Evaluation and Comparison On The Time-based Split (mentioning)
confidence: 99%