2019
DOI: 10.1186/s12859-019-3060-6

Machine learning for discovering missing or wrong protein function annotations

Abstract: Background: A massive amount of proteomic data is generated on a daily basis, nonetheless annotating all sequences is costly and often unfeasible. As a countermeasure, machine learning methods have been used to automatically annotate new protein functions. More specifically, many studies have investigated hierarchical multi-label classification (HMC) methods to predict annotations, using the Functional Catalogue (FunCat) or Gene Ontology (GO) label hierarchies. Most of these studies employed ben…

Cited by 28 publications
(24 citation statements)
References 44 publications
“…An early attempt of using ML in genes functional annotation from biomedical literature utilized Hierarchical Text Categorization (HTC) [98], while Tetko et al provided a high-quality curated functional annotation data as a benchmark dataset for the developers of machine ML-based functional annotation methods for bacterial genomes [99]. The recent reports show the applications of ML-based methods in a wide variety of functional annotations such as the discovery of missing or wrong protein function annotations [100], predicting gene functions in plant [101], controlling the false discovery rate (FDR), increase the accuracy of protein functional predictions [102], and genome-wide functional annotation of splice-variants in eukaryotes [103].…”
Section: Integrating Artificial Intelligence In Metabolic Engineering (mentioning)
confidence: 99%
“…In this paper, we only used the hierarchy GO dataset up to five levels. In our future work, we will apply the proposed model to GO datasets with more levels. Moreover, we will carry out a study on applying other deep learning models, especially transfer learning for protein function predictions.…”
Section: Discussion (mentioning)
confidence: 99%
“…We used ten HMC protein function datasets modeled as a tree. They are commonly used to evaluate hierarchical multi-label classifiers [9], and are freely available¹. They come with already prepared training, validation and test partitions, which have been used by many works in the literature.…”
Section: Datasets (mentioning)
confidence: 99%