2014
DOI: 10.1093/bioinformatics/btu472

The impact of incomplete knowledge on the evaluation of protein function prediction: a structured-output learning perspective

Abstract: Motivation: The automated functional annotation of biological macromolecules is a problem of computational assignment of biological concepts or ontological terms to genes and gene products. A number of methods have been developed to computationally annotate genes using standardized nomenclature such as Gene Ontology (GO). However, questions remain about the possibility for development of accurate methods that can integrate disparate molecular data as well as about an unbiased evaluation of these methods. One i…

Cited by 47 publications (42 citation statements)
References 14 publications
“…Based on the Enrichr tool [48], we performed enrichment analysis for biological processes in the Gene Ontology (GO) database [3] (Table 6) and molecular pathways in the Reactome database [17] (Table 7) that were enriched in our gene set (Fisher's exact test, p < 0.01 after multiple testing correction). Gene or protein functional ontologies are well known to be incomplete [40], necessitating us to look at DRG-enriched genes in non-enriched ontology terms and the literature, and showing the diverse roles played by these gene products. Several of these are already well studied in DRG biology.…”
Section: Hdrg and Mdrg Enriched Genes And A Conserved Evolutionary Si… (mentioning)
confidence: 99%
“…It is important to mention that all functional similarity measures between proteins are susceptible to problems caused by incomplete [12] and noisy [45] experimental annotations. There is a small effect of annotation incompleteness on topological measures and a somewhat larger effect on unnormalized semantic distance [27]. However, compared with topological measures, semantic similarity avoids a form of double-counting of nodes caused by the directed acyclic graph structure of GO, and thus properly treats hierarchical dependencies in the ontology.…”
Section: Protein Function Prediction and Its Evaluation (mentioning)
confidence: 99%
“…This is due to the fact that annotations are not uniformly distributed among the proteins within an annotation corpus (and also vary among different organisms corpora), with some proteins being very well annotated while others have a single annotation. Both of these issues stem from incomplete annotations, which have been shown to have a significant impact in the performance of information-theoretic measures [27]. Finally, SS approaches need to be aware of the impact that using electronic annotations (evidence code IEA) can have.…”
Section: Issues and Challenges In SS (mentioning)
confidence: 99%