2019
DOI: 10.1186/s12864-019-6272-2

GO2Vec: transforming GO terms and proteins to vector representations via graph embeddings

Abstract: Background: Semantic similarity between Gene Ontology (GO) terms is a fundamental measure for many bioinformatics applications, such as determining functional similarity between genes or proteins. Most previous research exploited information content to estimate the semantic similarity between GO terms; recently some research exploited word embeddings to learn vector representations for GO terms from a large-scale corpus. In this paper, we proposed a novel method, named GO2Vec, that exploits graph embeddings to…

citations
Cited by 24 publications
(23 citation statements)
references
References 34 publications
“…This tool enables the comparison of new GO-based semantic similarity measures against previously published ones considering their relation to sequence, Pfam ( 34 ) and Enzyme Commission (EC) ( 35 ) number similarity. CESSM was released in 2009 and updated in 2014, and since then it has been widely used by the community, being adopted to evaluate over 25 novel semantic similarity measures developed through different methods, with more recent ones focusing on common information content (IC)-based metrics ( 36 ) but also based on vector representations/graph embeddings ( 37 ). CESSM was built as a web-based tool to support the automatic comparison against the benchmark data.…”
Section: Related Work
mentioning
confidence: 99%
“…The edge prediction task is applied to the PPI prediction to find new protein interaction relationships. They also provide a basis for calculating protein similarity based on GO, such as GO2vec ( Zhong et al, 2019 ), which used the Node2vec algorithm to compute the functional similarity between proteins.…”
Section: Introduction
mentioning
confidence: 99%
“…Recently, several researchers have proposed word embeddings (e.g., word2vec [34] and GloVe [35]), which have been developed in the area of natural language processing, to learn vector representations of GO terms and proteins and then used learned vectors for the PPI prediction [36][37][38][39]. These methods mainly use the word2vec model [34] to learn vectors for each word from the corpus derived from descriptive axioms of GO terms and proteins; the descriptive axiom of a GO term is its textual description, for example, the descriptive axiom of the GO term "GO:0036388" is "pre-replicative complex assembly."…”
mentioning
confidence: 99%
“…Finally, the vectors of proteins are used to predict the protein interactions. We have earlier proposed GO2Vec [39] that convert the GO graph into a vector space to represent genes for predicting their similarity.…”
mentioning
confidence: 99%
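
The citation statements above describe GO terms being mapped to vectors and those vectors then being used to compare proteins. As a rough illustration only, the sketch below compares two proteins through the vectors of their annotated GO terms. The vectors are made-up toy values, `TERM_X` and `TERM_Y` are hypothetical placeholder labels, and the best-match-average aggregation is an assumption chosen for simplicity; none of this is taken from the GO2Vec paper or from the citing papers.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match_average(terms_a, terms_b, vectors):
    """Symmetric best-match average over two sets of GO term labels.

    Assumed aggregation for illustration: each term is paired with its most
    similar term in the other set, and the two directional means are averaged.
    """
    def one_way(src, dst):
        return sum(
            max(cosine(vectors[s], vectors[d]) for d in dst) for s in src
        ) / len(src)

    return 0.5 * (one_way(terms_a, terms_b) + one_way(terms_b, terms_a))


# Toy embeddings: the numbers are invented, not learned from the GO graph.
toy_vectors = {
    "GO:0036388": [1.0, 0.0],
    "TERM_X": [0.0, 1.0],   # hypothetical label
    "TERM_Y": [1.0, 1.0],   # hypothetical label
}

protein_1 = ["GO:0036388", "TERM_Y"]
protein_2 = ["TERM_X", "TERM_Y"]
similarity = best_match_average(protein_1, protein_2, toy_vectors)
```

In practice the vectors would come from an embedding model trained on the GO graph or on a text corpus, and the aggregation over term sets is a design choice that each method makes differently.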