Proceedings of the 19th SIGBioMed Workshop on Biomedical Language Processing 2020
DOI: 10.18653/v1/2020.bionlp-1.18
Benchmark and Best Practices for Biomedical Knowledge Graph Embeddings

Abstract: Much of biomedical and healthcare data is encoded in discrete, symbolic form such as text and medical codes. There is a wealth of expert-curated biomedical domain knowledge stored in knowledge bases and ontologies, but the lack of reliable methods for learning knowledge representation has limited their usefulness in machine learning applications. While text-based representation learning has significantly improved in recent years through advances in natural language processing, attempts to learn biomedical conc…

Cited by 21 publications (20 citation statements)
References 17 publications
“…Our results show RotatE is often the best performing model of the five on both datasets and throughout all the experiments. This reinforces previous similar findings [11,2] and shows RotatE to be a strong baseline in the context of drug discovery. Our results also highlight that older approaches like TransE can still be very competitive given an optimised training and hyperparameter setup.…”
Section: Discussion (supporting)
confidence: 92%
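The models compared in the statement above differ mainly in their scoring functions. As a rough sketch (not from the cited benchmark; embeddings, dimensions, and seeds here are invented for illustration), TransE treats relations as translations in real space, while RotatE treats them as element-wise rotations in the complex plane:

```python
import numpy as np

def transe_score(h, r, t):
    # TransE: score is the negative L2 distance between the
    # translated head (h + r) and the tail t.
    return -np.linalg.norm(h + r - t)

def rotate_score(h, r_phase, t):
    # RotatE: each relation is a vector of phase angles; the head is
    # rotated element-wise in the complex plane before measuring distance.
    rotation = np.exp(1j * r_phase)           # unit-modulus complex rotation
    return -np.linalg.norm(h * rotation - t)  # distance after rotation

rng = np.random.default_rng(0)
dim = 8
h, r, t = rng.normal(size=(3, dim))           # real-valued embeddings for TransE
print(transe_score(h, r, t))

hc = rng.normal(size=dim) + 1j * rng.normal(size=dim)  # complex embeddings for RotatE
tc = rng.normal(size=dim) + 1j * rng.normal(size=dim)
phases = rng.uniform(0, 2 * np.pi, size=dim)
print(rotate_score(hc, phases, tc))
```

A perfect triple scores 0 under both functions (head maps exactly onto tail); training pushes true triples toward 0 and corrupted triples away from it.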
“…The performance of five knowledge graph embedding approaches (TransE, ComplEx, DistMult, SimplE and RotatE) have been compared on a knowledge graph constructed from the SNOMED resource [11]. The models are assessed on the tasks of link prediction, visualisation and entity classification, with a limited grid-search being performed to choose the hyperparameters.…”
Section: Biomedical Domain Specific Evaluations (mentioning)
confidence: 99%
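The link-prediction task mentioned above is typically reported with ranking metrics such as mean reciprocal rank (MRR) and hits@k. A minimal sketch (the ranks below are toy values, not results from the cited evaluation):

```python
def ranking_metrics(ranks, k=10):
    """Compute MRR and hits@k from the rank of each true entity.

    `ranks` holds, for each test triple, the 1-based position of the
    correct entity after sorting all candidate entities by model score.
    """
    mrr = sum(1.0 / r for r in ranks) / len(ranks)
    hits = sum(r <= k for r in ranks) / len(ranks)
    return mrr, hits

# Toy example: four test triples whose true entities ranked 1, 3, 12, 2.
mrr, hits_at_10 = ranking_metrics([1, 3, 12, 2], k=10)
print(mrr, hits_at_10)  # hits@10 = 0.75, since one rank exceeds 10
```

Both metrics reward placing the true entity near the top of the candidate list; MRR is sensitive to exact rank, while hits@k only asks whether the true entity appears in the top k.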
“…The BERT model is a neural network transformer‐based model that generates embeddings for sentences to capture meaning. We used Clinical BERT, a variant of BERT fine‐tuned on biomedical and clinical text corpora (MIMIC‐III and PubMed) [14].…”
Section: Methods (mentioning)
confidence: 99%
“…Essentially, a KGE model maps entities and relations to embedding spaces by using a predefined scoring function. Due to their growing popularity and the availability of implementation methods, KGEs have recently been applied to various domains, including biomedical knowledge graphs [26]. Chang et al [26] showed that using KGEs for learning concept embeddings from medical terminologies and knowledge graphs is arguably a more principled and effective approach than using previous methods based on skip-gram–based models like Cui2Vec [27] or network embedding–based models like Snomed2Vec [28].…”
Section: Methods (mentioning)
confidence: 99%
“…Due to their growing popularity and the availability of implementation methods, KGEs have recently been applied to various domains, including biomedical knowledge graphs [26]. Chang et al [26] showed that using KGEs for learning concept embeddings from medical terminologies and knowledge graphs is arguably a more principled and effective approach than using previous methods based on skip-gram–based models like Cui2Vec [27] or network embedding–based models like Snomed2Vec [28]. Although we initially used Cui2Vec for our entity vectors at the time of submission, we later used SNOMED CT KGEs after they became available in recent months.…”
Section: Methods (mentioning)
confidence: 99%