BioKEEN: a library for learning and evaluating biological knowledge graph embeddings

Ali, Mehdi; Hoyt, Charles Tapley; Domingo-Fernández, Daniel; Lehmann, Jens; Jabeen, Hajira

doi:10.1093/bioinformatics/btz117

Cited by 63 publications

(80 citation statements)

References 7 publications


“…diseased patients, and controls). CLEP has adopted PyKEEN (Ali et al, 2021) as the KGEM-software due to its wide range of functionalities (e.g. a large number of KGEMs, hyperparameter optimization functionalities).…”

Section: Methods (mentioning)

confidence: 99%

“…Each workflow is both accessible through a command line interface (CLI) as well as programmatically, allowing users to input their own patient-level datasets and custom KGs. In total, CLEP offers three different methods for incorporating patients into the KG, all KGEMs available through PyKEEN (Ali et al, 2021), and five ML classifiers. Furthermore, thanks to its flexible implementation, users can independently use each of its modules as well as incorporate classifiers tasks into the framework (Supplementary Fig.…”

Section: Methods (mentioning)

confidence: 99%


CLEP: a hybrid data- and knowledge-driven framework for generating patient representations

et al. 2021

Self Cite


As machine learning and artificial intelligence increasingly attain a larger number of applications in the biomedical domain, at their core, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLEP, a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation.

Availability: CLEP is available to the bioinformatics community as an open source Python package at https://github.com/hybrid-kg/clep under the Apache 2.0 License.

Supplementary information: Supplementary data are available at Bioinformatics online.
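The first step described in this abstract, adding patients to a knowledge graph as nodes linked to their most characteristic features, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the CLEP package: the function names `patient_triples` and `merge_into_graph`, the relation labels, and the choice of a z-score against the control group as the measure of "most characteristic". The resulting triples would then be passed to a knowledge graph embedding library, a step not shown here.

```python
from statistics import mean, stdev


def patient_triples(patients, controls, top_k=2):
    """Link each patient to the top_k features that deviate most from controls.

    patients / controls: dict mapping sample id -> {feature name: value}.
    Returns (head, relation, tail) triples. The z-score ranking and the
    relation names are illustrative assumptions, not CLEP's actual method.
    Requires at least two control samples (needed by stdev).
    """
    features = sorted(next(iter(controls.values())))

    # Per-feature mean and standard deviation over the control group.
    stats = {}
    for f in features:
        values = [sample[f] for sample in controls.values()]
        stats[f] = (mean(values), stdev(values))

    triples = []
    for pid, sample in sorted(patients.items()):
        z = {}
        for f in features:
            mu, sigma = stats[f]
            z[f] = (sample[f] - mu) / sigma if sigma > 0 else 0.0

        # Rank features by absolute deviation and keep the strongest ones.
        ranked = sorted(features, key=lambda f: abs(z[f]), reverse=True)
        for f in ranked[:top_k]:
            if z[f] == 0:
                continue  # no deviation, so no edge is added
            relation = (
                "positively_associated" if z[f] > 0 else "negatively_associated"
            )
            triples.append((pid, relation, f))
    return triples


def merge_into_graph(kg_triples, new_triples):
    """Return the union of existing graph triples and new patient triples."""
    return sorted(set(kg_triples) | set(new_triples))


if __name__ == "__main__":
    # Toy data with hypothetical gene names.
    controls = {
        "c1": {"GENE_A": 1.0, "GENE_B": 2.0, "GENE_C": 3.0},
        "c2": {"GENE_A": 1.2, "GENE_B": 2.1, "GENE_C": 2.9},
        "c3": {"GENE_A": 0.8, "GENE_B": 1.9, "GENE_C": 3.1},
    }
    patients = {"p1": {"GENE_A": 2.0, "GENE_B": 2.0, "GENE_C": 2.0}}
    kg = [("GENE_A", "interacts_with", "GENE_C")]

    for triple in merge_into_graph(kg, patient_triples(patients, controls)):
        print(triple)
```

Ranking by absolute z-score is only one plausible way to select characteristic features; the citing text notes that CLEP itself offers three different methods for incorporating patients into the graph.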


“…Incorporating edge types during the learning process can help the model to differentiate between node types (e.g., patients and biological entities) and node sub-types (e.g., diseased patients, and controls). CLEP has adopted PyKEEN [19] as the KGEM-software due to its wide range of functionalities (e.g., a large number of KGEMs, hyperparameter optimization functionalities).…”

Section: Generating New Patient Representations (mentioning)

confidence: 99%

“…Each workflow is both accessible through a command line interface (CLI) as well as programmatically, allowing users to input their own patient-level datasets and custom KGs. In total, CLEP offers three different methods for incorporating patients into the KG, all KGEMs available through PyKEEN [19], and five ML classifiers. Furthermore, thanks to its flexible implementation, users can independently use each of its modules as well as incorporate classifiers tasks into the framework (Supplementary Figure 2).…”

Section: Software Implementation (mentioning)

confidence: 99%

CLEP: A Hybrid Data- and Knowledge-Driven Framework for Generating Patient Representations

Bharadhwaj, Ali, Birkenbihl, et al. 2020

Preprint

Self Cite


As machine learning and artificial intelligence become more useful in the interpretation of biomedical data, their utility depends on the data used to train them. Due to the complexity and high dimensionality of biomedical data, there is a need for approaches that combine prior knowledge around known biological interactions with patient data. Here, we present CLEP, a novel approach that generates new patient representations by leveraging both prior knowledge and patient-level data. First, given a patient-level dataset and a knowledge graph containing relations across features that can be mapped to the dataset, CLEP incorporates patients into the knowledge graph as new nodes connected to their most characteristic features. Next, CLEP employs knowledge graph embedding models to generate new patient representations that can ultimately be used for a variety of downstream tasks, ranging from clustering to classification. We demonstrate how using new patient representations generated by CLEP significantly improves performance in classifying between cognitively impaired patients and healthy controls for a variety of machine learning models, as compared to the use of the original transcriptomics data. Furthermore, we also show how incorporating patients into a knowledge graph can foster the interpretation and identification of biological features characteristic of a specific disease or patient subgroup. Finally, we released CLEP as an open source Python package together with examples and documentation.


“…Biological knowledge formalized as a network can be used by clinicians as research and information retrieval tools, by biologists to propose in vitro and in vivo experiments, and by bioinformaticians to analyze high throughput -omics experiments (Catlett et al, 2013; Ali et al, 2019). Further, they can be readily semantically integrated with databases and other systems biology resources to improve their ability to accomplish each of these tasks (Hoyt et al, 2018).…”

Section: Introduction (mentioning)

confidence: 99%

A Computational Approach for Mapping Heme Biology in the Context of Hemolytic Disorders

Humayun, Domingo-Fernández, George, et al. 2020

Front. Bioeng. Biotechnol.

Self Cite


Heme is an iron ion-containing molecule found within hemoproteins such as hemoglobin and cytochromes that participates in diverse biological processes. Although excessive heme has been implicated in several diseases including malaria, sepsis, ischemia-reperfusion, and disseminated intravascular coagulation, little is known about its regulatory and signaling functions. Furthermore, the limited understanding of heme's role in regulatory and signaling functions is in part due to the lack of curated pathway resources for heme cell biology. Here, we present two resources aimed to exploit this unexplored information to model heme biology. The first resource is a terminology covering heme-specific terms not yet included in standard controlled vocabularies. Using this terminology, we curated and modeled the second resource, a mechanistic knowledge graph representing the heme's interactome based on a corpus of 46 scientific articles. Finally, we demonstrated the utility of these resources by investigating the role of heme in the Toll-like receptor signaling pathway. Our analysis proposed a series of crosstalk events that could explain the role of heme in activating the TLR4 signaling pathway. In summary, the presented work opens the door to the scientific community for exploring the published knowledge on heme biology.
