2023
DOI: 10.1038/s41597-023-01960-3

Building a knowledge graph to enable precision medicine

Abstract: Developing personalized diagnostic strategies and targeted treatments requires a deep understanding of disease biology and the ability to dissect the relationship between molecular and genetic factors and their phenotypic consequences. However, such knowledge is fragmented across publications, non-standardized repositories, and evolving ontologies describing various scales of biological organization between genotypes and clinical phenotypes. Here, we present PrimeKG, a multimodal knowledge graph for precision …

Cited by 152 publications (104 citation statements)
References 109 publications

“…The embedding vectors are first reduced to two dimensions via PCA and then visualized using t-SNE ( Van der Maaten and Hinton, 2008 ). We visualize 46 cancer drugs ( https://www.cancer.gov/about-cancer/treatment/drugs ) and 26 psychotropic drugs ( https://www.healthpartners.com/ucm/groups/public/@hp/@public/documents/documents/entry_194823.pdf ), which are commonly used in clinical settings, and their related top-30 indications reported on PrimeKG dataset ( Chandak et al , 2022 ) in Figure 2a and their top-30 reported side effects on the SIDER 4.1 dataset ( Kuhn et al , 2016 ) in Figure 2b . Meanwhile, we also visualize all EHR concepts and Drugbank molecules in Supplementary Figure S3a , showing that they are well-aligned in the embedding space.…”
Section: Methods (mentioning)
confidence: 99%
“…We experiment on PrimeKG ( Chandak et al , 2022 ), which includes annotated drug-indication relations and SIDER 4.1 ( Kuhn et al , 2016 ), which consists of reported side effects for joint training. There are 349 indications included that are observed by at least three drugs in PrimeKG.…”
Section: Methods (mentioning)
confidence: 99%
“…-PrimeKG [36]: A knowledge graph dataset integrating 20 high-quality datasets, biorepositories, and ontologies.…”
Section: Preprocessing (mentioning)
confidence: 99%
“…Feature Representation Unsupervised learning shows brilliant results in the field of Natural Language Processing and bioinformatics, many sequence-based biological tasks benefit from it. Inspired by the progress of unsupervised frameworks, we take advantage of three models with large-scale information: ESM-1b [34], KPGT [33], and PrimeKG [36] to obtain the initial embedding of our protein, drug, and disease respectively. Fig.…”
Section: Preprocessing (mentioning)
confidence: 99%