2022
DOI: 10.1038/s41467-022-33026-0

Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque

Abstract: Biomedical data is accumulating at a fast pace and integrating it into a unified framework is a major challenge, so that multiple views of a given biological event can be considered simultaneously. Here we present the Bioteque, a resource of unprecedented size and scope that contains pre-calculated biomedical descriptors derived from a gigantic knowledge graph, displaying more than 450 thousand biological entities and 30 million relationships between them. The Bioteque integrates, harmonizes, and formats data …

Cited by 39 publications (30 citation statements)
References: 84 publications
“…Thus, RD diagnosis can become a “needle in a haystack” task where the solution is still far connected to current knowledge. In consequence, to augment their efficiency in the difficult cases, the gene-disease prediction methods are encouraged to use new and various types of functional annotations and combine them accurately (15, 16, 61, 62). We first confirmed that different types of gene-gene functional association networks had different capabilities in recovering known genes associated to a large collection of RDs.…”
Section: Discussion (citation type: mentioning)
confidence: 99%
“…If unavailable, edge weights are all set up to 1 (Table S1). Thus, for networks describing human gene phenotype similarity using HPO (32) and phenotype similarity using mouse orthologs from the Mouse Genome Informatics (MGI) (16) we calculated Jaccard similarity for each pair of genes sharing at least one HPO term and constructed a null distribution of Jaccard values to compute z-scores. Significant gene interactions (z-score>1.96; p-values<0.05) were selected to generate the network of phenotypes.…”
Section: Compilation Of Gene-gene Functional Associations From Public... (citation type: mentioning)
confidence: 99%
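The procedure quoted in this statement (Jaccard similarity for gene pairs sharing at least one term, then z-scores against a null distribution, then a cutoff of 1.96) can be sketched roughly as follows. This is a minimal illustration, not the citing authors' implementation: the statement does not say how the null distribution was built, so the sketch simply takes it as an input list. The gene names, term labels, and null values are hypothetical, and the function names are invented here.

```python
from itertools import combinations
from statistics import mean, pstdev


def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets of terms."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pair_similarities(annotations):
    """Jaccard value for every gene pair that shares at least one term."""
    sims = {}
    for g1, g2 in combinations(sorted(annotations), 2):
        if annotations[g1] & annotations[g2]:
            sims[(g1, g2)] = jaccard(annotations[g1], annotations[g2])
    return sims


def significant_pairs(sims, null_values, threshold=1.96):
    """Keep pairs whose z-score against the null exceeds the threshold.

    ASSUMPTION: the null distribution is supplied by the caller; the
    quoted text does not describe how it was constructed.
    """
    mu, sigma = mean(null_values), pstdev(null_values)
    if sigma == 0:
        return {}
    z_scores = {pair: (s - mu) / sigma for pair, s in sims.items()}
    return {pair: z for pair, z in z_scores.items() if z > threshold}


# Hypothetical toy annotations (placeholder gene names and term labels).
annotations = {
    "GENE_A": {"term1", "term2", "term3"},
    "GENE_B": {"term1", "term2", "term3"},
    "GENE_C": {"term3", "term4"},
    "GENE_D": {"term5"},
}
# Hypothetical null Jaccard values, for illustration only.
null_values = [0.1, 0.2, 0.3, 0.2, 0.1, 0.3]

sims = pair_similarities(annotations)
significant = significant_pairs(sims, null_values)
print(sorted(significant))
```

In this toy input, GENE_D shares no term with any other gene, so it never enters the comparison, which mirrors the "sharing at least one HPO term" condition in the quoted text.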
“…The challenges are exacerbated when input data are collected from multiple sources in the public domain. To ease this task, large biological knowledge graphs have been suggested as an intuitive framework for data integration, followed by graph embedding techniques capable of expressing the context of each entity (e.g., a compound or a gene) in a dense vectorial format. In brief, vector “embeddings” capture the interactions and statistical correlations found in the knowledge graph, such that highly connected nodes will have similar vector embeddings.…”
Section: Molecular Glue Degrader Discovery Via Advanced Phenotypic Sc... (citation type: mentioning)
confidence: 99%
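As a rough illustration of what "similar vector embeddings" can mean in practice, the sketch below compares toy embedding vectors with cosine similarity. Cosine similarity is an assumption chosen for this example; the quoted statement does not name a specific measure. The node names and three-dimensional vectors are hypothetical and are not taken from the Bioteque.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical embeddings: node_a and node_b stand for two nodes with
# similar graph context, node_c for a node with a different context.
embeddings = {
    "node_a": [0.9, 0.1, 0.0],
    "node_b": [0.8, 0.2, 0.1],
    "node_c": [0.0, 0.1, 0.9],
}

sim_ab = cosine_similarity(embeddings["node_a"], embeddings["node_b"])
sim_ac = cosine_similarity(embeddings["node_a"], embeddings["node_c"])
print(f"a~b: {sim_ab:.3f}, a~c: {sim_ac:.3f}")
```

With these toy vectors, the first pair scores much higher than the second, which is the behaviour the statement describes for nodes with shared context in the knowledge graph.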