Structuring and extracting knowledge for the support of hypothesis generation in molecular biology

Roos, Marco; Marshall, M. Scott; Gibson, Andrew; Schuemie, Martijn J.; Meij, Edgar; Katrenko, Sophia; Hage, Willem Robert van; Krommydas, Konstantinos F.; Adriaans, Pieter

doi:10.1186/1471-2105-10-s10-s9

Cited by 16 publications

(13 citation statements)

References 29 publications

“…For this reason, participating resources are being offered the opportunity to make use of linked data and semantic web approaches: computational standards and methods that enable them to make their data more accessible and interoperable. 28 …”

Section: Generation Of a Comprehensive Searchable Online Catalogue mentioning

confidence: 99%

RD-Connect: An Integrated Platform Connecting Databases, Registries, Biobanks and Clinical Bioinformatics for Rare Disease Research

et al. 2014

“…Hypothesis generation is defined as "the pre-decisional process by which it is possible to formulate explanations and beliefs regarding the occurrences observed in a specific environment" [20]. Systems presented in the literature can be classified according to different dimensions: (i) manual or automatic, (ii) domain-specific or domain-independent and (iii) ontology- or Linked Data-driven.…”

Section: Foundations and Related Work mentioning

confidence: 99%

Dedalo: Looking for Clusters Explanations in a Labyrinth of Linked Data

Tiddi, d’Aquin, Motta 2014

Lecture Notes in Computer Science

Abstract. We present Dedalo, a framework which is able to exploit Linked Data to generate explanations for clusters. In general, any result of a Knowledge Discovery process, including clusters, is interpreted by human experts who use their background knowledge to explain them. However, for someone without such expert knowledge, those results may be difficult to understand. Obtaining a complete and satisfactory explanation becomes a laborious and time-consuming process, involving expertise in possibly different domains. Having said so, not only does the Web of Data contain vast amounts of such background knowledge, but it also natively connects those domains. While the efforts put in the interpretation process can be reduced with the support of Linked Data, how to automatically access the right piece of knowledge in such a big space remains an issue. Dedalo is a framework that dynamically traverses Linked Data to find commonalities that form explanations for items of a cluster. We have developed different strategies (or heuristics) to guide this traversal, reducing the time to get the best explanation. In our experiments, we compare those strategies and demonstrate that Dedalo finds relevant and sophisticated Linked Data explanations from different areas.

“…Reaching beyond biomedical data integration including the scientific literature, recent visionary developments propose to expose results and findings early on as factual statements in a fixed format (“nanopublications”, “proto-ontologies”, “microparadigms”) and where any data set should have the potential to be referenced and reused electronically from any world-wide access point (digital object identifiers, DOIs for data) [11]-[14]. The representation of the data either follows data formats or requires meta-data for the correct annotation of its origins and experimental settings, but then contributes to the generation and evaluation of hypotheses [15], [16].…”

Section: Introduction mentioning

confidence: 99%

Evaluation and Cross-Comparison of Lexical Entities of Biological Interest (LexEBI)

et al. 2013

Motivation: Biomedical entities, their identifiers and names, are essential in the representation of biomedical facts and knowledge. In the same way, the complete set of biomedical and chemical terms, i.e. the biomedical “term space” (the “Lexeome”), forms a key resource to achieve the full integration of the scientific literature with biomedical data resources: any identified named entity can immediately be normalized to the correct database entry. This goal does not only require that we are aware of all existing terms, but would also profit from knowing all their senses and their semantic interpretation (ambiguities, nestedness).

Result: This study compiles a resource for lexical terms of biomedical interest in a standard format (called “LexEBI”), determines the overall number of terms, their reuse in different resources and the nestedness of terms. LexEBI comprises references for protein and gene entries and their term variants and chemical entities amongst other terms. In addition, disease terms have been identified from Medline and PubmedCentral and added to LexEBI. Our analysis demonstrates that the baseforms of terms from the different semantic types show only little polysemous use. Nonetheless, the term variants of protein and gene names (PGNs) frequently contain species mentions, which should have been avoided according to protein annotation guidelines. Furthermore, the protein and gene entities as well as the chemical entities, both do comprise enzymes leading to hierarchical polysemy, and a large portion of PGNs make reference to a chemical entity. Altogether, according to our analysis based on the Medline distribution, 401,869 unique PGNs in the documents contain a reference to 25,022 chemical entities, 3,125 disease terms or 1,576 species mentions.

Conclusion: LexEBI delivers the complete biomedical and chemical Lexeome in a standardized representation (http://www.ebi.ac.uk/Rebholz-srv/LexEBI/). The resource provides the disease terms as open source content, and fully interlinks terms across resources.
