2018 IEEE International Conference on Big Data (Big Data) 2018
DOI: 10.1109/bigdata.2018.8622637

Large-Scale Validation of Hypothesis Generation Systems via Candidate Ranking

Abstract: The first step of many research projects is to define and rank a short list of candidates for study. In the modern rapidity of scientific progress, some turn to automated hypothesis generation (HG) systems to aid this process. These systems can identify implicit or overlooked connections within a large scientific corpus, and while their importance grows alongside the pace of science, they lack thorough validation. Without any standard numerical evaluation method, many validate general-purpose HG systems by redi…

Cited by 19 publications (39 citation statements)
References 44 publications
“…More recently, we proposed a number of metrics to evaluate a-c relevance in [16]. This work describes how a number of embedding-based relationships, further summarized in the following section, quantify the fruitfulness of an individual query.…”
Section: A Moliere Pipeline Background
mentioning
confidence: 99%
“…Due to this conceptual limitation, many projects validate their system by simply rediscovering a handful of "gold-standard" connections [31], [32], [33], [34]. Some few projects show their utility beyond the gold-standard by incorporating expert analysis and experiments [18], [35], [16]. While these results are important to show real-world application areas for hypothesis generation, lab work is time consuming, expensive, and clearly does not scale for large validation sets.…”
Section: B Metrics For Hypothesis Ranking
mentioning
confidence: 99%
“…In the current analysis we queried the list of all human genes downloaded from HUGO (30) paired with HIV associated neurocognitive disorder. The generated hypotheses were ranked based on a number of techniques described in (24). The hypotheses ranking represent the level of association each gene has with HAND.…”
Section: Results
mentioning
confidence: 99%
“…In addition to our own ranking methods, the algorithmic pipeline relies on several recently developed scalable machine learning methods that have not been adopted by other knowledge discovery systems such as low-dimensional representation manifold learning and scalable probabilistic topic modeling (23,24). In the present article, we performed MOLIERE analysis for possible links of human proteins with HAND in the biomedical literature.…”
mentioning
confidence: 99%