2007
DOI: 10.1093/bioinformatics/btm161

A quantitative model for linking two disparate sets of articles in MEDLINE

Abstract: Background: Identifying information that implicitly links two disparate sets of articles is a fundamental and intuitive data mining strategy that can help investigators address real scientific questions. The Arrowsmith two-node search finds title words and phrases (so-called B-terms) that are shared across two sets of articles within MEDLINE and displays them in a manner that facilitates human assessment. A serious stumbling-block has been the lack of a quantitative model for predicting which of the hundreds i…

Cited by 55 publications (66 citation statements)
References 31 publications
“…In the Arrowsmith two-node search tool [19,20], the user seeks to assess a possible relationship between literatures A and C; the computer interface presents a list of terms (the "B-list") in common between the literatures to serve as a conceptual bridge. However, not all B-terms are likely to be of equal value in discovering significant implicit links.…”
Section: Discussion (mentioning)
confidence: 99%
“…However, not all B-terms are likely to be of equal value in discovering significant implicit links. Characteristic terms expressed in each literature are computed as a feature in the quantitative model that allows us to rank the B-terms in order of predicted relevance to linking the two literatures in a meaningful way [19]. Moreover, B-terms that are not characteristic in either literature A or C are unlikely to indicate important concepts in either literature, whereas B-terms that are characteristic in both A and C may represent concepts that are already well known.…”
Section: Discussion (mentioning)
confidence: 99%
“…In both cases, the MeSH terms are searched without expansion to retrieve related terms. Arrowsmith software computes the words and phrases that are in common to the titles of articles in the two queries (i.e., the B-terms), and uses a quantitative model (Torvik and Smalheiser, 2007) to estimate the predicted relevance of each B-term for linking the two queries in a meaningful way. The percentage of B-terms that are predicted to be relevant is the pR score.…”
Section: Methods (mentioning)
confidence: 99%
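The B-term and pR score definitions quoted above can be illustrated with a minimal Python sketch. It is not the Arrowsmith implementation. It assumes single-word terms only (no phrases), a small hypothetical stopword list, and made-up relevance probabilities standing in for the output of the quantitative model. The 0.5 cutoff for "predicted to be relevant" is also an assumption.

```python
import re

# Hypothetical stopword list, for illustration only.
STOPWORDS = {"a", "an", "and", "in", "of", "the", "for", "on", "with", "to"}


def title_terms(titles):
    """Collect lowercase single-word terms from a list of article titles."""
    terms = set()
    for title in titles:
        for word in re.findall(r"[a-z0-9]+", title.lower()):
            if word not in STOPWORDS:
                terms.add(word)
    return terms


def b_terms(titles_a, titles_c):
    """Terms shared by the titles of literature A and literature C."""
    return title_terms(titles_a) & title_terms(titles_c)


def pr_score(relevance_by_term, threshold=0.5):
    """Percentage of B-terms whose predicted relevance meets the threshold.

    relevance_by_term maps each B-term to a probability that would come from
    a relevance model; the threshold value is an assumption.
    """
    if not relevance_by_term:
        return 0.0
    relevant = sum(1 for p in relevance_by_term.values() if p >= threshold)
    return 100.0 * relevant / len(relevance_by_term)


# Toy titles, invented for this example.
titles_a = ["Magnesium deficiency and vascular tone", "Dietary magnesium in stress"]
titles_c = ["Migraine and vascular reactivity", "Stress as a migraine trigger"]
shared = b_terms(titles_a, titles_c)

# Made-up relevance probabilities for the shared terms.
example_relevance = {"vascular": 0.8, "stress": 0.3}
example_pr = pr_score(example_relevance)
```

With these toy inputs, `shared` holds the two terms that appear in both title sets, and `example_pr` is the share of them at or above the assumed cutoff.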
“…In previous investigations, we have studied the potential benefit of identifying disparate areas of scientific investigation which are disconnected – that is, they reside in two different sets of articles that do not share authors, and are poorly cross-cited or co-cited – yet they contain information that, when connected, leads to promising and testable new hypotheses (Swanson and Smalheiser, 1997; Torvik and Smalheiser, 2007; Smalheiser et al., 2009). The presumption is that connections between disparate areas of investigation are likely to be overlooked (due to lack of reading widely enough by scientists), neglected (e.g.…”
Section: Introduction (mentioning)
confidence: 99%
“…The system is still being used in the medical research community and has been extended further to improve search results. For example, Torvik and Smalheiser (2007) describe the development of a quantitative regression model that uses additional criteria to establish and rank the probable relevance of any connections found by the tool.…”
Section: Computational Models and Tools (mentioning)
confidence: 99%