2020
DOI: 10.1101/2020.10.16.343012
Preprint

SKiM - A generalized literature-based discovery system for uncovering novel biomedical knowledge from PubMed

Abstract: Literature-based discovery (LBD) uncovers undiscovered public knowledge by linking terms A to C via a B intermediate. Existing LBD systems are limited to processing certain A, B, and C terms, and many are not maintained. We present SKiM (Serial KinderMiner), a generalized LBD system for processing any combination of A, Bs, and Cs. We evaluate SKiM via the rediscovery of discoveries by Don Swanson, who pioneered LBD. Using only literature from the 19th century up to a year before Swanson's discoveries, SKiM uncove…
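The A-to-C-via-B linking described in the abstract can be illustrated with a toy sketch. This is not SKiM's implementation, which the abstract does not detail; it assumes plain substring co-occurrence within a document as the link criterion. The function name `abc_candidates` and the three example sentences are hypothetical, chosen only to echo Swanson's well-known Raynaud disease and fish oil connection.

```python
from collections import defaultdict


def abc_candidates(docs, a_term, b_terms, c_terms):
    """Toy ABC linking: return {c: [b, ...]} for C terms that never
    co-occur with A but share at least one bridging B term.

    Assumption: a "link" is simple substring co-occurrence in one document.
    """
    # Map each term to the set of document indices that mention it.
    index = defaultdict(set)
    for i, text in enumerate(docs):
        lowered = text.lower()
        for term in [a_term, *b_terms, *c_terms]:
            if term.lower() in lowered:
                index[term].add(i)

    links = {}
    for c in c_terms:
        # A and C already appear together: the link is known, not novel.
        if index[a_term] & index[c]:
            continue
        # Keep B terms that co-occur with A and, separately, with C.
        bridges = [
            b for b in b_terms
            if index[a_term] & index[b] and index[b] & index[c]
        ]
        if bridges:
            links[c] = bridges
    return links


# Hypothetical mini-corpus (invented sentences, for illustration only).
docs = [
    "Raynaud disease patients show high blood viscosity.",
    "Dietary fish oil reduces blood viscosity.",
    "Fish oil is rich in omega-3 fatty acids.",
]
result = abc_candidates(
    docs,
    a_term="Raynaud disease",
    b_terms=["blood viscosity", "platelet aggregation"],
    c_terms=["fish oil", "magnesium"],
)
print(result)
```

A real system would replace the substring test with proper term matching and a statistical significance filter over a full literature index; the sketch only shows the shape of the A, B, C traversal.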

Citations: cited by 4 publications (4 citation statements)
References: 46 publications
“…In order to identify duplication events potentially linked to body size and longevity, we selected 283 genes whose predicted copy numbers differ at least 2-fold between blue whale and vaquita (supplementary table S5, Supplementary Material online). To prioritize variants, we identified 8,649 candidate genes linked to body size, development, longevity, and susceptibility to cancer (supplementary table S6, Supplementary Material online) from published studies in whales (Tollis et al 2019; Lagunas-Rangel 2021), dogs (Ostrander et al 2017), cattle (Bouwman et al 2018), and sheep (Kominakis et al 2017), and automated literature mining engines (Kuusisto et al 2020; Raja et al 2020). Intersecting these 2 lists identified 133 genes of potential interest (supplementary table S7, Supplementary Material online).…”
Section: Results
Citation type: mentioning
confidence: 99%
“…We identified genes linked to body size, development, longevity, and susceptibility to cancer by manual review of published studies in whales (Tollis et al 2019; Lagunas-Rangel 2021), dogs (Ostrander et al 2017), cattle (Bouwman et al 2018), and sheep (Kominakis et al 2017). We also used KinderMiner (Kuusisto et al 2020) and SKiM (Raja et al 2020) literature mining systems to identify genes linked to relevant search terms, including “developmental clock”, “body size”, “dwarfism”, “gigantism”, “longevity”, “cancer”, “growth”, and “overgrowth”. We also searched a curated gene-disease database for genes linked to dwarfism.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Because our group uses Nile rat to study type 2 diabetes, we developed a list of genes broadly relevant for this disease. This list was compiled from gene-disease databases [35,36], GWAS catalog from EMBL-EBI [37], and two different text-mining methods [38,39], resulting in a total of 4396 genes (Additional file 1: Fig. S1) [40][41][42][43][44][45][46][47].…”
Section: Compilation Of Type 2 Diabetes Associated Genes
Citation type: mentioning
confidence: 99%
“…Because the Nile rat is an important animal model for type 2 diabetes, we developed a list of genes potentially connected to type 2 diabetes. This list was compiled from gene-disease databases (Davis et al 2021; Thorn, Klein, and Altman 2013), GWAS catalog from EMBL-EBI (Buniello et al 2019), and two different text-mining methods (Kuusisto et al 2020; Raja et al 2020), resulting in a total of 4,396 genes (Supplementary figure 1). Of these, 3,295 had orthologs identified in the Nile rat assembly annotation by NCBI Orthologs, a part of the NCBI Gene database.…”
Section: Compilation Of Type 2 Diabetes Associated Genes
Citation type: mentioning
confidence: 99%