2014
DOI: 10.1093/database/bau003

Literature mining of genetic variants for curation: quantifying the importance of supplementary material

Abstract: A major focus of modern biological research is the understanding of how genomic variation relates to disease. Although there are significant ongoing efforts to capture this understanding in curated resources, much of the information remains locked in unstructured sources, in particular, the scientific literature. Thus, there have been several text mining systems developed to target extraction of mutations and other genetic variation from the literature. We have performed the first study of the use of text mini…

Cited by 32 publications
(34 citation statements)
References 32 publications
“…Natural language processing (NLP) approaches attempt to extract this knowledge in the form of structured concepts and relationships such that it can be used for a variety of computational tasks. Just a few of many examples include identifying functional genetic variants [1], identifying biomarkers and phenotypes related to disease [2], and drug repositioning [3]. …”
Section: Introduction (citation type: mentioning)
confidence: 99%
“…COSMIC focuses on somatic mutations while InSiGHT collects germline mutations related to Lynch Syndrome for just four genes. In addition, some of the extracted mentions are not functional or significant for the disease, as previously described [11]. For instance, in PMID:10469011, the mutation Ala140Thr is extracted but the article states this mutation … is known to be functionally silent, and hence was excluded from the database.…”
Section: Discussion (citation type: mentioning)
confidence: 95%
“…This approach is clearly tied to the availability of abstract-level details about an SNP existing in PubMed; this data is very sparse, in part due to the fact that specific details about genetic variants are only rarely available in published abstracts (Jimeno Yepes & Verspoor, 2014a). Unfortunately, the access to the full content of the article is limited, with only around 600k articles available from PubMed Central (a small proportion of the over 23M citations available from PubMed).…”
Section: Methods (citation type: mentioning)
confidence: 99%