2013
DOI: 10.1371/journal.pone.0072965

Learning to Recognize Phenotype Candidates in the Auto-Immune Literature Using SVM Re-Ranking

Abstract: The identification of phenotype descriptions in the scientific literature, case reports and patient records is a rewarding task for bio-medical text mining. Any progress will support knowledge discovery and linkage to other resources. However, because of their wide variation, a number of challenges still remain in terms of their identification and semantic normalisation before they can be fully exploited for research purposes. This paper presents novel techniques for identifying potential complex phenotype mentio…

Cited by 10 publications (5 citation statements)
References 46 publications
“…The principal modules are now briefly discussed. Data sampling: As described in Data Sampling section; Data cleansing: split and tokenize the sentences using the GENIA tagger (15) trained on the GENIA Medline abstract corpus; Parsing: phrase structure parsing takes place using the BLLIP/Charniak-Johnson parser (available from https://github.com/BLLIP/bllip-parser); (16) trained on the GENIA corpus as labelled data and PubMed; Named entity recognition: biomedical entities were tagged using the PM NER tagger (17) and the GENIA tagger. This allows us to include semantic labels about anatomical entities, disorders, genes, proteins and other entities that might not be matched in the external vocabularies of the NCBO Annotator.…”
Section: System and Methods (mentioning)
confidence: 99%
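The module sequence quoted above (sentence splitting and tokenization, followed by entity tagging) can be pictured as a chain of stages. The sketch below is only an illustration under stated assumptions: the function names, the regex-based splitter and tokenizer, and the toy lexicon are all hypothetical stand-ins, not the GENIA tagger, the BLLIP parser or the PM NER tagger. The phrase structure parsing stage is left out because it needs a trained parser.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Sentence:
    text: str
    tokens: List[str] = field(default_factory=list)
    entities: List[Tuple[str, str]] = field(default_factory=list)


# Hypothetical toy lexicon standing in for a trained NER model.
TOY_LEXICON = {
    "joint": "ANATOMY",
    "arthritis": "DISORDER",
    "tnf": "GENE_OR_PROTEIN",
}


def split_sentences(document: str) -> List[Sentence]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", document.strip())
    return [Sentence(text=p) for p in parts if p]


def tokenize(sentence: Sentence) -> Sentence:
    """Naive tokenizer: runs of word characters, or single punctuation marks."""
    sentence.tokens = re.findall(r"\w+|[^\w\s]", sentence.text)
    return sentence


def tag_entities(sentence: Sentence) -> Sentence:
    """Label each token found in the toy lexicon (case-insensitive lookup)."""
    sentence.entities = [
        (tok, TOY_LEXICON[tok.lower()])
        for tok in sentence.tokens
        if tok.lower() in TOY_LEXICON
    ]
    return sentence


def run_pipeline(document: str) -> List[Sentence]:
    """Apply the stages in order: split, tokenize, tag."""
    return [tag_entities(tokenize(s)) for s in split_sentences(document)]
```

Calling `run_pipeline("TNF drives joint swelling. Arthritis follows.")` returns two `Sentence` objects whose `entities` lists hold the tokens matched in the toy lexicon. Keeping each stage as a separate function mirrors the modular layout in the quoted passage, so a real tagger could in principle replace a stand-in without touching the other stages.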
“…Named entity recognition: biomedical entities were tagged using the PM NER tagger (17) and the GENIA tagger. This allows us to include semantic labels about anatomical entities, disorders, genes, proteins and other entities that might not be matched in the external vocabularies of the NCBO Annotator.…”
Section: System and Methods (mentioning)
confidence: 99%
“…Using the scientific literature as a source, several groups have been active in developing approaches explicitly for phenotypes. These include the Bio-LarK system, which has been applied to skeletal dysplasia [39] and the PhenoMiner system, which has been applied to the cardiovascular and autoimmune systems [60]. Work by Khordad et al.…”
Section: State-of-the-art Phenome Research (mentioning)
confidence: 99%
“…This method is widely used in life engineering [3][4][5], the Internet [6], civil engineering [7], geo-science [8,9], mechanical engineering [10][11][12] and medical science [13]. SVM is well-suited for addressing pattern recognition problems with small samples, nonlinearity, and high dimension, among others.…”
Section: Introduction (mentioning)
confidence: 99%