2013
DOI: 10.1093/bioinformatics/btt474

DNorm: disease name normalization with pairwise learning to rank

Abstract: Motivation: Despite the central role of diseases in biomedical research, there have been much fewer attempts to automatically determine which diseases are mentioned in a text—the task of disease name normalization (DNorm)—compared with other normalization tasks in biomedical text mining research. Methods: In this article we introduce the first machine learning approach for DNorm, using the NCBI disease corpus and the MEDIC vocabulary, which combines MeSH® and OMIM. Our method is a high-performing and mathematic…

Cited by 436 publications (411 citation statements)
References 31 publications
“…We have shown that our approach can be easily applied to different domains by merely exchanging the underlying ontology and training data. On the task of recognizing and linking disease names, we show that our approach outperforms the state-of-the-art systems DNorm [12] and TaggerOne [11], as well as two lexicon-based baselines. On the task of recognizing and linking chemical names, our system achieves comparable performance to the state-of-the-art.…”
Section: Results; citation type: mentioning
confidence: 89%
“…We apply the same model to both problems, only exchanging the underlying reference knowledge base. With an F1 score of 85.9 in disease linking, we outperform the state-of-the-art systems DNorm [12] and TaggerOne [11]; in chemical compounds linking, our system achieves an F1 score of 86.6, which is comparable to the state-of-the-art. Thus, J-Link provides high performance on both domains without major need of manual adaptation or system tuning.…”
Section: Introduction; citation type: mentioning
confidence: 79%
“…For example, in the results section of one article (PMCID: PMC3910500), it described the genomic landscape of glioblastoma using the whole-exome (WES), whole-genome sequencing (WGS), and RNA-Sequencing (RNA) (Brennan et al, 2013). To identify the TCGA cancer type and high-throughput platform concept from the free texts, we developed a named entity recognition method that is based on a biomedical text mining tool (Leaman, Islamaj, & Lu, 2013).…”
Section: TCGA Data Usage Analysis; citation type: mentioning
confidence: 99%