Proceedings of the ACL 2003 Workshop on Natural Language Processing in Biomedicine - 2003
DOI: 10.3115/1118958.1118963

Two-phase biomedical NE recognition based on SVMs

Abstract: When using SVMs for named entity recognition, we are often confronted with the multi-class problem. The larger the number of classes, the more severe the multi-class problem becomes. In particular, the one-vs-rest method tends to degrade performance by producing a severely unbalanced class distribution. In this study, to tackle the problem, we adopt a two-phase named entity recognition method based on SVMs and a dictionary; in the first phase, we try to identify each entity with an SVM classifier and post-process the identified entities…
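The imbalance that the abstract attributes to one-vs-rest can be illustrated with a small sketch. The label names and counts below are invented for illustration and do not come from the paper; the function name is likewise hypothetical. The sketch only computes, for each class, the share of examples that would be positive in that class's one-vs-rest binary problem.

```python
from collections import Counter


def one_vs_rest_positive_ratios(labels):
    """Return, per class, the fraction of examples that are positive
    when that class is trained against all remaining classes."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


# Hypothetical token labels (assumption): most tokens lie outside any entity ("O").
labels = ["O"] * 80 + ["PROTEIN"] * 10 + ["DNA"] * 6 + ["CELL_TYPE"] * 4

ratios = one_vs_rest_positive_ratios(labels)
for cls, ratio in sorted(ratios.items()):
    # Each binary problem sees `ratio` positives against `1 - ratio` negatives.
    print(f"{cls}: {ratio:.2f} positive vs {1 - ratio:.2f} rest")
```

With these made-up counts, every entity class is a small minority in its own binary problem, and splitting the same data into more classes makes each positive set smaller still.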

Citations: cited by 74 publications (60 citation statements)
References: 8 publications

“…In-context features dictionary is similar to the work of Lee et al [6]. The most right 3 words from name in the training data are collected as candidates.…”
Section: Construction Of Dictionaries
Citation type: mentioning
confidence: 99%
“…Nowadays, Named Entity Recognition (NER) is proved to be fundamental in information extraction and understanding in biomedical domain. Based on the method, the NER system can be roughly split into three categorizes: rule-based methods [1][2], dictionary-based methods [3], and statistical-based methods [4][5][6][7], although there are also combination of dictionary-based and rule-based method [8].…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…General medical term was trained with UMLS meta-thesaurus [12] and the biological entity and its interaction was trained with GENIA [13] corpus. The underlying NLP approaches for named entity recognition are based on the system of Hwang et al [14] and Lee et al [15] with collaborations. More detailed descriptions of language processing are elucidated in [16].…”
Section: Interaction Extraction
Citation type: mentioning
confidence: 99%
“…); and in machine translation or cross-language information retrieval, special transliteration processes can be applied to entity names across languages with different alphabets, 4 provided that the names have been identified. Although ners that employ mostly hand-crafted rules 5,6 may perform very well, ners that use statistical and machine learning techniques, including Hidden Markov or Maximum Entropy Models, 7,8,9,10 decision tree learning and/or boosting, 11,12,13 and Support Vector Machines, 14,15 usually outperform them and they are easier to port to new text genres (e.g., biomedical, instead of news articles), where new name categories (e.g., protein names) may also need to be supported. However, supervised statistical and machine learning-based ners still require a tedious manual annotation phase, during which humans must tag occurrences of entity names in a training corpus.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Our two passes are also different from the approach whereby a first phase identifies all entity names and a second one categorizes them. 15 Furthermore, unlike the system of Shen et al, our ensemble acts as a single classifier with non-overlapping categories. In active learning, we select training examples for each pass by considering the distances from the hyperplanes of both svms of that pass, much as in Vlachos.…”
Citation type: mentioning
confidence: 99%