2004
DOI: 10.1016/j.jbi.2004.08.006

Enhancing performance of protein and gene name recognizers with filtering and integration strategies

Abstract: Named entity (NE) recognition is a fundamental task in biological relationship mining. This paper considers protein/gene collocates extracted from biological corpora as restrictions to enhance the precision rate of protein/gene name recognition. In addition, we integrate the results of multiple NE recognizers to improve the recall rates. Yapex and KeX, and ABGene and Idgene are taken as examples of protein and gene name recognizers, respectively. The precision of Yapex increases from 70.90% to 85.84% at the low…

Cited by 13 publications (6 citation statements)
References 21 publications
“…IE is a method that allows automatic recognition of meaningful words or phrases from unstructured text. A variety of IE methods have been applied to bioinformatics, either in dictionary-based [5] or rule-based approaches [6,7], in applications including detection of disease, protein and gene names [8][9][10][11]. IE methods have also been used for identifying relationships between different terms, for example, protein-protein interactions [12][13][14][15].…”
Section: Introduction (mentioning)
confidence: 99%
“…The Max-Ent method that uses a "dictionary tagger" achieved the best result among the others with F = 57.9% on the University of Texas, Austin dataset, which consists of 748 abstracts. Earlier studies [3,17,24-31] in the IE community have shown that statistical techniques can be of service in performing protein name extraction tasks. Hidden Markov Models (HMMs), one of the successful statistical learning techniques, have been applied to different IE tasks [3,24,27,30].…”
Section: Related Work (mentioning)
confidence: 99%
“…In recent years, a variety of IE methods have been applied to bioinformatics to detect protein and gene names and construct relationships like protein-protein interaction [3]-[6]. In Wen-Juan et al. [7], the goal is to find protein collocates extracted from a biological corpus to enhance the performance of protein name recognizers. In Barbara et al. [8], the task is to classify semantic relations in bioscience text.…”
Section: Introduction (mentioning)
confidence: 99%