2005
DOI: 10.1186/1471-2105-6-s1-s14

ProMiner: rule-based protein and gene entity recognition

Abstract: Background: Identification of gene and protein names in biomedical text is a challenging task as the corresponding nomenclature has evolved over time. This has led to multiple synonyms for individual genes and proteins, as well as names that may be ambiguous with other gene names or with general English words. The Gene List Task of the BioCreAtIvE challenge evaluation enables comparison of systems addressing the problem of protein and gene name identification on common benchmark data.

Cited by 273 publications (176 citation statements)
References 17 publications
“…A strong focus in biomedical information extraction has long been on named entity recognition, for which machine-learning solutions such as conditional random fields (Lafferty et al, 2001) or dictionary-based systems (Schuemie et al, 2007; Hanisch et al, 2005; Hakenberg et al, 2011) are available which tackle the respective problem with decent performance and for specific entity classes such as organisms (Pafilis et al, 2013) or symptoms (Savova et al, 2010; Jimeno et al, 2008). A detailed overview on named entity recognition, covering other domains as well, can be found in Nadeau and Sekine (2007).…”
Section: Related Work (mentioning)
confidence: 99%
“…MEDLINE (Medical Literature Analysis and Retrieval System Online) is a bibliographic database maintained by the National Center for Biotechnology Information and covers a large number of scientific publications from medicine, psychology, and the health system. For the clustering use case, we study MEDLINE abstracts and associated metadata that are processed by ProMiner, a named entity recognition system ([19]), and indexed by the semantic information retrieval platform SCAIView ([20]). SCAIView also offers an API that allows programmatic access to the data.…”
Section: Document Clustering On Medline (mentioning)
confidence: 99%
“…It extracted material names in the sentence with 94.70% precision and 98.84% recall. D Hanisch et al [20] constructed the ProMiner system by a rule-based approach and a pre-processed synonym dictionary, to identify potential name occurrences in the bio-medical text and associate protein and gene database identifiers with the detected matches. In blind predictions, the system achieved an F-measure of approximately 0.8 for the organisms mouse and fly and about 0.9 for the organism yeast.…”
Section: The Rule-based NER Approach (mentioning)
confidence: 99%
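
The last statement above describes matching text against a pre-processed synonym dictionary and attaching database identifiers to the matches. The following is a minimal sketch of that general idea only, not ProMiner's actual implementation: the dictionary entries, the identifiers, and the function names `normalize` and `find_mentions` are all hypothetical, and the normalization rule (lower-casing and dropping hyphens) is an assumed stand-in for whatever pre-processing the real system applies.

```python
import re

# Hypothetical synonym dictionary: normalized surface form -> database identifier.
# Entries and identifiers are illustrative placeholders, not ProMiner data.
SYNONYMS = {
    "cdc28": "GENE:0001",
    "cdk1": "GENE:0001",   # a second synonym mapping to the same identifier
    "swi6": "GENE:0002",
}


def normalize(token: str) -> str:
    """Lower-case and drop hyphens/whitespace so 'CDC-28' and 'cdc28' compare equal."""
    return re.sub(r"[\s\-]+", "", token.lower())


def find_mentions(text: str, synonyms: dict[str, str]) -> list[tuple[str, str]]:
    """Return (matched surface form, identifier) pairs in order of appearance."""
    mentions = []
    # Tokens are runs of letters/digits that may contain internal hyphens.
    for match in re.finditer(r"[A-Za-z0-9][A-Za-z0-9\-]*", text):
        surface = match.group(0)
        identifier = synonyms.get(normalize(surface))
        if identifier is not None:
            mentions.append((surface, identifier))
    return mentions


if __name__ == "__main__":
    print(find_mentions("CDC-28 interacts with Swi6 in yeast.", SYNONYMS))
```

A single-token lookup like this cannot handle multi-word names or names that are ambiguous with ordinary English words, which the abstract identifies as the hard part of the task; a real system would need additional rules for those cases.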