2005
DOI: 10.1093/bioinformatics/bti390

Literature mining and database annotation of protein phosphorylation using a rule-based system

Abstract: A rule-based system, RLIMS-P (Rule-based LIterature Mining System for Protein Phosphorylation), was used to extract protein phosphorylation information from MEDLINE abstracts. An annotation-tagged literature corpus developed at PIR was used to evaluate the system for finding phosphorylation papers and extracting phosphorylation objects (kinases, substrates and sites) from abstracts. RLIMS-P achieved a precision and recall of 91.4 and 96.4% for paper retrieval, and of 97.9 and 88.0% for extraction of substrates…

Cited by 104 publications (73 citation statements)
References 19 publications
“…In addition, in order to increase the accuracy of identifying protein names, name substitution is used for processing sentences with complex and long protein names by following existing studies. 22 In detail, we employed semantic tokenization to replace multi-word protein names with single token. An abbreviated notation PROi (i is the index of the corresponding protein name in the dictionary) is used to replace protein names in each sentence.…”
Section: Named Entity Recognition (NER) and Substitution (mentioning)
confidence: 99%
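
The name substitution described in this statement can be sketched roughly as follows. This is only an illustration under stated assumptions, not the citing authors' implementation: the function name `substitute_protein_names`, the list-based dictionary, and the case-insensitive whole-name matching are all hypothetical choices. Only the PROi token format (i being the index of the name in the dictionary) comes from the statement itself.

```python
import re

def substitute_protein_names(sentence, dictionary):
    """Replace each dictionary protein name with a single token PROi,
    where i is the index of that name in the dictionary (hypothetical helper)."""
    if not dictionary:
        return sentence
    # Try longer names first so a multi-word name wins over a shorter name it contains.
    order = sorted(range(len(dictionary)), key=lambda i: len(dictionary[i]), reverse=True)
    pattern = re.compile(
        "|".join(r"(?<!\w)" + re.escape(dictionary[i]) + r"(?!\w)" for i in order),
        flags=re.IGNORECASE,
    )
    # Map the lower-cased name back to its dictionary index.
    index_of = {name.lower(): i for i, name in enumerate(dictionary)}
    return pattern.sub(lambda m: f"PRO{index_of[m.group(0).lower()]}", sentence)

# Illustrative dictionary; the entries are examples, not data from the cited work.
names = ["protein kinase C", "myristoylated alanine-rich C kinase substrate"]
print(substitute_protein_names(
    "Protein kinase C phosphorylates myristoylated alanine-rich C kinase substrate.",
    names,
))
```

Collapsing each long name into one token keeps later sentence-level processing from having to handle multi-word names, which appears to be the motivation given in the statement.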
“…Other text-mining advances include biological relationship extraction such as gene-disease interactions, 18 named entity recognition (gene/protein name, 19 organisms, 20 and diseases, 21 etc.). Current web-based text mining techniques for PTM information extraction include RLIMS-P, 22 RLIMS-P 2.0, 23 eFIP 24 and MinePhos, 25 RLIMS-P 22 is the first tool for PTM information extraction. It is a rule-based system which utilizes shallow parsing technique and manually developed patterns to extract phosphorylation information (substrates, kinases and sites) from abstracts.…”
Section: Introduction (mentioning)
confidence: 99%
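
To make the idea of pattern-based extraction of kinases, substrates and sites concrete, here is a deliberately tiny sketch. It is not RLIMS-P's rule set and does no shallow parsing: the single regular expression, the function name `extract_phosphorylation`, and the restriction to one-token names and Ser/Thr/Tyr sites are all assumptions made for illustration.

```python
import re

# Hypothetical pattern (not taken from RLIMS-P):
#   "<kinase> phosphorylates <substrate> [on|at <site>]"
PATTERN = re.compile(
    r"(?P<kinase>[\w-]+) phosphorylates (?P<substrate>[\w-]+)"
    r"(?: (?:on|at) (?P<site>(?:Ser|Thr|Tyr)-?\d+))?"
)

def extract_phosphorylation(sentence):
    """Return a dict with kinase, substrate and site (site may be None),
    or None if the sentence does not match the pattern."""
    match = PATTERN.search(sentence)
    if match is None:
        return None
    return match.groupdict()

print(extract_phosphorylation("PKA phosphorylates CREB at Ser133."))
```

A real rule-based system would need many such patterns (passive voice, nominalizations such as "phosphorylation of X by Y", multi-word names), which is presumably why the statement speaks of manually developed patterns applied on top of shallow parsing.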
“…In particular, the NLP based approaches have been studied extensively and achieve great success in many information extraction tasks [7], [24], [25], [26], [27]. Some recent works in NLP include RLIMS-P [8], MDL-based method [28], and BioIE [29].…”
Section: Related Work (mentioning)
confidence: 99%
“…We believe that both these approaches have limitations, which restrict their ability to provide the kind of information biologists desire. The first approach misses a lot of useful information because of the simple representation of a document, whereas the second approach is likely to be only semi-automatic [8].…”
Section: Introduction (mentioning)
confidence: 99%