Proceedings of the International Conference on Bioinformatics, Computational Biology and Biomedical Informatics 2013
DOI: 10.1145/2506583.2506619

Text Mining of Protein Phosphorylation Information Using a Generalizable Rule-Based Approach

Abstract: Literature-based annotation of protein phosphorylation is the focus of many biological databases, as phosphorylation is a global regulator of cellular activity. To speed up manual curation of phosphorylation information, text mining technology has been utilized. In this paper, we report our ongoing effort to enhance RLIMS-P, a rule-based information extraction (IE) system to identify protein phosphorylation information in scientific literature. Despite the high accuracy attained by RLIMS-P, the use of elaborat…

Cited by 6 publications (14 citation statements)
References 37 publications
“…As seen in the study by Landeghem et al [27], the selection of the corpus could have a significant impact on the performance of biological event extraction systems, including performance on phosphorylation events. In our prior work [11], we evaluated RLIMS-P 2.0 on the 2011 GE corpus. We further examined the evaluation results and analyzed the patterns pertaining to phosphorylation events in the 2011 GE corpus as detailed in Section 4.2.4.…”
Section: Results (mentioning)
confidence: 99%
“…Apart from the low-level text processing implemented, we facilitate mining of phosphorylation information by treating each article section as a separate document, just as the MEDLINE abstract. Writing styles in the full-text articles, however, can be different from those in abstracts [43] and that might affect the performance of RLIMS-P. To evaluate the system in processing full-text articles, we prepared an annotated full-text corpus for our earlier work [11]. We report the development of this corpus, and include the RLIMS-P 2.0 evaluation results on this corpus (Table 2).…”
Section: Results (mentioning)
confidence: 99%
“…(12), and the systems that participated in the BioNLP 2011 Shared Task (13). We chose the RLIMS-P system (6, 7) because it has been evaluated with a corpus covering a wide variety of expressions describing phosphorylation events; it extracts information from multiple sentences; and it has recently been improved with new generalizable rules that boost its performance and allow for the possibility of extending to other post-translational modifications.…”
Section: Related Work (mentioning)
confidence: 99%
“…Addressing the users’ feedback, the main contributions of this work are: (i) full-scale processing of full-length articles from the PubMed Central open-access (PMC OA) database; (ii) the enhancement of the PPI module to include additional words/phrases for PPIs; (iii) the inclusion in the pipeline of iSimp, a sentence simplifier (4, 5) to improve the recall when extracting phosphorylation–PPI relations; (iv) the incorporation of the latest version of the RLIMS-P system [RLIMS-P 2.0 (6, 7)] for phosphorylation event extraction; (v) the enhancement of the website, which allows a user to search for specific kinases, substrates, interacting proteins, keywords or lists of document IDs; (vi) the visualization of the network of interacting proteins via the Cytoscape package (8); (vii) the inclusion of gene normalization via the GenNorm system (9); (viii) an evaluation of the eFIP system on full-length articles and (ix) a corpus of annotated data from 100 randomly chosen sections from full-length documents, containing 272 unique annotations.…”
Section: Introduction (mentioning)
confidence: 99%