2014
DOI: 10.1093/database/bau038

iSimp in BioC standard format: enhancing the interoperability of a sentence simplification system

Abstract: This article reports the use of the BioC standard format in our sentence simplification system, iSimp, and demonstrates its general utility. iSimp is designed to simplify complex sentences commonly found in the biomedical text, and has been shown to improve existing text mining applications that rely on the analysis of sentence structures. By adopting the BioC format, we aim to make iSimp readily interoperable with other applications in the biomedical domain. To examine the utility of iSimp in BioC, we impleme…

Cited by 9 publications (4 citation statements)
References 8 publications

“…The impact module was described in detail in a previous publication of the eFIP system (2). Here, we will briefly summarize with examples, and concentrate on the addition of iSimp (4, 5) for sentence simplification.…”
Section: Methods (mentioning)
confidence: 99%
“…Addressing the users’ feedback, the main contributions of this work are: (i) full-scale processing of full-length articles from the PubMed Central open-access (PMC OA) database; (ii) the enhancement of the PPI module to include additional words/phrases for PPIs; (iii) the inclusion in the pipeline of iSimp, a sentence simplifier (4, 5) to improve the recall when extracting phosphorylation–PPI relations; (iv) the incorporation of the latest version of the RLIMS-P system [RLIMS-P 2.0 (6, 7)] for phosphorylation event extraction; (v) the enhancement of the website, which allows a user to search for specific kinases, substrates, interacting proteins, keywords or lists of document IDs; (vi) the visualization of the network of interacting proteins via the Cytoscape package (8); (vii) the inclusion of gene normalization via the GenNorm system (9); (viii) an evaluation of the eFIP system on full-length articles and (ix) a corpus of annotated data from 100 randomly chosen sections from full-length documents, containing 272 unique annotations.…”
Section: Introduction (mentioning)
confidence: 99%
“…We first downloaded the baseline packages and then tracked the most recent updates of PubMed and PMC on a daily basis. All text in Extensible Markup Language format was transformed to BioC-JSON format, a community-driven biomedical text processing data format for improved interoperability, to simplify future data processing and exchange (36). Since sentences have a higher level of localization and information density than paragraphs, they are more likely to be relevant if they contain multiple bioentities (29).…”
Section: Text Mining (mentioning)
confidence: 99%
“…Those techniques have been effective for detecting protein–protein interactions from text, or drug resistance information. bioSimplify and iSimp represent two popular sentence simplification approaches applied to biomedical literature, while the Cafetiere Sentence Splitter has been tested on chemical abstracts.…”
Section: Chemical Information Retrieval (mentioning)
confidence: 99%