2018
DOI: 10.1186/s12911-018-0628-4

Automatic extraction of protein-protein interactions using grammatical relationship graph

Abstract: Background: Relationships between bio-entities (genes, proteins, diseases, etc.) constitute a significant part of our knowledge. Most of this information is documented as unstructured text in different forms, such as books, articles and on-line pages. Automatic extraction of such information and storing it in structured form could help researchers more easily access such information and also make it possible to incorporate it in advanced integrative analysis. In this study, we developed a novel approach to extra…

Cited by 27 publications (17 citation statements)
References 77 publications
“…00043 useful information from unstructured narrative text. 4,5 NLP algorithms have been widely applied to medical research in recent years for various tasks, including extracting protein-protein interactions, 6 predicting gene-disease associations from biomedical literature databases, 7 improving the sensitivity of screening for suicide behaviors among pregnant women from electronic health record systems, 8 and correlating mammographic imaging features with pathologic findings. 9 Our group previously used an NLP algorithm to automatically parse breast pathologic reports in both English and Chinese.…”
Section: Introduction
confidence: 99%
“…However, we are mainly interested in the precision metric, motivated by a real-world application of the system. We compared our method to existing methods in literature, including rule-based approaches [19], [20], feature- and kernel-based approaches [21]-[26], and neural network approaches [27]-[31]. Additionally, we compared our system to recent transformer-based methods pre-trained on biomedical texts, namely BioBERT [34] and SciBERT [35].…”
Section: Results
confidence: 99%
“…Fundel et al [19] showed how a small number of carefully designed rules based on the shortest dependency path (SDP) between two examined entities produces fairly good results. Yu et al [20] exploited dependency parse trees and a flexible pattern matching scheme, enriching the system with a decision tree classifier. Diverse syntactic and orthography features have been extensively used in feature- and kernel-based methods.…”
Section: Related Work
confidence: 99%
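
For readers unfamiliar with the shortest dependency path (SDP) mentioned in the statement above, the idea can be illustrated with a minimal sketch. It assumes the dependency parse is already available as a list of (head, dependent) pairs; the function name `shortest_dependency_path`, the example sentence and its hand-written parse are all hypothetical and are not taken from any of the cited systems. The sketch only finds the path with a breadth-first search over the undirected parse graph; the rules that the cited work applies to that path are not reproduced here.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the tokens on the shortest path between two tokens, or None.

    `edges` is a list of (head, dependent) pairs from a dependency parse.
    The parse is treated as an undirected graph, which is the usual
    convention when computing an SDP between two entities.
    """
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, set()).add(dependent)
        graph.setdefault(dependent, set()).add(head)
    if source not in graph or target not in graph:
        return None

    # Breadth-first search, remembering how each token was reached.
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None


# Hypothetical hand-written parse of "PROT1 strongly interacts with PROT2".
edges = [
    ("interacts", "PROT1"),
    ("interacts", "strongly"),
    ("interacts", "with"),
    ("with", "PROT2"),
]
path = shortest_dependency_path(edges, "PROT1", "PROT2")
print(path)
```

In this toy parse the modifier "strongly" falls off the path, which is the usual motivation for working on the SDP rather than on the whole sentence.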
“…These sentences were then dependency parsed. A protein-protein extraction system previously developed was then applied based on features extracted during the previous steps [28, 54, 55]. Finally, features were extracted from preceding steps in the pipeline and were used as input for training an XGBoost classifier [56].…”
Section: Methods
confidence: 99%
“…It was constructed based on the personal knowledge of the authors and manual review of sentences in the literature known to contain protein-protein interactions [21]. This dictionary has been successfully employed in similar applications [7, 28, 61–64].…”
Section: Methods
confidence: 99%