2018
DOI: 10.1142/s0219720018500142

Training host-pathogen protein–protein interaction predictors

Abstract: Detection of protein-protein interactions (PPIs) plays a vital role in molecular biology. Particularly, pathogenic infections are caused by interactions of host and pathogen proteins. It is important to identify host-pathogen interactions (HPIs) to discover new drugs to counter infectious diseases. Conventional wet lab PPI detection techniques have limitations in terms of cost and large-scale application. Hence, computational approaches are developed to predict PPIs. This study aims to develop machine learning…

Cited by 25 publications (20 citation statements)
References 36 publications
“…XGBoost was successfully applied in hundreds of recent studies to predict, e.g. host-pathogen protein–protein interactions (16), microRNA disease association (17) and DNA methylation (18). Several studies including our own previous paper showed that XGBoost gives the best performance if compared with a number of known machine learning methods (see e.g.…”
Section: Description Of the Database; type: mentioning
confidence: 99%
“…Following the methodology of several XGBoost studies (11,16–18) including our previously published work (12) we evaluated the XGBoost-selected feature sets by 5-fold cross-validation, and we evaluated their predictive power by the area under the curve of the receiver operating characteristic curve (ROC AUC or shortly AUC, 21). 5-fold cross-validation is a widely used method where the training data is split into five random parts and four parts are used to train the XGBoost machine learning tool and the prediction of the fifth part is evaluated.…”
Section: Description Of the Database; type: mentioning
confidence: 99%
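
The evaluation scheme described in the statement above (split the training data into five random parts, train on four, score the fifth by ROC AUC) can be sketched roughly as follows. This is a minimal standard-library illustration, not the citing authors' code: the helper names (`five_fold_indices`, `roc_auc`, `cross_validate`) are hypothetical, and the tiny threshold model is only a stand-in for XGBoost so that the example runs without extra dependencies.

```python
import random


def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle the sample indices and deal them into n_folds disjoint parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def roc_auc(labels, scores):
    """ROC AUC computed as the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def cross_validate(features, labels, fit, predict, n_folds=5, seed=0):
    """Train on n_folds - 1 parts, score the held-out part, repeat per fold."""
    folds = five_fold_indices(len(labels), n_folds, seed)
    aucs = []
    for k, test_idx in enumerate(folds):
        train_idx = [i for j, fold in enumerate(folds) if j != k for i in fold]
        model = fit([features[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        scores = [predict(model, features[i]) for i in test_idx]
        aucs.append(roc_auc([labels[i] for i in test_idx], scores))
    return aucs


# Stand-in model (an assumption for illustration, NOT XGBoost): a threshold
# halfway between the two class means of a single numeric feature.
def fit_threshold(xs, ys):
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0


def predict_margin(threshold, x):
    return x - threshold


if __name__ == "__main__":
    xs = [float(i) for i in range(100)]           # toy feature values
    ys = [1 if i >= 50 else 0 for i in range(100)]  # toy labels
    print(cross_validate(xs, ys, fit_threshold, predict_margin))
```

In practice the `fit` and `predict` arguments would wrap a real classifier; the per-fold AUC values are usually averaged to give a single score.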
“…Virus protein sequences of different species share only little in common (Eid et al, 2016). Therefore, models trained for other human PPI (Li & Ilie, 2020; Sun et al, 2017; Li, 2020; Chen et al, 2019; Sarkar & Saha, 2019) or for other pathogen-human PPI (Sudhakar et al, 2020; Mei & Zhang, 2020; Dick et al, 2020; Li et al, 2014; Guven-Maiorov et al, 2019; Basit et al, 2018) (for which more data might be available) cannot be directly used for predictions for novel viral-human protein interactions.…”
Section: Introduction; type: mentioning
confidence: 99%
“…Nourani used Naïve Bayes [23]. Decision tree related classifier was the approach taken by Basit [24]. Random forest was the approach followed by both Yang [25] and Barman [26].…”
Section: Introduction; type: mentioning
confidence: 99%