2020
DOI: 10.1016/j.bj.2020.08.003

Machine learning techniques for sequence-based prediction of viral–host interactions between SARS-CoV-2 and human proteins

Abstract: Background: COVID-19 (Coronavirus Disease-19), a disease caused by the SARS-CoV-2 virus, has been declared as a pandemic by the World Health Organization on March 11, 2020. Over 15 million people have already been affected worldwide by COVID-19, resulting in more than 0.6 million deaths. Protein–protein interactions (PPIs) play a key role in the cellular process of SARS-CoV-2 virus infection in the human body. Recently a study has reported some SARS-CoV-2 proteins that interact with several human p…

citations
Cited by 89 publications
(64 citation statements)
references
References 43 publications
“…Pairs of the human and virus proteins that do not appear in the positive PPI dataset are randomly sampled as negative data. However, the random sampling method may incorrectly assign many positive samples to negative ones [5, 20]. To address this problem, the dissimilarity negative sampling method was developed [5], which used a sequence similarity-based method to explore the protein pairs that are unlikely to interact.…”
Section: Methods (mentioning)
confidence: 99%
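
The random negative sampling described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the function name `sample_negative_pairs`, its parameters, and the toy protein identifiers are all hypothetical. It covers only the random baseline; the dissimilarity-based method mentioned in the statement is not implemented here.

```python
import random


def sample_negative_pairs(human_proteins, virus_proteins, positive_pairs,
                          n_samples, seed=0):
    """Randomly draw (human, virus) pairs absent from the positive PPI set.

    Hypothetical helper for illustration only. Pairs returned here are
    merely *unobserved* interactions, so some may be true positives that
    have not been recorded, which is the weakness the statement points out.
    """
    positives = set(positive_pairs)
    # Every human-virus combination that is not a known interaction.
    candidates = [
        (h, v)
        for h in human_proteins
        for v in virus_proteins
        if (h, v) not in positives
    ]
    if n_samples > len(candidates):
        raise ValueError("not enough non-positive pairs to sample from")
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return rng.sample(candidates, n_samples)  # sampling without replacement


# Toy identifiers, assumed purely for demonstration.
human = ["H1", "H2", "H3"]
virus = ["V1", "V2"]
positives = [("H1", "V1"), ("H2", "V2")]

negatives = sample_negative_pairs(human, virus, positives, n_samples=3)
print(negatives)
```

Because the candidate pool is simply "everything not yet observed", the quality of the negatives depends entirely on how complete the positive set is.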
“…DeepViral also encoded host phenotype associations from PathoPhenoDB [18] and protein functions from the Gene Ontology (GO) database (The Gene Ontology Consortium, 2017) to predict PPIs. In addition, several constructed ML models were designed for certain individual virus species, limiting their generalizability to other human host-virus systems [19][20][21].…”
Section: Introduction (mentioning)
confidence: 99%
“…In this approach, the intra-race training of [25] using a combination of human PPI and Bacillus Anthracis data of different species was done for inter-race forecasting, resulting in a binary classifier that predicted traces of Bacillus Anthracis in human with moderate accuracy of 89.0%. Recently, Dey et al [97] employed various ML models, including SVM, to predict interactions between SARS-CoV2 and human protein pairs, wherein the SVM method performed adequately for RBF and polynomial kernels, with the accuracy of 69.67% and 68.03%, respectively. However, [97] proposed an ensemble technique that outperformed the other models with an accuracy of 72.33%.…”
Section: A Research Direction (mentioning)
confidence: 99%
“…Recently, Dey et al [97] employed various ML models, including SVM, to predict interactions between SARS-CoV2 and human protein pairs, wherein the SVM method performed adequately for RBF and polynomial kernels, with the accuracy of 69.67% and 68.03%, respectively. However, [97] proposed an ensemble technique that outperformed the other models with an accuracy of 72.33%. Lastly, we present a summary of the contributions and limitations of the publications reviewed, listed in Table 17.…”
Section: A Research Direction (mentioning)
confidence: 99%
“…Molecular-dynamics simulations have massively extended the utility of experimental structural data, providing novel insights via the rapid analysis of in silico mutagenesis for many of the key viral proteins (Rynkiewicz et al, 2021; Sheik Amamuddy et al, 2020) and the ability to model glycosylation of the spike protein (Woo et al, 2020). Computational approaches have also helped to probe potential host-pathogen protein interactions (HPIs), contributing network-based and machine-learning-based assessments of putative interactions leveraging multiple HPI databases (Dey et al, 2020; Messina et al, 2020). One ambitious project, Folding@Home, has even employed crowdsourcing to access exascale computing resources for multiple SARS-CoV-2 simulation projects (Achdout et al, 2020; Zimmerman & Bowman, 2021).…”
Section: Computational Approaches for SARS-CoV-2 Proteins (mentioning)
confidence: 99%