Machine Learning Advances in Predicting Peptide/Protein‐Protein Interactions Based on Sequence Information for Lead Peptides Discovery

Ye, Jiahao; An, Li; Zheng, Hao; Yang, Banghua; Lu, Yiming

doi:10.1002/adbi.202200232

Cited by 10 publications

(2 citation statements)

References 138 publications


“…For this reason, researchers have increasingly turned to machine learning approaches to predict interactions, with the goal of accelerating the screening process [13][14][15][16][17]. In a recent review by Ye et al. [18], these methods have been categorized into five main groups: Linear-based, including linear regression and logistic regression; Tree-based, including decision tree, random forest, and gradient boosting machines; Kernel-based, including radial basis function, linear discriminant analysis, and support vector machine; Neural-network-based, including convolutional neural networks, recurrent neural networks, and generative adversarial networks; and Attention-mechanism-based, which includes transformers and BERT.…”

Section: Introduction (mentioning)

confidence: 99%

Accelerating the Screening of Small Peptide Ligands by Combining Peptide-Protein Docking and Machine Learning

Codina, Mascini, Dikici et al. 2023, IJMS

This research introduces a novel pipeline that couples machine learning (ML) and molecular docking to accelerate small peptide ligand screening by predicting peptide-protein docking performance. Eight ML algorithms were analyzed for their potential. Notably, Light Gradient Boosting Machine (LightGBM), despite having an F1-score and accuracy comparable to its counterparts, showed superior computational efficiency. LightGBM was used to classify the peptide-protein docking performance of the entire tetrapeptide library of 160,000 peptide ligands against four viral envelope proteins. The library was classified into two groups, ‘better performers’ and ‘worse performers’. By training the LightGBM algorithm on just 1% of the tetrapeptide library, we successfully classified the remaining 99% with an accuracy of 0.81–0.85 and an F1-score of 0.58–0.67. Three different molecular docking programs were used to show that the process is not software dependent. With an adjustable probability threshold (from 0.5 to 0.95), the process could be accelerated at least 10-fold while still achieving 90–95% concurrence with the method without ML. This study validates the efficiency of machine learning coupled to molecular docking in rapidly identifying top peptides without relying on high-performance computing power, making it an effective tool for screening potential bioactive compounds.
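The library size and the 1% training split quoted in this abstract can be checked with a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: it assumes the library is every length-4 sequence over the 20 standard amino acids (which gives the stated 160,000), and the helper names `tetrapeptide_library`, `split_train_rest`, and `shortlist` are hypothetical. The classifier itself is left out; `shortlist` only shows how a probability threshold such as 0.5 to 0.95 might narrow the set of peptides passed on to docking.

```python
import itertools
import random

# Assumption: the 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def tetrapeptide_library():
    """Enumerate every tetrapeptide: 20**4 = 160,000 sequences."""
    return ["".join(p) for p in itertools.product(AMINO_ACIDS, repeat=4)]


def split_train_rest(library, train_fraction=0.01, seed=0):
    """Randomly pick a training subset; return (train, rest)."""
    rng = random.Random(seed)
    n_train = round(len(library) * train_fraction)
    train = set(rng.sample(library, n_train))
    rest = [pep for pep in library if pep not in train]
    return sorted(train), rest


def shortlist(probabilities, threshold=0.5):
    """Keep peptides whose predicted 'better performer' probability
    is at least `threshold` (hypothetical classifier output)."""
    return [pep for pep, prob in probabilities.items() if prob >= threshold]


library = tetrapeptide_library()
train, rest = split_train_rest(library)
print(len(library), len(train), len(rest))
```

Raising the threshold shrinks the shortlist, which is presumably where the reported speed-up comes from: fewer peptides are sent to the comparatively expensive docking step.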


“…As a cheminformatics model, ML combines chemistry, computer science, and information technology to aid in drug discovery through tasks like virtual screening, library design, and high-throughput screening analysis [10][11][12]. Machine learning algorithms leverage large chemical datasets for predictive modeling and pattern recognition, including the prediction of the properties and activities of peptides based on their sidechains [13][14][15][16]. This integration has accelerated the discovery and design of novel peptides with desired biological activities, opening new avenues for peptide-based drug development.…”

Section: Introduction (mentioning)

confidence: 99%

Prospection of Peptide Inhibitors of Thrombin from Diverse Origins Using a Machine Learning Pipeline

Balakrishnan, Katkar, Pham et al. 2023, Bioengineering

Thrombin is a key enzyme involved in the development and progression of many cardiovascular diseases. Direct thrombin inhibitors (DTIs), with their minimal off-target effects and immediacy of action, have greatly improved the treatment of these diseases. However, the risk of bleeding, pharmacokinetic issues, and thrombotic complications remain major concerns. In an effort to increase the effectiveness of the DTI discovery pipeline, we developed a two-stage machine learning pipeline to identify and rank peptide sequences based on their effective thrombin inhibitory potential. The positive dataset for our model consisted of thrombin inhibitor peptides and their binding affinities (KI) curated from published literature, and the negative dataset consisted of peptides with no known thrombin inhibitory or related activity. The first stage of the model identified thrombin inhibitory sequences with a Matthews correlation coefficient (MCC) of 83.6%. The second stage of the model, which covers a range of eight orders of magnitude in KI values, predicted the binding affinity of new sequences with a log root mean square error (RMSE) of 1.114. These models also revealed physicochemical and structural characteristics that are hidden but unique to thrombin inhibitor peptides. Using the model, we classified more than 10 million peptides from diverse sources and identified unique short peptide sequences (<15 aa) of interest, based on their predicted KI. Based on the binding energies of the interaction of the peptide with thrombin, we identified a promising set of putative DTI candidates. The prediction pipeline is available on a web server.
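For readers unfamiliar with the two metrics quoted in this abstract, the following sketch shows how they are usually computed. It is not taken from the paper: the function names are hypothetical, and the use of base-10 logarithms for the "log RMSE" is an assumption, since the abstract does not state the base.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Ranges from -1 (total disagreement) through 0 (chance level)
    to +1 (perfect prediction).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denom


def log_rmse(true_ki, pred_ki):
    """RMSE between log10-transformed KI values (log base assumed)."""
    squared = [
        (math.log10(t) - math.log10(p)) ** 2
        for t, p in zip(true_ki, pred_ki)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Made-up values: each prediction is off by exactly one order of magnitude.
print(log_rmse([1e-9, 1e-6], [1e-8, 1e-7]))
```

Under the base-10 assumption, a log RMSE near 1 would mean that predictions are off by roughly one order of magnitude on average, which gives some context for the reported 1.114 against an eight-order range of KI values.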


AI/ML Approaches in Drug Design

Kırboğa 2024, Computational Methods for Rational Drug Design
