Antimicrobial peptides (AMPs) have attracted considerable attention because of their therapeutic importance, and computational tools that identify effective antibiotics from the primary structure are under active development. Here, we used the 13C NMR spectra of amino acids to cluster them into groups. These clusters were used to build feature vectors for the AMP sequences based on the composition, transition, and distribution of cluster members. These features, together with the physicochemical properties of AMPs, were used to train computational models that predict active AMPs solely from their sequences. Naïve Bayes (NB), k-nearest neighbors (KNN), support vector machine (SVM), random forest (RF), and eXtreme Gradient Boosting (XGBoost) were employed to build the classification system using AMP datasets collected from the CAMP, LAMP, ADAM, and AntiBP databases. Our results were validated and compared with the CAMP and ADAM prediction systems, and indicated that combining the 13C NMR features with the physicochemical descriptors enables the proposed ensemble mechanism to improve the prediction of active AMP sequences. Our web-based AMP prediction platform, IAMPE, is available at .
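The composition, transition, and distribution (CTD) encoding mentioned above can be sketched as follows. This is a minimal illustration, not the study's implementation: the three-group partition in CLUSTERS is an assumption made up for the example (the abstract does not list the 13C NMR-derived clusters), and the function and feature names are hypothetical.

```python
from itertools import combinations

# ASSUMPTION: an illustrative 3-group partition of the 20 standard amino
# acids. These are NOT the 13C NMR-derived clusters from the study.
CLUSTERS = {
    "1": "AGVLIPM",
    "2": "FWYSTCNQ",
    "3": "DEKRH",
}
RESIDUE_TO_CLUSTER = {
    aa: label for label, members in CLUSTERS.items() for aa in members
}
QUANTILES = (0, 25, 50, 75, 100)


def encode(sequence):
    """Replace each residue by its cluster label; unknown residues are skipped."""
    return [RESIDUE_TO_CLUSTER[aa] for aa in sequence.upper()
            if aa in RESIDUE_TO_CLUSTER]


def ctd_features(sequence):
    """Return a dict of composition (C), transition (T), distribution (D) features."""
    labels = encode(sequence)
    n = len(labels)
    if n == 0:
        raise ValueError("sequence contains no recognised residues")
    groups = sorted(CLUSTERS)
    features = {}

    # Composition: fraction of residues that fall in each cluster.
    for g in groups:
        features[f"C_{g}"] = labels.count(g) / n

    # Transition: fraction of adjacent pairs that switch between two clusters,
    # counted in either direction.
    pairs = list(zip(labels, labels[1:]))
    for a, b in combinations(groups, 2):
        switches = sum(1 for x, y in pairs if {x, y} == {a, b})
        features[f"T_{a}{b}"] = switches / len(pairs) if pairs else 0.0

    # Distribution: relative position of the first, 25%, 50%, 75%, and last
    # occurrence of each cluster along the sequence (0.0 if the cluster is absent).
    for g in groups:
        positions = [i + 1 for i, lab in enumerate(labels) if lab == g]
        for q in QUANTILES:
            if positions:
                # Ceiling of len(positions) * q / 100, at least the 1st occurrence.
                k = max(1, -(-len(positions) * q // 100))
                value = positions[k - 1] / n
            else:
                value = 0.0
            features[f"D_{g}_{q}"] = value
    return features
```

With three clusters this yields 3 composition, 3 transition, and 15 distribution values per peptide, which could then be concatenated with physicochemical descriptors before training a classifier.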
Biofilms are biological systems formed by a community of microorganisms in which microbial cells are attached to a surface within a self-produced matrix of extracellular polymeric substances. In some cases, microorganisms use biofilms to protect themselves against the host immune system and the surrounding environment, thereby increasing their chances of survival against various antimicrobial agents. Biofilms are a major concern in medicine and industry because of the problems they cause. Discovering and validating agents that inhibit bacterial biofilm formation in the laboratory is costly and time-consuming. Therefore, computational tools for the prediction of biofilm inhibitory peptides are needed. Here, we present a computational prediction tool that screens large numbers of peptide sequences and selects candidate peptides for further laboratory experiments and validation. In this learning model, different feature vectors extracted from the peptide primary structure are used to learn patterns from the sequences of biofilm inhibitory peptides. Various classification algorithms, including SVM, random forest, and k-nearest neighbor, were examined to evaluate their performance. Overall, our approach showed better prediction performance than other prediction methods. In this study, for the first time, we applied features extracted from the NMR spectra of amino acids along with physicochemical features. Although each group of features showed good discrimination potential on its own, we used a combination of features to enhance the performance of our method. Our prediction tool is freely available.
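As a rough illustration of sequence-based screening with one of the classifier families named above, the sketch below runs a k-nearest-neighbor vote over amino acid composition vectors. It is a sketch under stated assumptions: amino acid composition stands in for the study's NMR-derived and physicochemical features, the function names are hypothetical, and the sequences in TOY_TRAIN are made-up placeholders, not experimentally validated biofilm inhibitory peptides.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition(sequence):
    """20-dimensional amino acid composition of a peptide sequence."""
    residues = [aa for aa in sequence.upper() if aa in AMINO_ACIDS]
    if not residues:
        raise ValueError("sequence contains no standard residues")
    counts = Counter(residues)
    return [counts[aa] / len(residues) for aa in AMINO_ACIDS]


def knn_predict(train, query, k=3):
    """Majority vote among the k training peptides closest to the query.

    `train` is a list of (sequence, label) pairs; distance is Euclidean
    distance between composition vectors.
    """
    q = composition(query)
    ranked = sorted(train, key=lambda item: math.dist(composition(item[0]), q))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# ASSUMPTION: placeholder sequences invented for this example only.
# Label 1 = "inhibitor-like", label 0 = "non-inhibitor-like".
TOY_TRAIN = [
    ("KKLLKKLLKK", 1),
    ("KLKLKLKLKL", 1),
    ("RRLLRRLLRR", 1),
    ("DEDEDEGGSS", 0),
    ("GSGSGSDDEE", 0),
    ("SSGGEEDDSS", 0),
]
```

Calling knn_predict(TOY_TRAIN, "KKKLLLKKLL") returns the label shared by the query's three nearest neighbors. In practice the composition vector would be replaced or extended by the combined feature groups, and the choice of k and classifier would be tuned on held-out data.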
Cell-penetrating anticancer peptides (Cp-ACPs) are considered promising candidates in solid tumor and hematologic cancer therapies. Current approaches for the design and discovery of Cp-ACPs rely on expensive high-throughput screening, which often raises multiple obstacles, including instrumentation adaptation and experimental handling. The application of machine learning (ML) tools developed for peptide activity prediction is therefore of growing interest. In this study, we applied random forest (RF)-, support vector machine (SVM)-, and eXtreme gradient boosting (XGBoost)-based algorithms to predict active Cp-ACPs using an experimentally validated data set. The model, CpACpP, was developed on the basis of two independent subpredictors, one for cell-penetrating peptides (CPPs) and one for anticancer peptides (ACPs). Various compositional and physicochemical features were combined or selected using the multilayered recursive feature elimination (RFE) method for both data sets. Our results showed that the ACP subclassifiers achieved a mean accuracy (ACC) of 0.98 with an area under the curve (AUC) of ≈0.98, whereas the CPP predictors gave corresponding values of ∼0.94 and ∼0.95 with the hybrid-based features and independent data sets, respectively. In addition, evaluation on a series of independent Cp-ACP sequences gave accuracies of ∼0.79 and 0.89 with our CPP and ACP classifiers, respectively, outperforming the previously reported ACPred, mACPpred, MLCPP, and CPPred-RF. The described consensus-based fusion method additionally reached an AUC of 0.94 for the prediction of Cp-ACP ().
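One way to picture a consensus-based fusion of the two subpredictors is sketched below. The abstract does not specify the fusion rule, so everything here is an assumption: each subpredictor is taken to expose one probability per model (for example RF, SVM, and XGBoost), the consensus is their mean, and a peptide is called a Cp-ACP only when both consensus scores pass a threshold. The function names and the 0.5 threshold are hypothetical.

```python
from statistics import fmean


def consensus(probabilities):
    """Mean of per-model probabilities for one subpredictor (assumed rule)."""
    if not probabilities:
        raise ValueError("at least one model probability is required")
    if any(not 0.0 <= p <= 1.0 for p in probabilities):
        raise ValueError("probabilities must lie in [0, 1]")
    return fmean(probabilities)


def fuse_cp_acp(cpp_probs, acp_probs, threshold=0.5):
    """Fuse CPP and ACP consensus scores into a single Cp-ACP call.

    ASSUMPTION: the fused score is the minimum of the two consensus scores,
    so a peptide must look both cell-penetrating and anticancer to pass.
    """
    cpp = consensus(cpp_probs)
    acp = consensus(acp_probs)
    fused = min(cpp, acp)
    return {
        "cpp": cpp,
        "acp": acp,
        "fused": fused,
        "is_cp_acp": fused >= threshold,
    }
```

Taking the minimum makes the weaker of the two properties decide the outcome; other rules, such as a product or a weighted mean of the two scores, would trade sensitivity against specificity differently.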