Chemical coordination of gene expression among bacteria as a function of population density is regulated by a mechanism known as 'quorum sensing' (QS). QS in Pseudomonas aeruginosa, an opportunistic pathogen that causes disease in immunocompromised patients, is mediated by binding of the transcriptional activator, LasR, to its ligand, 3-oxo-C(12)-HSL, leading to population-wide secretion of virulence factors and biofilm formation. We have targeted QS in P. aeruginosa with a set of electrophilic probes designed to covalently bind Cys79 in the LasR binding pocket, leading to specific inhibition of QS-regulated gene expression and concomitant reduction of virulence factor secretion and biofilm formation. This first example of covalent modification of a QS receptor provides a new tool to study molecular mechanisms of bacterial group behavior and could lead to new strategies for targeting bacterial virulence.
How well do different classification methods perform in selecting the ligands of a protein target out of large compound collections not used to train the model? Support vector machines, random forest, artificial neural networks, k-nearest-neighbor classification with genetic-algorithm-optimized feature selection, trend vectors, naïve Bayesian classification, and decision tree were used to divide databases into molecules predicted to be active and those predicted to be inactive. Training and predicted activities were treated as binary. The database was generated for the ligands of five different biological targets which have been the object of intense drug discovery efforts: HIV-reverse transcriptase, COX2, dihydrofolate reductase, estrogen receptor, and thrombin. We report significant differences in the performance of the methods independent of the biological target and compound class. Different methods can have different applications; some provide particularly high enrichment, others are strong in retrieving the maximum number of actives. We also show that these methods do surprisingly well in predicting recently published ligands of a target on the basis of initial leads and that a combination of the results of different methods in certain cases can improve results compared to the most consistent method.
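The abstract above notes that combining the results of different methods can, in certain cases, improve on the most consistent single method. It does not say how the results were combined, so the sketch below uses a simple majority vote over binary active/inactive predictions purely as an illustration. The function name `majority_vote` and the example labels are hypothetical.

```python
def majority_vote(predictions):
    """Combine binary predictions from several classifiers by majority vote.

    `predictions` is a list with one entry per method; each entry is a list
    of 0/1 labels (1 = predicted active), one per compound.
    NOTE: majority voting is an assumed combination rule for illustration,
    not the scheme used in the study.
    """
    if not predictions:
        raise ValueError("need at least one set of predictions")
    n_compounds = len(predictions[0])
    if any(len(p) != n_compounds for p in predictions):
        raise ValueError("all methods must label the same compounds")
    threshold = len(predictions) / 2
    # A compound is called active when more than half of the methods agree.
    return [1 if sum(votes) > threshold else 0 for votes in zip(*predictions)]


# Hypothetical labels from three methods for three compounds.
consensus = majority_vote([[1, 0, 1], [1, 1, 0], [0, 0, 1]])
```

With an even number of methods, a tie counts as inactive under this rule; a stricter or looser threshold trades enrichment against the number of actives retrieved.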
In many cases, some information about active molecules is already available at the beginning of an HTS campaign. Known active compounds (such as substrate analogues, natural products, inhibitors of a related protein, or ligands published by a pharmaceutical company) are often identified in low-throughput validation studies of the biochemical target. In this study we evaluate the effectiveness of a support vector machine trained on those compounds and used to classify a collection of compounds with unknown activity. This approach was aimed at reducing the number of compounds to be tested against the given target. Our method predicts the biological activity of chemical compounds based only on atom pair (AP) two-dimensional topological descriptors. The supervised support vector machine (SVM) method is trained on compounds from the MDL Drug Data Report (MDDR) known to be active against a specific protein target. For detailed analysis, five different biological targets were selected: cyclooxygenase-2, dihydrofolate reductase, thrombin, HIV-reverse transcriptase, and the estrogen receptor (antagonists). The accuracy of compound identification was estimated using the recall and precision values. The sensitivities for all protein targets exceeded 80% and the classification performance reached 100% for selected targets. In another application of the method, we addressed the absence of an initial set of active compounds for a selected protein target at the beginning of an HTS campaign. In such a case, virtual high-throughput screening (vHTS) is usually applied by using a flexible docking procedure. However, a vHTS experiment typically yields a large percentage of false positives that must be verified by costly and time-consuming experimental follow-up assays. The subsequent use of our machine learning method was found to improve both the speed (since the docking procedure was not required for all compounds in the database) and the accuracy of the HTS hit lists (the enrichment factor).
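The recall, precision, and enrichment factor mentioned in this abstract can be illustrated with a short calculation. The sketch below assumes the common definition of the enrichment factor as the fraction of actives in the selected set divided by the fraction of actives in the whole database; the abstract itself does not give a formula. The function name `screening_metrics` and the example labels are hypothetical.

```python
def screening_metrics(predicted, actual):
    """Return (recall, precision, enrichment_factor) for binary labels.

    `predicted` and `actual` are equal-length lists of 0/1 values,
    where 1 means active.
    NOTE: the enrichment factor definition used here is an assumption.
    """
    if len(predicted) != len(actual) or not actual:
        raise ValueError("label lists must be non-empty and equal in length")
    true_pos = sum(1 for p, a in zip(predicted, actual) if p and a)
    n_selected = sum(1 for p in predicted if p)
    n_actives = sum(1 for a in actual if a)
    # Recall (sensitivity): share of true actives that were retrieved.
    recall = true_pos / n_actives if n_actives else 0.0
    # Precision: share of selected compounds that are truly active.
    precision = true_pos / n_selected if n_selected else 0.0
    # Enrichment: hit rate in the selection relative to the database hit rate.
    base_rate = n_actives / len(actual)
    enrichment = precision / base_rate if base_rate else 0.0
    return recall, precision, enrichment


# Hypothetical screen: 10 compounds, 2 actives, 4 selected (both actives found).
actual = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
recall, precision, enrichment = screening_metrics(predicted, actual)
```

In this toy case every active is recovered, half of the selected compounds are active, and the selection is 2.5 times richer in actives than the full collection.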
Computational screening of compound databases has become increasingly popular in pharmaceutical research. This review focuses on the evaluation of ligand-based virtual screening using active compounds as templates in the context of drug discovery. Ligand-based screening techniques are based on comparative molecular similarity analysis of compounds with known and unknown activity. We provide an overview of publications that have evaluated different machine learning methods, including support vector machines, decision trees, ensemble methods (boosting, bagging, and random forests), clustering methods, neural networks, naïve Bayesian classifiers, and data fusion methods.