Inspired by natural language processing techniques, we introduce Mol2vec, an unsupervised machine learning approach to learn vector representations of molecular substructures. As in Word2vec models, where vectors of closely related words lie in close proximity in the vector space, Mol2vec learns vector representations of molecular substructures that point in similar directions for chemically related substructures. Compounds can then be encoded as vectors by summing the vectors of the individual substructures and, for instance, be fed into supervised machine learning approaches to predict compound properties. The underlying substructure vector embeddings are obtained by training an unsupervised machine learning approach on a so-called corpus of compounds that consists of all available chemical matter. The resulting Mol2vec model is pretrained once, yields dense vector representations, and overcomes drawbacks of common compound feature representations such as sparseness and bit collisions. The prediction capabilities are demonstrated on several compound property and bioactivity data sets and compared with results obtained for Morgan fingerprints as a reference compound representation. Mol2vec can be easily combined with ProtVec, which employs the same Word2vec concept on protein sequences, resulting in a proteochemometric approach that is alignment-independent and thus can also be easily used for proteins with low sequence similarities.
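The encoding step described above (summing substructure vectors into one compound vector) can be sketched in a few lines of Python. This is a minimal illustration, not the Mol2vec implementation: the substructure identifiers, the three-dimensional vectors, and the function names are all hypothetical, and skipping unknown substructures is an assumption made only for this sketch.

```python
from math import sqrt

# Hypothetical toy embeddings; identifiers and values are invented for illustration.
SUBSTRUCTURE_VECTORS = {
    "sub_a": [0.5, 0.1, -0.2],
    "sub_b": [0.4, 0.2, -0.1],
    "sub_c": [-0.3, 0.8, 0.6],
}


def compound_vector(substructures, vectors, dim=3):
    """Encode a compound as the sum of its substructure vectors."""
    total = [0.0] * dim
    for name in substructures:
        vec = vectors.get(name)
        if vec is None:
            # Assumption of this sketch: substructures without an embedding are skipped.
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    compound = compound_vector(["sub_a", "sub_b"], SUBSTRUCTURE_VECTORS)
    print(compound)
    print(cosine_similarity(SUBSTRUCTURE_VECTORS["sub_a"], SUBSTRUCTURE_VECTORS["sub_b"]))
```

In this toy data, "sub_a" and "sub_b" point in similar directions and therefore have a higher cosine similarity than "sub_a" and "sub_c", which mirrors the idea that related substructures get similarly oriented vectors.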
Background: Annotation of the phylogenetic tree of the human kinome is an intuitive way to visualize compound profiling data, structural features of kinases, or functional relationships within this important class of proteins. The increasing volume and complexity of kinase-related data underlines the need for a tool that enables complex queries pertaining to kinase disease involvement and potential therapeutic uses of kinase inhibitors. Results: Here, we present KinMap, a user-friendly online tool that facilitates interactive navigation through kinase knowledge by linking biochemical, structural, and disease association data to the human kinome tree. To this end, preprocessed data from freely available sources, such as ChEMBL, the Protein Data Bank, and the Center for Therapeutic Target Validation platform, are integrated into KinMap and can easily be complemented by proprietary data. The value of KinMap is demonstrated by example for uncovering new therapeutic indications of known kinase inhibitors and for prioritizing kinases for drug development efforts. Conclusion: KinMap represents a new generation of kinome tree viewers that facilitates interactive exploration of the human kinome. KinMap enables generation of high-quality annotated images of the human kinome tree as well as exchange of kinome-related data in scientific communications. Furthermore, KinMap supports multiple input and output formats, recognizes alternative kinase names, and links them to a unified naming scheme, which makes it a useful tool across different disciplines and applications. A web service of KinMap is freely available at http://www.kinhub.org/kinmap/. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1433-7) contains supplementary material, which is available to authorized users.
Butyrylcholinesterase (BChE) is regarded as a promising drug target as its levels and activity significantly increase in the late stages of Alzheimer's disease. To discover novel BChE inhibitors, we used a hierarchical virtual screening protocol followed by biochemical evaluation of the 40 highest-scoring hit compounds. Three of the compounds identified showed significant inhibitory activities against BChE. The most potent, compound 1 (IC50 = 21.3 nM), was resynthesized and resolved into its pure enantiomers. A high degree of stereoselective activity was revealed, and a dissociation constant of 2.7 nM was determined for the most potent stereoisomer (+)-1. The crystal structure of human BChE in complex with compound (+)-1 was solved, revealing the binding mode and providing clues for potential optimization. Additionally, compound 1 inhibited amyloid β(1-42) peptide self-induced aggregation into fibrils (by 61.7% at 10 μM) and protected cultured SH-SY5Y cells against amyloid-β-induced toxicity. These data suggest that compound 1 represents a promising candidate for hit-to-lead follow-up in the drug-discovery process against Alzheimer's disease.
Predicting the endocrine disruption potential of compounds is a daunting but essential task. Here we report a new tool for this purpose that we have termed Endocrine Disruptome. It is a free and simple-to-use Web service that runs on an open source platform called Docking interface for Target Systems (DoTS). The molecular docking is handled via AutoDock Vina. Compounds are docked to 18 integrated and well-validated crystal structures of 14 different human nuclear receptors: androgen receptor; estrogen receptors α and β; glucocorticoid receptor; liver X receptors α and β; mineralocorticoid receptor; peroxisome proliferator activated receptors α, β/δ, and γ; progesterone receptor; retinoid X receptor α; and thyroid receptors α and β. Endocrine Disruptome is free of charge and available at http://endocrinedisruptome.ki.si.
Kinome-wide screening would have the advantage of providing structure-activity relationships against hundreds of targets simultaneously. Here, we report the generation of ligand-based activity prediction models for over 280 kinases by employing Machine Learning methods on an extensive data set of proprietary bioactivity data combined with open data. High quality (AUC > 0.7) was achieved for ∼200 kinases by (1) combining open with proprietary data, (2) choosing Random Forest over alternative tested Machine Learning methods, and (3) balancing the training data sets. Tests on left-out and external data indicate a high value for virtual screening projects. Importantly, the derived models are evenly distributed across the kinome tree, allowing reliable profiling prediction for all kinase branches. The prediction quality was further improved by employing experimental bioactivity fingerprints of a small kinase subset. Overall, the generated models can support various hit identification tasks, including virtual screening, compound repurposing, and the detection of potential off-targets.
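The quality criterion quoted above (AUC > 0.7) can be made concrete with a small, dependency-free sketch. It computes the ROC AUC as the fraction of active/inactive pairs that a model ranks correctly and then applies the 0.7 cut-off. The labels, scores, and function names are hypothetical illustrations, not data or code from the study.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random active outscores a random inactive.

    labels: iterable of 1 (active) / 0 (inactive); scores: predicted scores.
    Ties between an active and an inactive count as half a correct ranking.
    """
    actives = [s for label, s in zip(labels, scores) if label == 1]
    inactives = [s for label, s in zip(labels, scores) if label == 0]
    if not actives or not inactives:
        raise ValueError("AUC needs at least one active and one inactive compound")
    correct = 0.0
    for a in actives:
        for i in inactives:
            if a > i:
                correct += 1.0
            elif a == i:
                correct += 0.5
    return correct / (len(actives) * len(inactives))


def passes_quality_threshold(auc, threshold=0.7):
    """Apply the 'high quality' cut-off from the text (AUC > 0.7)."""
    return auc > threshold


if __name__ == "__main__":
    # Hypothetical predictions for six compounds against one kinase.
    labels = [1, 1, 0, 0, 1, 0]
    scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
    auc = roc_auc(labels, scores)
    print(auc, passes_quality_threshold(auc))
```

The pairwise form is quadratic in the number of compounds, which is fine for illustration; for large data sets a rank-based computation or an established library routine would be the more practical choice.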