The Sphere Exclusion algorithm is a well-known method for selecting diverse subsets from chemical-compound libraries or collections. It can be applied with any distance measure between two structures. It is popular because of its intuitive geometrical interpretation and its good performance on large data sets. This paper describes Directed Sphere Exclusion (DISE), a modification of the Sphere Exclusion algorithm that retains all of its positive properties but generates a more even distribution of the selected compounds in chemical space. In addition, the computational requirement is significantly reduced, so DISE can be applied to very large data sets.
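The geometrical idea behind plain Sphere Exclusion can be illustrated with a minimal sketch. This is the generic textbook form of the algorithm, not the DISE variant described in the paper, and the function name `sphere_exclusion`, the `radius` parameter, and the use of 2D points with Euclidean distance in place of chemical structures are all assumptions made for illustration.

```python
from math import dist
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def sphere_exclusion(
    items: Sequence[T],
    distance: Callable[[T, T], float],
    radius: float,
) -> List[int]:
    """Return indices of a diverse subset of `items`.

    An item is selected only if it lies outside the exclusion sphere
    (distance >= radius) of every item selected so far. Items are
    visited in input order, so the result depends on that order.
    """
    selected: List[int] = []
    for i, item in enumerate(items):
        if all(distance(item, items[j]) >= radius for j in selected):
            selected.append(i)
    return selected


# Hypothetical stand-in data: 2D points instead of compounds.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.05, 0.05), (2.0, 2.0)]
picked = sphere_exclusion(points, dist, radius=0.5)
print(picked)  # [0, 2, 4]
```

Any distance function with the same two-argument shape could be passed in, which mirrors the abstract's point that the method works with any distance measure between two structures.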
Structure- and property-based drug design is an integral part of modern drug discovery, enabling the design of compounds aimed at improving potency and selectivity. However, building molecules using desktop modeling tools can easily lead to poor designs that only appear to form many favorable interactions with the protein's active site. A proposed molecule may look good on screen and appear to fit into the protein site X-ray crystal structure or pharmacophore model, yet achieving that fit might require a high-energy small-molecule conformation, in which case the molecule would likely be inactive. To help scientists make better design decisions, we have built integrated, easy-to-use, interactive software tools to perform docking experiments, de novo design, shape- and pharmacophore-based database searches, small-molecule conformational analysis, and molecular property calculations. Using a combination of these tools helps scientists assess the likelihood that a designed molecule will be active and have desirable drug metabolism and pharmacokinetic properties. Small-molecule discovery success requires project teams to rapidly design and synthesize potent molecules with good ADME properties. Empowering scientists to evaluate ideas quickly and make better design decisions with easy-to-access and easy-to-understand software on their desktop is now a key part of our discovery process.
Using data from the in vitro liver microsome metabolic stability assay, we have developed QSAR models to predict in vitro human clearance. Models were trained using in-house high-throughput assay data reported as the predicted human hepatic clearance by liver microsomes, or pCLh. Machine learning regression methods were used to generate the models. Model output for a given molecule was reported as its probability of being metabolically stable, thus allowing for synthesis prioritization based on this prediction. Use of probability, instead of the regression value or categories, has been found to be an efficient way of both reporting and assessing predictions. Model performance was evaluated using prospective validation. These models have been integrated into a number of desktop tools and are routinely used to prioritize the synthesis of compounds. We discuss two therapeutic projects at Genentech that exemplify the benefits of a probabilistic approach in applying the models. A three-year retrospective analysis of measured liver microsome stability data on all registered compounds at Genentech reveals that the use of these models has resulted in an improved metabolic stability profile of synthesized compounds.
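The abstract does not say how a regression prediction is turned into a probability of stability. One common way to do this, shown here purely as an assumption, is to treat the prediction error as Gaussian and compute the chance that the true clearance falls below a stability cutoff. The function name `probability_stable`, the cutoff, and the error standard deviation `sigma` are hypothetical and are not taken from the paper.

```python
from math import erf, sqrt


def probability_stable(predicted: float, threshold: float, sigma: float) -> float:
    """Probability that the true clearance is below `threshold`.

    Assumes (hypothetically) that the true value is normally distributed
    around the regression prediction with standard deviation `sigma`.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (threshold - predicted) / sigma
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Hypothetical numbers, in whatever clearance units the assay reports.
for pred in (5.0, 10.0, 15.0):
    p = probability_stable(pred, threshold=10.0, sigma=3.0)
    print(f"predicted={pred:5.1f}  P(stable)={p:.2f}")
```

Under this assumption, a prediction exactly at the cutoff gives a probability of 0.5, and compounds can be ranked for synthesis by the returned value rather than by a hard stable/unstable label.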
Background: After performing a fragment-based screen, the resulting hits need to be prioritized for follow-up structure elucidation and chemistry. This paper describes a new similarity metric, Atom-Atom-Path (AAP) similarity, that is used in conjunction with the Directed Sphere Exclusion (DISE) clustering method to effectively organize and prioritize the fragment hits. The AAP similarity rewards common substructures and recognizes minimal structure differences. The DISE method is order-dependent and can be used to enrich fragments with properties of interest in the first clusters.

Results: The merit of the software is demonstrated by its application to the MAP4K4 fragment screening hits using ligand efficiency (LE) as the quality measure. The first clusters contain the hits with the highest LE. The clustering results can be easily visualized in an LE-over-clusters scatterplot with points colored by the members' similarity to the corresponding cluster seed. The scatterplot enables the extraction of preliminary SAR.

Conclusions: The detailed structure differentiation of the AAP similarity metric is ideal for fragment-sized molecules. The order-dependent nature of the DISE clustering method results in clusters ordered by a property of interest to the teams. The combination of both allows for efficient prioritization of fragment hits for follow-up.

Graphical abstract: AAP similarity computation and DISE clustering visualization.

Electronic supplementary material: The online version of this article (doi:10.1186/s13321-015-0056-8) contains supplementary material, which is available to authorized users.
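The effect of order dependence described above can be sketched as follows: if hits are sorted by a property such as LE before a greedy, radius-based clustering, the earliest clusters are seeded by the best-scoring hits. This is an illustrative simplification, not the published DISE implementation, and it does not implement AAP similarity. The function name `ordered_clusters`, the `radius` parameter, and the 2D points standing in for fragments are assumptions.

```python
from math import dist
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def ordered_clusters(
    hits: Sequence[T],
    scores: Sequence[float],
    distance: Callable[[T, T], float],
    radius: float,
) -> List[Tuple[int, List[int]]]:
    """Greedy clustering seeded in order of decreasing score.

    Returns (seed_index, member_indices) pairs. Each seed claims every
    still-unassigned hit closer than `radius`, including itself.
    """
    order = sorted(range(len(hits)), key=lambda i: scores[i], reverse=True)
    assigned = set()
    clusters: List[Tuple[int, List[int]]] = []
    for seed in order:
        if seed in assigned:
            continue
        members = [
            j for j in order
            if j not in assigned and distance(hits[seed], hits[j]) < radius
        ]
        assigned.update(members)
        clusters.append((seed, members))
    return clusters


# Hypothetical stand-in data: 2D points and made-up LE-like scores.
hits = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0)]
le = [0.30, 0.45, 0.50, 0.20]
print(ordered_clusters(hits, le, dist, radius=0.5))
# [(2, [2, 3]), (1, [1, 0])]
```

Because seeds are taken in score order, the first cluster is seeded by the highest-scoring hit, which is the kind of enrichment in the first clusters that the abstract attributes to the order-dependent method.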