Simple physicochemical properties, like logD, solubility, or melting point, can reveal a great deal about how a compound under development might later behave. These data are typically measured for most compounds in drug discovery projects in a medium-throughput fashion. Collecting and assembling all the Bayer in-house data related to these properties allowed us to apply powerful machine learning techniques to predict the outcome of those assays for new compounds. In this paper, we report our finding that, especially for predicting physicochemical ADMET endpoints, a multitask graph convolutional approach appears to be a highly competitive choice. For seven endpoints of interest, we compared the performance of that approach to fully connected neural networks and different single-task models. The new model shows increased predictive performance compared to previous modeling methods and will allow early prioritization of compounds even before they are synthesized. In addition, our model follows the generalized solubility equation without being explicitly trained under this constraint.
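For readers unfamiliar with the generalized solubility equation mentioned above, the following minimal sketch shows its commonly cited form, log S = 0.5 − 0.01 (MP − 25) − log P, with the melting point MP in °C and S as the molar aqueous solubility. The function name `gse_log_solubility` and its parameter names are assumptions introduced for illustration; the abstract does not state which form or which lipophilicity descriptor (logP or logD) the authors used in their comparison.

```python
def gse_log_solubility(log_p: float, melting_point_c: float) -> float:
    """Estimate log10 of molar aqueous solubility from logP and melting point.

    Illustrative only: uses the commonly cited form
    log S = 0.5 - 0.01 * (MP - 25) - log P, with MP in degrees Celsius.
    """
    # Compounds that are liquid at 25 degrees C get no melting-point penalty.
    mp_term = 0.01 * max(melting_point_c - 25.0, 0.0)
    return 0.5 - mp_term - log_p


if __name__ == "__main__":
    # Hypothetical compound: logP = 2.0, melting point = 125 degrees C.
    print(gse_log_solubility(log_p=2.0, melting_point_c=125.0))
```

A model that "follows" this relationship would be expected to predict lower solubility as either the predicted lipophilicity or the predicted melting point increases, roughly in line with the slope of each term.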
For oral drugs, medicinal chemists aim to design compounds with high oral bioavailability, of which permeability is a key determinant. Taking advantage of >2000 compounds tested in rat bioavailability studies and >20,000 compounds tested in Caco2 assays at Bayer, we have examined the molecular properties governing bioavailability and permeability. In addition to classical parameters such as logD and molecular weight, we also investigated the relationship between calculated pKa and permeability. We find that neutral compounds retain permeability up to a molecular weight limit of 700, while stronger acids and bases are restricted to molecular weights of 400−500. We also investigate trends for common properties such as hydrogen bond donors and acceptors, polar surface area, aromatic ring count, and rotatable bonds, including compounds that exceed Lipinski's rule of five (Ro5). These property−structure relationships are combined to provide design guidelines for bioavailable drugs in both traditional and "beyond rule of 5" (bRo5) chemical space.
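As a point of reference for the Ro5 and bRo5 terminology, the sketch below counts violations of Lipinski's rule of five from precomputed descriptors. It uses the standard published cutoffs (molecular weight 500, logP 5, 5 hydrogen bond donors, 10 hydrogen bond acceptors). The `Descriptors` class, the function names, and the convention that more than one violation places a compound in bRo5 space are assumptions made for illustration; they are not the guidelines derived in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float
    log_p: float
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(d: Descriptors) -> List[str]:
    """Return the names of the Lipinski criteria that the compound violates."""
    checks = {
        "molecular weight > 500": d.mol_weight > 500,
        "logP > 5": d.log_p > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]


def is_beyond_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: more than `max_violations` violations means bRo5."""
    return len(ro5_violations(d)) > max_violations


if __name__ == "__main__":
    # Hypothetical descriptor values, not taken from the paper's data set.
    macrocycle = Descriptors(mol_weight=850.0, log_p=6.2,
                             h_bond_donors=4, h_bond_acceptors=12)
    print(ro5_violations(macrocycle), is_beyond_ro5(macrocycle))
```

In practice the descriptor values would come from a cheminformatics toolkit; they are passed in directly here so the example stays self-contained.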