The concept of drug-likeness, an important characteristic for any compound in a screening library, is nevertheless difficult to pin down. Based on our belief that this concept is implicit within the collective experience of working chemists, we devised a data set to capture an intuitive human understanding of both this characteristic and ease of synthesis, a second key characteristic. Five chemists assigned a pair of scores to each of 3980 diverse compounds, with the two scores in each pair corresponding to drug-likeness and ease of synthesis, respectively. Using this data set, we built binary classifiers with an artificial neural network and a support vector machine. These models were found to efficiently eliminate compounds that are not drug-like and/or are hard to synthesize, demonstrating their suitability for use as compound acquisition filters.
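As a rough illustration of how per-chemist scores might be reduced to binary labels and then used as an acquisition filter, consider the minimal sketch below. It is not the procedure used in this work: the 1-to-5 score scale, the mean-based consensus rule, the threshold of 3.0, the compound names, and all function names are hypothetical and chosen only for illustration. Training the neural network and support vector machine on molecular descriptors is not shown.

```python
from statistics import mean

# Assumption: each chemist scores on a 1 (poor) to 5 (good) scale.
# The real scale and consensus rule are not given in the text above.
THRESHOLD = 3.0  # hypothetical cut-off for a positive label


def binary_label(scores, threshold=THRESHOLD):
    """Collapse one compound's per-chemist scores into a 0/1 label."""
    return 1 if mean(scores) >= threshold else 0


def label_compound(drug_scores, synth_scores):
    """Return (drug_like, easy_to_synthesize) labels for one compound."""
    return binary_label(drug_scores), binary_label(synth_scores)


def passes_filter(drug_like, easy_to_synthesize):
    """Keep a compound only if it is positive on both labels."""
    return drug_like == 1 and easy_to_synthesize == 1


# Toy, made-up ratings from five chemists for two compounds:
# (drug-likeness scores, ease-of-synthesis scores)
ratings = {
    "cmpd_A": ([4, 5, 4, 3, 4], [4, 4, 5, 4, 3]),
    "cmpd_B": ([2, 1, 2, 3, 2], [4, 4, 3, 5, 4]),
}

labels = {name: label_compound(d, s) for name, (d, s) in ratings.items()}
kept = [name for name, (d, s) in labels.items() if passes_filter(d, s)]
```

In this toy example a compound is rejected if it fails either criterion, mirroring the "and/or" elimination described above. In practice the labels would come from trained classifiers rather than directly from the scores.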