A computer-based method was developed for the rapid and automatic identification of potential "frequent hitters": compounds that show up as hits in many different biological assays covering a wide range of targets. A scoring scheme was derived from substructure analysis and from multivariate linear and nonlinear statistical methods applied to several sets of one- and two-dimensional molecular descriptors. The final model is based on a three-layered neural network and yields a predictive Matthews correlation coefficient of 0.81. This system correctly classified 90% of the test set molecules in a 10-times cross-validation study. The method was applied to database filtering, flagging between 8% (compilation of trade drugs) and 35% (Available Chemicals Directory) of the compounds as potential frequent hitters. This filter will be a valuable tool for the prioritization of compounds from large databases, for compound purchase and biological testing, and for building new virtual libraries.
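For readers unfamiliar with the two quality measures quoted above, the following minimal sketch shows how a Matthews correlation coefficient and a classification accuracy are computed from the counts of a binary confusion matrix. The function names and the counts are hypothetical and chosen only for illustration; they are not the data or the code of the study.

```python
import math


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Common convention: report 0 when any marginal total is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Fraction of all molecules that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical, balanced test set of 100 molecules (NOT the paper's data):
# 50 frequent hitters and 50 non-hitters, with 5 errors in each class.
tp, tn, fp, fn = 45, 45, 5, 5

mcc = matthews_corrcoef(tp, tn, fp, fn)
acc = accuracy(tp, tn, fp, fn)
print(f"MCC = {mcc:.2f}, accuracy = {acc:.0%}")
```

On a balanced set such as this hypothetical one, 90% accuracy corresponds to a coefficient of 0.8. On strongly imbalanced sets the two measures can diverge considerably, which is one reason both are usually reported.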