A new hybrid method (GNN) combining a genetic algorithm and an artificial neural network has been developed for quantitative structure-activity relationship (QSAR) studies. A suitable set of molecular descriptors is selected by a genetic algorithm. This set serves as input to a neural network, in which model-free mapping of multivariate data is performed. Multiple predictors are generated that are superior to results obtained from previous studies of the Selwood data set, which is used to test the method. The neural network technique provides a graphical description of the functional form of the descriptors that play an important role in determining drug activity. This can serve as an aid in future design of drug analogues. The effectiveness of GNN is tested by comparing its results with a benchmark obtained by exhaustive enumeration. Different fitness strategies that tune the evolution of genetic models are examined, and QSARs with higher predictiveness are found. From these results, a composite model is constructed by averaging predictions from several high-ranking models. The predictions of the resulting QSAR should be more reliable than those derived from a single predictor because the composite makes greater use of information and also permits error estimation. An analysis of the sets of descriptors selected by GNN shows that it is essential to have one each for the steric, electrostatic, and hydrophobic attributes of a drug candidate to obtain a satisfactory QSAR for this data set. This type of result is expected to be of general utility in designing and understanding QSAR.
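
The descriptor-selection step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, population size, mutation rate, and elitist replacement scheme are all assumptions, and a leave-one-out cross-validated least-squares fit stands in for the neural network as the fitness function to keep the example short.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out cross-validated q^2 of a least-squares fit.
    ASSUMPTION: a linear model stands in for the neural network here."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])  # add intercept column
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        press += (y[i] - A[i] @ coef) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def select_descriptors(X, y, k=3, pop_size=20, generations=30,
                       mutation_rate=0.3, seed=0):
    """Hypothetical genetic search over k-descriptor subsets (requires k < X.shape[1]
    and more possible subsets than pop_size). Returns subsets ranked by fitness."""
    rng = np.random.default_rng(seed)
    n_desc = X.shape[1]
    cache = {}  # avoid re-scoring subsets seen before

    def fitness(ind):
        if ind not in cache:
            cache[ind] = loo_q2(X[:, list(ind)], y)
        return cache[ind]

    def random_individual():
        picks = rng.choice(n_desc, size=k, replace=False)
        return tuple(sorted(int(g) for g in picks))

    def crossover(a, b):
        # child draws its k descriptors from the union of both parents
        pool = sorted(set(a) | set(b))
        picks = rng.choice(pool, size=k, replace=False)
        return tuple(sorted(int(g) for g in picks))

    def mutate(ind):
        # swap one descriptor for one not currently in the subset
        genes = list(ind)
        unused = [d for d in range(n_desc) if d not in genes]
        genes[int(rng.integers(k))] = int(rng.choice(unused))
        return tuple(sorted(genes))

    population = {random_individual() for _ in range(pop_size)}
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: max(2, pop_size // 2)]
        next_gen = set(parents)  # elitism: the fitter half survives unchanged
        while len(next_gen) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            child = crossover(parents[i], parents[j])
            if rng.random() < mutation_rate:
                child = mutate(child)
            next_gen.add(child)
        population = next_gen
    return sorted(population, key=fitness, reverse=True)
```

Because the function returns the whole final population in rank order, the several high-ranking subsets needed for a composite model are available directly; averaging the predictions of models built on them would be a separate step.
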
The artificial neural network (ANN), or simply neural network, is a machine learning method evolved from the idea of simulating the human brain. The data explosion in modern drug discovery research requires sophisticated analysis methods to uncover the hidden causal relationships between single or multiple responses and a large set of properties. The ANN is one of many versatile tools to meet the demand in drug discovery modeling. Compared to a traditional regression approach, the ANN is capable of modeling complex nonlinear relationships. The ANN also has excellent fault tolerance and is fast and highly scalable with parallel processing. This chapter introduces the background of ANN development and outlines the basic concepts crucially important for understanding more sophisticated ANNs. Several commonly used learning methods and network setups are discussed briefly at the end of the chapter.
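
To make the contrast with linear regression concrete, the sketch below trains a very small feed-forward network by plain gradient descent. It is an illustration under assumed settings (one hidden layer of tanh units, mean squared error, hypothetical names and hyperparameters), not a method taken from the chapter.

```python
import numpy as np

def train_mlp(x, y, hidden=8, lr=0.1, epochs=3000, seed=0):
    """Fit a one-hidden-layer tanh network to (x, y) with full-batch
    gradient descent. x and y are arrays of shape (n, 1).
    All hyperparameters are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(1, hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, 1))
    b2 = np.zeros(1)
    n = len(x)
    losses = []
    for _ in range(epochs):
        # forward pass
        h = np.tanh(x @ W1 + b1)       # hidden activations, shape (n, hidden)
        pred = h @ W2 + b2             # network output, shape (n, 1)
        err = pred - y
        losses.append(float(np.mean(err ** 2)))
        # backward pass for the mean squared error
        g_pred = 2.0 * err / n
        g_W2 = h.T @ g_pred
        g_b2 = g_pred.sum(axis=0)
        g_z = (g_pred @ W2.T) * (1.0 - h ** 2)  # tanh'(z) = 1 - tanh(z)^2
        g_W1 = x.T @ g_z
        g_b1 = g_z.sum(axis=0)
        # gradient descent update
        W1 -= lr * g_W1
        b1 -= lr * g_b1
        W2 -= lr * g_W2
        b2 -= lr * g_b2

    def predict(x_new):
        return np.tanh(x_new @ W1 + b1) @ W2 + b2

    return predict, losses

# A target a straight line cannot follow: y = x^2 on a symmetric interval.
x = np.linspace(-1.0, 1.0, 50).reshape(-1, 1)
y = x ** 2
predict, losses = train_mlp(x, y)
```

On this symmetric interval the best straight-line fit to the parabola is simply its mean, so comparing `losses[-1]` with `np.var(y)` gives a quick check on how much of the nonlinearity the network has captured.
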
An alternative method for determining structure-activity correlations is presented. Ligand molecules are described using data matrices derived from the results of N by N (each molecule compared to every other) molecular similarity calculations. The matrices were analyzed using a neural network pattern recognition technique and partial least squares statistics, with the results obtained compared to those achieved using comparative molecular field analysis (CoMFA). The molecular series used in the study comprised 31 steroids. The resultant pattern recognition analysis showed clustering of compounds with high, intermediate, and low affinity into separate regions of the neuron output plots. The cross-validated correlation coefficients obtained from statistical analyses of the matrices against steroid binding data compared well with those achieved using CoMFA. These results show that data matrices derived from molecular similarity calculations can provide the basis for rapid elucidation of both qualitative and quantitative structure-activity relationships.
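
The N by N description can be illustrated with a short sketch. The abstract does not state which similarity index was used, so the Carbó-type index below is an assumption, as are the function name and the idea of representing each molecule by a vector of property values sampled on a common grid.

```python
import numpy as np

def carbo_similarity_matrix(P):
    """Return the N x N matrix of Carbo-type similarity indices
        R_AB = sum(P_A * P_B) / sqrt(sum(P_A^2) * sum(P_B^2))
    for N molecules, where row A of P holds hypothetical property values
    (for example an electrostatic potential) sampled at the same grid points."""
    P = np.asarray(P, dtype=float)
    norms = np.linalg.norm(P, axis=1)
    return (P @ P.T) / np.outer(norms, norms)

# Example with made-up data: 5 molecules, 20 grid points each.
rng = np.random.default_rng(0)
S = carbo_similarity_matrix(rng.normal(size=(5, 20)))
```

Row A of `S` then describes molecule A by its similarity to every molecule in the series, and the rows can be passed to a pattern recognition or partial least squares analysis in place of conventional descriptors.
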
A novel tool, called a genetic neural network (GNN), has been developed for obtaining quantitative structure-activity relationships (QSAR) for high-dimensional data sets (J. Med. Chem. 1996, 39, 1521-1530). The GNN method uses a neural network to correlate activity with descriptors that are preselected by a genetic algorithm. To provide an extended test of the GNN method, the data on 57 benzodiazepines given by Maddalena and Johnston (MJ; J. Med. Chem. 1995, 38, 715-724) have been examined with an enhanced version of GNN, and the results are compared with the excellent QSAR of MJ. The problematic steepest descent training has been replaced by the scaled conjugate gradient algorithm. This leads to a substantial gain in performance in both robustness of prediction and speed of computation. The cross-validation GNN simulation and the subsequent run based on an unbiased and more efficient protocol led to the discovery of other 10-descriptor QSARs that are superior to the best model of MJ based on backward elimination selection and neural network training. Results from a series of GNNs with a different number of inputs showed that a neural network with fewer inputs can produce QSARs as good as or even better than those with higher dimensions. The top-ranking models from a GNN simulation using only six input descriptors are presented, and the chemical significance of the chosen descriptors is discussed. The statistical significance of these GNN QSARs is validated. The best QSARs are used to provide a graphical tool that aids the design of new drug analogues. By replacing functional groups at the 7- and 2'-positions with ones that have optimal substituent parameters, a number of new benzodiazepines with high potency are predicted.