Fast, accurate predictions of 1H NMR spectra of organic compounds play an important role in structure validation, automatic structure elucidation, or calibration of chemometric methods. The SPINUS program is a feed-forward neural network (FFNN) system, developed over the last 8 years, for predicting 1H NMR properties from molecular structure. It was trained using a series of empirical proton descriptors. Ensembles of FFNNs were incorporated into associative neural networks (ASNNs), which correct a prediction on the basis of the errors observed for its k nearest neighbors in an additional memory. Here we show a procedure for estimating coupling constants with the ASNNs trained for chemical shifts: a second memory is linked, consisting of coupled protons and their experimental coupling constants. An ASNN finds the pairs of coupled protons most similar to a query, and these are used to estimate the coupling constants. Using a diverse general data set of 618 coupling constants, mean absolute errors of 0.6-0.8 Hz were achieved in different experiments. A Web interface for full-spectrum 1H NMR prediction is available at http://www.dq.fct.unl.pt/spinus.
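The lookup in the second memory can be sketched roughly as follows. This is an illustrative sketch only, not the SPINUS implementation: the numeric profile describing each pair of coupled protons, the Euclidean distance used as the similarity measure, the plain averaging of neighbors, the default k, and the name `estimate_coupling` are all assumptions made for the example.

```python
from math import dist
from statistics import mean


def estimate_coupling(query_profile, memory, k=3):
    """Estimate a coupling constant (Hz) from the k most similar stored pairs.

    query_profile: numeric vector describing the query pair of coupled protons
                   (hypothetical representation, assumed for this sketch).
    memory:        list of (profile, experimental_j_hz) tuples.
    """
    # Rank stored pairs by Euclidean distance to the query (assumed similarity).
    ranked = sorted(memory, key=lambda item: dist(query_profile, item[0]))
    nearest = ranked[:k]
    # Use the unweighted mean of the neighbors' experimental coupling constants.
    return mean(j for _, j in nearest)


# Toy memory: the profiles and J values are invented for illustration.
toy_memory = [
    ([1.0, 2.0, 3.0, 4.0], 7.0),
    ([1.1, 2.1, 2.9, 4.2], 8.0),
    ([4.0, 3.0, 2.0, 1.0], 2.0),
    ([1.0, 3.0, 2.0, 4.0], 12.0),
]
estimate = estimate_coupling([1.0, 2.0, 3.0, 4.0], toy_memory, k=2)
```

With k=2 the two closest stored pairs contribute, so the estimate is the mean of their experimental values.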
Feed-forward neural networks were trained for the general prediction of 1H NMR chemical shifts of CHn protons in organic compounds in CDCl3. The training set consisted of 744 1H NMR chemical shifts from 120 molecular structures. The method was optimized in terms of the selected proton descriptors (selection of variables), the number of hidden neurons, and the integration of different networks into ensembles. Predictions were obtained for an independent test set of 952 cases with a mean average error of 0.29 ppm (0.20 ppm for 90% of the cases). The results were significantly better than those obtained with counterpropagation neural networks.
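The ensemble step and the reported error figures can be illustrated with a short sketch. It assumes that an ensemble prediction is the plain average of the individual network outputs and that the figure "for 90% of the cases" refers to the 90% of cases with the smallest absolute errors; neither detail is stated in the abstract, and all function names are hypothetical.

```python
from statistics import mean


def ensemble_predict(network_outputs):
    """Combine the shifts predicted by several networks (assumed: simple mean)."""
    return mean(network_outputs)


def mean_absolute_error(predicted, observed):
    """Mean of |predicted - observed| over all cases, in ppm."""
    return mean(abs(p - o) for p, o in zip(predicted, observed))


def mae_best_fraction(predicted, observed, fraction=0.9):
    """Mean absolute error over the best-predicted fraction of cases.

    Assumed reading of "for 90% of the cases": keep the cases with the
    smallest absolute errors and average only those.
    """
    errors = sorted(abs(p - o) for p, o in zip(predicted, observed))
    keep = max(1, round(len(errors) * fraction))
    return mean(errors[:keep])


# Toy data: nine small errors and one outlier, invented for illustration.
observed = [0.0] * 10
predicted = [1.0] * 9 + [11.0]
overall = mean_absolute_error(predicted, observed)
trimmed = mae_best_fraction(predicted, observed, fraction=0.9)
```

Comparing the two values shows how a few badly predicted protons can dominate the overall figure.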
Two different ways were explored to incorporate newly available experimental data into previously trained ensembles of feed-forward neural networks for the structure-based prediction of 1H NMR chemical shifts of organic compounds. One approach used the new data as the memory of an associative neural network (ASNN) system. For an independent prediction set of 952 cases, a mean average error of 0.19 ppm was achieved (0.13 ppm for 90% of the cases). This approach advantageously avoids retraining the networks, and the predictions compared favorably with those obtained by available commercial software packages. Excellent predictions could also be achieved by retraining the networks with the new data, but only if the training sets were selected so as to be balanced or if the retraining started with the weights of the previously trained networks.
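The memory-based approach can be sketched as a nearest-neighbor correction applied on top of an unchanged ensemble. The sketch assumes, for illustration, that similarity between two protons is the Pearson correlation of their vectors of individual network outputs, and that the correction is the unweighted mean residual of the k most similar memory cases; the function names and toy values are hypothetical and are not taken from the paper.

```python
from math import sqrt
from statistics import mean


def correlation(a, b):
    """Pearson correlation between two equal-length output vectors."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0  # constant vector: treat as uncorrelated
    return cov / sqrt(va * vb)


def asnn_correct(query_outputs, memory, k=2):
    """Correct an ensemble prediction using residuals stored in memory.

    query_outputs: shifts predicted for the query by each network.
    memory:        list of (network_outputs, experimental_shift) tuples
                   built from the new data; the networks are not retrained.
    """
    base = mean(query_outputs)  # uncorrected ensemble prediction
    ranked = sorted(memory,
                    key=lambda case: correlation(query_outputs, case[0]),
                    reverse=True)
    # Residual = experimental value minus ensemble prediction for that case.
    residuals = [exp - mean(outs) for outs, exp in ranked[:k]]
    return base + mean(residuals)


# Toy memory with invented values (ppm).
toy_memory = [
    ([6.9, 7.1, 7.3], 7.3),
    ([3.0, 3.2, 3.4], 3.5),
    ([2.4, 2.2, 2.0], 1.0),
]
corrected = asnn_correct([7.0, 7.2, 7.4], toy_memory, k=2)
```

Because only the memory changes when new data arrive, adding cases amounts to appending to the list.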