The ¹³C NMR chemical shift of sp³ carbon atoms situated in the α position relative to the double bond in acyclic alkenes was estimated with multilayer feedforward artificial neural networks (ANNs) and multilinear regression (MLR), using as structural descriptors a topo-stereochemical code that characterizes the environment of the resonating carbon atom. The predictive ability of the two models was tested by the leave-20%-out cross-validation method. The neural model provides better results than the MLR model in both calibration and cross-validation, demonstrating that a nonlinear relationship exists between the structural descriptors and the investigated ¹³C NMR chemical shift and that the neural model is capable of capturing such a relationship in a simple and effective way. A comparison between a general model for the estimation of the ¹³C NMR chemical shift and the ANN model indicates that general models are outperformed by more specific models, and that one possible way to improve the predictions is to develop environment-specific models. The approach proposed in this paper can be used in automated spectra interpretation or computer-assisted structure elucidation to constrain the number of possible candidates generated from the experimental spectra.
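The leave-20%-out procedure mentioned above can be illustrated with a minimal sketch, shown here for the MLR case only. It assumes a random split into five equally sized groups, an ordinary least-squares fit with an intercept, and a root-mean-square residual as the error measure; the abstract does not state how the groups were formed or which statistic was reported. The function names and the synthetic data are hypothetical and are not taken from the paper.

```python
import numpy as np


def leave_20_percent_out(X, y, n_folds=5, seed=0):
    """Cross-validated MLR predictions: each sample is predicted by a
    model fitted on the other ~80% of the data (assumed random split)."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    A = np.column_stack([np.ones(len(y)), X])  # prepend intercept column
    y_pred = np.empty(len(y), dtype=float)
    for test_idx in folds:
        train_mask = np.ones(len(y), dtype=bool)
        train_mask[test_idx] = False  # hold out this 20% group
        coef, *_ = np.linalg.lstsq(A[train_mask], y[train_mask], rcond=None)
        y_pred[test_idx] = A[test_idx] @ coef
    return y_pred


def rms_residual(y, y_pred):
    """Root-mean-square residual (a stand-in for the reported statistic)."""
    return float(np.sqrt(np.mean((y - y_pred) ** 2)))


# Synthetic, noise-free linear data (not real chemical shifts).
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(50, 3))
y_demo = 30.0 + X_demo @ np.array([2.0, -1.0, 0.5])
cv_pred = leave_20_percent_out(X_demo, y_demo)
print(rms_residual(y_demo, cv_pred))
```

Because the synthetic data are exactly linear, the cross-validated residual is close to zero; with real shift data, the gap between calibration and cross-validation residuals is what indicates how well a model generalizes.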
The ¹³C NMR chemical shift of sp² carbon atoms in acyclic alkenes was estimated with multilayer feedforward artificial neural networks (ANNs) and multilinear regression (MLR), using as structural descriptors a vector of 12 components encoding the environment of the resonating carbon atom. The neural network quantitative model provides better results than the MLR model calibrated with the same data. The predictive ability of both the ANN and MLR models was tested by the leave-20%-out (L20%O) cross-validation method, demonstrating the superior performance of the neural model. The number of neurons in the hidden layer was varied between 2 and 7, and three activation functions were tested in the neural model: the hyperbolic tangent or a bell-shaped function for the hidden layer and a linear or a hyperbolic tangent function for the output layer. All four combinations of activation functions give close results in the calibration of the ANN model. In prediction, a linear output function performs better than a hyperbolic tangent one, but from a statistical point of view no particular combination can be preferred over the others. For the ANNs with four neurons in the hidden layer, the standard deviation for calibration ranges between 0.59 and 0.63 ppm, while for prediction it lies between 0.89 and 1.07 ppm. We propose a parallel use of the four ANNs for the prediction of unknown shifts, because the mean of the four predictions exhibits a smaller number of outliers with lower residuals. The present model is compared with three additive schemes for the calculation of the sp² ¹³C NMR chemical shifts, and the statistical analysis of the results demonstrates that the ANN model gives better predictions than the classical ones.
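The parallel use of the four networks can be sketched as follows. This is an illustration only: it assumes a single hidden layer of four neurons fed by the 12-component descriptor, takes 1/(1 + z²) as the bell-shaped function (the abstract does not give its formula), and uses random placeholder weights instead of trained ones. All names are hypothetical.

```python
import numpy as np


def bell(z):
    # Assumed bell-shaped form; the abstract does not specify the formula.
    return 1.0 / (1.0 + z ** 2)


def identity(z):
    return z


HIDDEN = {"tanh": np.tanh, "bell": bell}
OUTPUT = {"linear": identity, "tanh": np.tanh}


def forward(x, W1, b1, W2, b2, hidden_act, output_act):
    """One-hidden-layer feedforward pass: 12 inputs -> 4 hidden -> 1 output."""
    h = hidden_act(x @ W1 + b1)
    # A tanh output is bounded, so real shifts would need to be rescaled.
    return output_act(h @ W2 + b2)


def ensemble_predict(x, networks):
    """Return the mean prediction and the individual network predictions."""
    preds = np.array([
        forward(x, *net["weights"], HIDDEN[net["hidden"]], OUTPUT[net["output"]])
        for net in networks
    ])
    return preds.mean(axis=0), preds


rng = np.random.default_rng(0)


def random_weights():
    # Placeholder weights; a real model would use trained values.
    return (rng.normal(size=(12, 4)), rng.normal(size=4),
            rng.normal(size=4), rng.normal())


# One network per combination of hidden and output activation (2 x 2 = 4).
networks = [{"hidden": h, "output": o, "weights": random_weights()}
            for h in HIDDEN for o in OUTPUT]

x_demo = rng.integers(0, 3, size=(3, 12)).astype(float)  # 3 made-up descriptors
mean_pred, all_preds = ensemble_predict(x_demo, networks)
print(mean_pred)
```

Averaging damps the effect of a single network that mispredicts a given carbon atom, which is consistent with the smaller number of outliers reported for the mean of the four predictions.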