In this paper, we report on the potential of a recently developed neural network for structures, applied to the prediction of physicochemical properties of compounds. The proposed recursive neural network (RecNN) model takes a structured representation of the molecule directly as input and models a direct, adaptive relationship between molecular structure and target property. It therefore combines, in a single learning system, the flexibility and general advantages of a neural network model with the representational power of a structured domain. The result is a completely new approach to quantitative structure-property relationship/quantitative structure-activity relationship (QSPR/QSAR) analysis. An original representation of the molecular structures has been developed that accounts for both the occurrence of specific atoms/groups and the topological relationships among them. The Gibbs free energy of solvation in water, ΔsolvG°, has been chosen as a benchmark for the model. The different approaches proposed in the literature for the prediction of this property have been reconsidered from a general perspective. The advantages of RecNN as a suitable tool for automating fundamental parts of the QSPR/QSAR analysis have been highlighted. The RecNN model has been applied to the analysis of the ΔsolvG° in water of 138 monofunctional acyclic organic compounds and tested on an external data set of 33 compounds. For the predictive accuracy estimated on the test set, the statistical analysis gave a correlation coefficient R = 0.9985, a standard deviation S = 0.68 kJ mol⁻¹, and a mean absolute error MAE = 0.46 kJ mol⁻¹. The inherent ability of RecNN to abstract chemical knowledge through the adaptive learning process has been investigated by principal components analysis of the internal representations computed by the network. It has been found that the model recognizes the chemical compounds on the basis of a nontrivial combination of their chemical structure and target property.
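The idea of feeding a structured molecule directly to a network can be illustrated with a minimal sketch. It assumes the molecule is given as a rooted tree of atom/group labels and that each node's state is computed from its own label and the states of its children. The names `Node` and `TinyRecNN`, the label set, the dimensions, and the tree chosen for ethanol are all hypothetical; the weights are random and untrained, so this shows only the bottom-up encoding, not the model or the molecular representation used in the paper.

```python
import math
import random
from dataclasses import dataclass, field


@dataclass
class Node:
    """One vertex of a rooted tree; `label` is an atom/group symbol (hypothetical)."""
    label: str
    children: list = field(default_factory=list)


class TinyRecNN:
    """Illustrative recursive encoder with random, untrained weights."""

    def __init__(self, labels, state_dim=3, max_children=2, seed=0):
        rng = random.Random(seed)
        self.labels = list(labels)
        self.state_dim = state_dim
        self.max_children = max_children
        # Input = one-hot label + one state slot per child position + bias.
        in_dim = len(self.labels) + max_children * state_dim + 1
        self.W = [[rng.uniform(-0.5, 0.5) for _ in range(in_dim)]
                  for _ in range(state_dim)]
        self.w_out = [rng.uniform(-0.5, 0.5) for _ in range(state_dim + 1)]

    def encode(self, node):
        """Return the state vector of `node`, computed bottom-up over its subtree."""
        if len(node.children) > self.max_children:
            raise ValueError("node has more children than max_children")
        x = [1.0 if node.label == lab else 0.0 for lab in self.labels]
        for k in range(self.max_children):
            if k < len(node.children):
                x.extend(self.encode(node.children[k]))  # recurse into child k
            else:
                x.extend([0.0] * self.state_dim)         # empty child slot
        x.append(1.0)                                    # bias input
        return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in self.W]

    def predict(self, root):
        """Map the root state to a scalar property with a linear output unit."""
        h = self.encode(root) + [1.0]
        return sum(w * v for w, v in zip(self.w_out, h))


# Hypothetical tree for ethanol, rooted at the functional group.
ethanol = Node("OH", [Node("CH2", [Node("CH3")])])
model = TinyRecNN(labels=["CH3", "CH2", "OH"])
code = model.encode(ethanol)         # internal representation of the whole tree
prediction = model.predict(ethanol)  # meaningless until the weights are trained
```

In this sketch the root state plays the role of the internal representation that the abstract analyzes by principal components analysis; in a trained model the encoder and output weights would be fitted jointly to the target property.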
Abstract: We introduce a kernel for structured data that extends the Fisher kernel used for sequences [11]. In our approach, we extract the Fisher score vectors from a Bayesian network, specifically a Hidden Tree Markov Model [6], which can be constructed from the training data. Experiments on a QSPR (quantitative structure-property relationship) analysis, where instances are naturally represented as trees, provide a first test of the approach.
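For reference, the standard definition of the Fisher kernel that this abstract builds on can be written as below. The notation is an assumption made for illustration and is not taken from the paper: θ stands for the parameters of the generative model (here, the Hidden Tree Markov Model), x and x' are two input trees, U_x is the Fisher score vector, and I is the Fisher information matrix.

```latex
\begin{align}
  U_x &= \nabla_{\theta} \log P(x \mid \theta), \\
  I &= \mathbb{E}_{x \sim P(\cdot \mid \theta)}\!\left[ U_x U_x^{\top} \right], \\
  K(x, x') &= U_x^{\top} \, I^{-1} \, U_{x'} .
\end{align}
```

A common practical simplification replaces I with the identity matrix, so that the kernel reduces to the inner product of the two score vectors; whether the paper does so is not stated in the abstract.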