The prediction of a molecule’s solvation Gibbs
free energy (ΔG_solv) in a given
solvent is an important
task which has traditionally been carried out via quantum chemical
continuum methods or force field-based molecular simulations. Machine
learning (ML), and graph neural networks in particular, have emerged
as powerful techniques for elucidating structure–property relationships.
This work presents a graph neural network (GNN) for the prediction
of ΔG_solv which, in addition to
encoding typical atom- and bond-level features, incorporates chemically
intuitive, solvation-relevant parameters into the featurization process:
semiempirical partial atomic charges and solvent dielectric constant.
Solute–solvent interactions are included via an interaction
map layer which can be visualized to examine solubility-enhancing
or -decreasing interactions learnt by the model. On a test set of
small organic molecules, our GNN predicts ΔG_solv in water and cyclohexane with an accuracy comparable
to polarizable and ab initio generated force field methods [mean absolute
error (MAE) = 0.4 and 0.2 kcal mol⁻¹, respectively],
without the need for any molecular simulation. For the FreeSolv data
set of hydration free energies, the test MAE is 0.7 kcal mol⁻¹. The interpretability and applicability of the model are highlighted
through several examples including rationalizing the increased solubility
of modified diaminoanthraquinones in organic solvents. The clear explanations
afforded by our GNN allow for easy understanding of the model’s
predictions, giving the experimental chemist confidence in employing
ML models toward more optimized synthetic routes.
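
The abstract does not state how the interaction map layer is computed. As a rough illustration only, the sketch below assumes one common formulation: each entry of the map is a squashed dot product between a solute-atom embedding and a solvent-atom embedding, and the map is then used to build a context vector for every atom from the atoms of the other molecule. The function names, the `tanh` squashing, and the random embeddings are all hypothetical and are not taken from this work.

```python
import numpy as np


def interaction_map(solute_feats, solvent_feats):
    """Pairwise solute-solvent atom scores (assumed form, not the paper's).

    solute_feats:  array of shape (n_solute_atoms, d)
    solvent_feats: array of shape (n_solvent_atoms, d)
    Returns an array of shape (n_solute_atoms, n_solvent_atoms).
    """
    # Entry (i, j) is the dot product of solute atom i and solvent atom j,
    # squashed to [-1, 1] so the map can be plotted on a fixed colour scale.
    return np.tanh(solute_feats @ solvent_feats.T)


def interaction_context(solute_feats, solvent_feats):
    """Use the map to give each atom a summary of the other molecule."""
    imap = interaction_map(solute_feats, solvent_feats)
    solute_ctx = imap @ solvent_feats      # (n_solute_atoms, d)
    solvent_ctx = imap.T @ solute_feats    # (n_solvent_atoms, d)
    return imap, solute_ctx, solvent_ctx


# Stand-in embeddings; in a real model these would come from the
# message-passing layers of the solute and solvent graphs.
rng = np.random.default_rng(0)
solute = rng.normal(size=(5, 8))    # 5 solute atoms, 8 features each
solvent = rng.normal(size=(3, 8))   # 3 solvent atoms, 8 features each

imap, solute_ctx, solvent_ctx = interaction_context(solute, solvent)
print(imap.shape)
```

Under this assumed form, plotting `imap` as a heat map with solute atoms on one axis and solvent atoms on the other is the kind of visualization the abstract refers to. Whether the sign of an entry corresponds to a solubility-enhancing or -decreasing interaction depends on the trained model and is not something this sketch can show.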