2021
DOI: 10.3390/molecules26206185

Improved Lipophilicity and Aqueous Solubility Prediction with Composite Graph Neural Networks

Abstract: The accurate prediction of molecular properties, such as lipophilicity and aqueous solubility, are of great importance and pose challenges in several stages of the drug discovery pipeline. Machine learning methods, such as graph-based neural networks (GNNs), have shown exceptionally good performance in predicting these properties. In this work, we introduce a novel GNN architecture, called directed edge graph isomorphism network (D-GIN). It is composed of two distinct sub-architectures (D-MPNN, GIN) and achiev…

Cited by 23 publications
(20 citation statements)
References 15 publications
“…Numerous computational approaches for the prediction of log P have been developed. We provide a brief overview here and point toward more detailed descriptions elsewhere. Computational approaches to log P prediction can be grouped into (i) empirical and (ii) physics-based methods. Empirical methods (i) include contribution-type approaches (atom- or fragment-based), QSAR approaches, and deep learning approaches trained on experimental data. Contribution-type approaches obtain a log P estimate by dividing molecules into either individual atoms or fragments and summing up their contributions, using correction terms in the latter case.…”
Section: Introduction (mentioning)
confidence: 99%
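The contribution-type approach described in the statement above can be sketched in a few lines of Python. This is a minimal illustration only: the fragment names, the contribution values, the correction values, and the `estimate_log_p` helper are all hypothetical placeholders chosen for readability, not parameters from any published log P method.

```python
from collections import Counter

# Hypothetical per-fragment contributions (illustrative placeholders,
# NOT fitted values from any real log P scheme).
FRAGMENT_CONTRIBUTIONS = {"CH3": 0.5, "CH2": 0.4, "OH": -1.0}

# Hypothetical correction terms, applied once per molecule when the
# named fragment is present (fragment-based schemes use such corrections).
CORRECTIONS = {"OH": 0.1}


def estimate_log_p(fragments):
    """Sum fragment contributions, then add any applicable corrections."""
    counts = Counter(fragments)
    base = sum(FRAGMENT_CONTRIBUTIONS[name] * n for name, n in counts.items())
    correction = sum(
        value for name, value in CORRECTIONS.items() if name in counts
    )
    return base + correction


# A molecule already divided into fragments (the fragmentation step itself
# is assumed to be done elsewhere).
ethanol_like = ["CH3", "CH2", "OH"]
print(estimate_log_p(ethanol_like))
```

An atom-based variant would use the same summation with per-atom keys and, per the statement above, no correction terms.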
“…As a result of regression, 18 out of 29 test sets showed R² > 0.80 (as can be seen in Figure S14). This suggests that solubility between unknown compounds can be still predicted using one of our models, as opposed to approaches in previous studies where individual models were trained with sufficient data for solvents.…”
Section: Results and Discussion (mentioning)
confidence: 99%
“…For example, DGraphDTA is a multi-input network for drug–target affinity prediction, and the combination of a GCN and graph attention networks (GATs) further improves the model. Recently, the prediction of aqueous solubility using a graph-based message passing network, directed edge graph isomorphism network, and multilevel GCN has been reported. However, in the previous studies mentioned above, it is common to predict the solubility using an individual data set and a model for one solvent in the previous studies, so there is a limitation because sufficient data is required. Therefore, in this study, a methodology for predicting the solubility between solute–solvents of various combinations was proposed.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, graph-based deep learning (DL) methods have attracted a multitude of attention and manifested remarkable effects on drug discovery ranging from molecular property prediction to drug virtual screening. These methods are capable of learning suitable molecular representations directly from chemical graphs in an end-to-end fashion. Related works in lipophilicity and solubility prediction have verified the superiority of graph representation learning models, including undirected graph recursive neural networks (UGRNN) and a variety of graph neural networks (GNN).…”
Section: Introduction (mentioning)
confidence: 99%
“…In this context, we can capitalize on prior domain knowledge to boost model performance. Since log D and log S_w are highly correlated to log P, some studies have successfully attained better prediction via multitask learning. In 2020, Lukashina et al. introduced a substructure encoder representing functional groups as hyperatoms to provide complementary information for the directed message passing neural network (D-MPNN). The proposed StructGNN model outperformed D-MPNN on log P and log D predictions.…”
Section: Introduction (mentioning)
confidence: 99%