2022
DOI: 10.1002/minf.202100321

Benchmarking Accuracy and Generalizability of Four Graph Neural Networks Using Large In Vitro ADME Datasets from Different Chemical Spaces

Abstract: In this work, we benchmark a variety of single- and multi-task graph neural network (GNN) models against lower-bar and higher-bar traditional machine learning approaches employing human-engineered molecular features. We consider four GNN variants: Graph Convolutional Network (GCN), Graph Attention Network (GAT), Message Passing Neural Network (MPNN), and Attentive Fingerprint (AttentiveFP). So far deep learning models have been primarily benchmarked using lower-bar traditional models solely based on fingerprints…

Cited by 15 publications (8 citation statements)
References 22 publications

Citation statements
“…Numerous computational approaches for the prediction of log P have been developed. We provide a brief overview here and point toward more detailed descriptions elsewhere. Computational approaches to log P prediction can be grouped into (i) empirical and (ii) physics-based methods. Empirical methods (i) include contribution-type approaches (atom- or fragment-based), QSAR approaches, and deep learning approaches trained on experimental data. Contribution-type approaches obtain a log P estimate by dividing molecules into either individual atoms or fragments and summing up their contributions, using correction terms in the latter case.…”
Section: Introduction (mentioning)
confidence: 99%
“…Although the GSE is easy to use and requires no training, it is based on certain assumptions and performs poorly on large molecules. Additionally, since the experimental data of log P and pKa is scarce, the calculation of log D or log S_w using predicted values usually leads to amplified errors. Another solution is using statistical machine learning (ML) algorithms to train a customized model, such as Gaussian Process Regression, Support Vector Machine, Random Forest, ExtraTrees, eXtreme Gradient Boosting (XGBoost), and Deep Neural Networks. The descriptor-based models have made remarkable progress in molecular property prediction, but they heavily rely on expert knowledge for the design and selection of informative descriptors, possibly introducing expert bias. Moreover, given the fixed-size vectors as input, the tailored models cannot fully utilize molecular structure information, easily causing overfitting and limited generalization capability to unseen data.…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, graph-based deep learning (DL) methods have attracted a multitude of attention and manifested remarkable effects on drug discovery ranging from molecular property prediction to drug virtual screening. These methods are capable of learning suitable molecular representations directly from chemical graphs in an end-to-end fashion. Related works in lipophilicity and solubility prediction have verified the superiority of graph representation learning models, including undirected graph recursive neural networks (UGRNN) and a variety of graph neural networks (GNN).…”
Section: Introduction (mentioning)
confidence: 99%
“…Whereas image recognition techniques per se have not been widely adopted for bioactivity prediction problems [8,17], the development of efficient molecular feature extraction methods can roughly be divided into a static, structure-based descriptor method that encodes atom and bond features [18][19][20][21][22] and a dynamic graph neural network (GNN) approach that learns molecular features within the context of the data to be modeled [23][24][25][26]. This latter method represents an efficient end-to-end deep learning method as the learnt, extracted features capture all features important to the modeled data, but may not be applicable for other datasets [27].…”
Section: Introduction (mentioning)
confidence: 99%