2023
DOI: 10.1186/s13321-023-00708-w

Exploring QSAR models for activity-cliff prediction

Abstract: Introduction and methodology. Pairs of similar compounds that only differ by a small structural modification but exhibit a large difference in their binding affinity for a given target are known as activity cliffs (ACs). It has been hypothesised that QSAR models struggle to predict ACs and that ACs thus form a major source of prediction error. However, the AC-prediction power of modern QSAR methods and its quantitative relationship to general QSAR-prediction performance is still underexplored. W…

Cited by 16 publications (15 citation statements)
References 48 publications
“…Among them, graph neural networks (GNNs) that utilize graph-based representations have gained significant popularity in predicting molecular properties [3,5,6], as they enable a more direct and interpretable capture of the structural and relational information of molecules compared to traditional descriptors/fingerprints-based ML models. However, contrary to common belief, recent studies have shown that traditional ML methods based on descriptors/fingerprints outperform state-of-the-art end-to-end DL approaches in molecular property prediction [9][10][11], particularly in the prediction of molecular activity with activity cliffs (ACs) for GNN models [10,11]. For example, van Tilborg et al. (2022) [10] conducted a comprehensive study comparing the performance of graph-based DL models (GAT, GCN, AFP, and MPNN), SMILES string-based DL models (Transformer, CNN, and LSTM), and fingerprint-based ML models (MLP, KNN, GBM, RF, and SVM) on 30 activity benchmark datasets.…”
mentioning
confidence: 78%
“…QSAR models, which assume that similar compounds have similar toxicity outcomes, often struggle to accurately predict activity cliffs (ACs) where this assumption does not hold. ACs refer to pairs of small molecules that share high structural similarity but display a significant difference in their binding affinity toward a specific pharmacological target. Despite the difficulties posed by ACs, they provide a substantial understanding of the structure–activity relationship (SAR), which holds great value.…”
Section: QSAR and Its Limitations
mentioning
confidence: 99%
“…For example, AC prediction by QSAR models generally improved when one compound’s activity is known. Also, graph isomorphism features performed competitively for AC classification, while extended-connectivity fingerprints were best for general QSAR prediction. Traditional methods use a fixed potency difference, regardless of the target.…”
Section: QSAR and Its Limitations
mentioning
confidence: 99%
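The AC definition quoted above (high structural similarity combined with a large, fixed potency difference) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and is not taken from the cited papers: fingerprints are modelled as plain sets of "on" bit indices, similarity is the Tanimoto coefficient, the similarity cutoff is 0.9, the fixed potency gap is 2 log units (100-fold), and the compound names and values are invented toy data.

```python
from itertools import combinations


def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) similarity between two sets of fingerprint bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def is_activity_cliff(fp_a, fp_b, pki_a, pki_b,
                      sim_threshold=0.9, potency_gap=2.0):
    """Label a pair as an AC: similar structures, large potency difference.

    Both thresholds are illustrative assumptions (a fixed, target-independent
    potency gap, as in the 'traditional' definition mentioned above).
    """
    similar = tanimoto(fp_a, fp_b) >= sim_threshold
    large_gap = abs(pki_a - pki_b) >= potency_gap
    return similar and large_gap


def find_cliffs(compounds):
    """Return all compound-name pairs that satisfy the AC criteria."""
    cliffs = []
    for name_a, name_b in combinations(sorted(compounds), 2):
        fp_a, pki_a = compounds[name_a]
        fp_b, pki_b = compounds[name_b]
        if is_activity_cliff(fp_a, fp_b, pki_a, pki_b):
            cliffs.append((name_a, name_b))
    return cliffs


# Hypothetical toy data: name -> (fingerprint bits, pKi value).
compounds = {
    "cpd_1": (frozenset(range(0, 20)), 8.5),
    "cpd_2": (frozenset(range(1, 20)), 6.0),   # one bit fewer than cpd_1
    "cpd_3": (frozenset(range(0, 21)), 8.3),   # one bit more than cpd_1
    "cpd_4": (frozenset(range(30, 50)), 5.0),  # structurally unrelated
}

if __name__ == "__main__":
    for pair in find_cliffs(compounds):
        print(pair)
```

In this toy set, cpd_1 and cpd_3 are similar but nearly equipotent, so they do not form a cliff, while cpd_4 differs strongly in potency but is not similar to anything. In practice the bit sets would come from a cheminformatics toolkit, and the thresholds would follow whichever AC definition a given study adopts.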
“…Beginning in 2012, various attempts have been made to predict ACs. Most of these studies have attempted to predict compound pairs forming ACs (often applying the MMP-cliff definition) and distinguish them from pairs of compounds with small potency differences. For these purposes, ML classification models were used, often producing high accuracy in distinguishing ACs from other pairs of similar compounds. Recently, DNN variants have been used to predict ACs from molecular images or graphs using representation learning. By contrast, only few attempts have thus far been made to predict the actual potency value of AC compounds using ML regression models and/or DNNs of varying complexity. The results of the currently most comprehensive study have confirmed the challenges in accurately predicting the potency of AC compounds and have shown that standard ML regression models yielded overall better performance than DNNs …”
Section: Introduction
mentioning
confidence: 99%