2016
DOI: 10.2174/1381612822666160509124337

Probing the Hypothesis of SAR Continuity Restoration by the Removal of Activity Cliffs Generators in QSAR

Abstract: In this work we report the first attempt to study the effect of activity cliffs on the generalization ability of machine learning (ML) based QSAR classifiers, using as a case study a previously reported diverse and noisy dataset focused on drug-induced liver injury (DILI) and more than 40 ML classification algorithms. Here, the hypothesis of structure-activity relationship (SAR) continuity restoration by activity cliff removal is tested as a potential solution to overcome this limitation. Previously, a parall…

Cited by 8 publications (6 citation statements)
“…Next, compounds with undetermined IC50 (>100 µM or <0.02 µM) were removed from the modeling process and bioactivities were transformed to pIC50 (pIC50 = −log10 IC50). Given the negative effect of the activity cliffs in the QSAR modeling, activity cliff generators (ACGs) were removed from the dataset [35,36]. ACGs were defined as compounds sharing a similarity greater than 65% with any other in the dataset and with pIC50 differences greater than two in between them.…”
Section: D Qsar Model (mentioning)
confidence: 99%
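
The ACG filter described in this statement can be illustrated with a minimal Python sketch. It is not the citing authors' implementation: the statement does not name the fingerprint or the similarity metric, so Tanimoto similarity over sets of feature identifiers is assumed here, IC50 values are assumed to be given in µM and converted to mol/L before taking the logarithm, and both members of a cliff pair are flagged. The function names and the toy data are hypothetical.

```python
import math
from itertools import combinations


def pic50_from_micromolar(ic50_um):
    """pIC50 = -log10(IC50 in mol/L). The µM-to-molar conversion is an assumption."""
    return -math.log10(ic50_um * 1e-6)


def tanimoto(a, b):
    """Tanimoto similarity between two sets of fingerprint features (assumed metric)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def find_acgs(fingerprints, pic50, sim_cutoff=0.65, delta_cutoff=2.0):
    """Return ids of compounds that take part in at least one activity cliff pair.

    A pair counts as a cliff when similarity > sim_cutoff and the absolute
    pIC50 difference > delta_cutoff. Flagging both members is an assumption.
    """
    acgs = set()
    for i, j in combinations(sorted(fingerprints), 2):
        similar = tanimoto(fingerprints[i], fingerprints[j]) > sim_cutoff
        cliff = abs(pic50[i] - pic50[j]) > delta_cutoff
        if similar and cliff:
            acgs.update((i, j))
    return acgs


# Hypothetical toy data: feature-id sets stand in for real fingerprints.
fingerprints = {
    "A": {1, 2, 3, 4, 5},
    "B": {1, 2, 3, 4, 6},
    "C": {7, 8, 9},
}
ic50_um = {"A": 0.05, "B": 50.0, "C": 1.0}

pic50 = {name: pic50_from_micromolar(value) for name, value in ic50_um.items()}
acgs = find_acgs(fingerprints, pic50)
kept = sorted(set(fingerprints) - acgs)
print("ACGs:", sorted(acgs), "kept:", kept)
```

In the toy data, A and B share four of six features (similarity about 0.67) and differ by three pIC50 units, so both are flagged and only C is kept. The pairwise loop is quadratic in the number of compounds, which is adequate for a sketch but would need a faster similarity search for large datasets.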
“…In the field of computational chemistry, ACs are suspected to form one of the major roadblocks for successful quantitative structure-activity relationship (QSAR) modelling [9, 18, 37, 50]; abrupt changes in potency are expected to negatively influence machine learning algorithms for pharmacological activity prediction. During the development of QSAR models, ACs are sometimes dismissed as measurement errors [39], but simply removing ACs from a training data set can result in a loss of precious SAR-information [10].…”
Section: Introduction (mentioning)
confidence: 99%
“…In the field of computational chemistry, ACs are suspected to form one of the major roadblocks for successful quantitative structure-activity relationship (QSAR) modelling [14,26,47,63]; abrupt changes in potency are expected to negatively influence machine learning algorithms for pharmacological activity prediction. During the development of QSAR models, ACs are sometimes dismissed as measurement errors [49], but simply removing ACs from a training data set can result in a loss of precious SAR-information [15].…”
Section: Introduction (mentioning)
confidence: 99%