2013
DOI: 10.1186/1758-2946-5-27

Defining a novel k-nearest neighbours approach to assess the applicability domain of a QSAR model for reliable predictions

Abstract: Background: With the growing popularity of using QSAR predictions for regulatory purposes, such predictive models are now required to be strictly validated, an essential feature of which is to have the model’s Applicability Domain (AD) defined clearly. Although several different approaches have been proposed in recent years to address this goal, no optimal approach to define the model’s AD has yet been recognized. Results: This study proposes a novel descriptor-based AD method which accounts for the data distri…

Citations: cited by 75 publications (74 citation statements)
References: 23 publications
“…This is in line with the theoretical expectation that the training of the QSAR model and the calculation of the AD are two different tasks, as already explained in the “Methods” section.
Fig. 6: Comparison of different feature sets used in the dk-NN AD by Sahigara et al [12], applied to the P-gp IV set
…”
Section: Results
Citation type: supporting
confidence: 83%
“…The average distance of the test chemical from its five nearest neighbors in the training set is compared with a threshold, which is the 95th percentile of average distance of training chemicals from their five nearest neighbors. 63 …”
Section: Methods
Citation type: mentioning
confidence: 99%
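
The percentile-threshold check described in the statement above can be sketched as follows. This is a minimal illustration, not the cited authors' implementation: the Euclidean metric, the linear-interpolation percentile, the exclusion of each training chemical from its own neighbour list, and the function names `fit_threshold` and `in_domain` are all assumptions made for the example.

```python
import math

def mean_knn_distance(query, points, k):
    """Average Euclidean distance from `query` to its k nearest `points`."""
    dists = sorted(math.dist(query, p) for p in points)
    return sum(dists[:k]) / k

def percentile(values, q):
    """Percentile with linear interpolation, q in [0, 100] (assumed convention)."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def fit_threshold(train, k=5, q=95.0):
    """Threshold = q-th percentile of each training chemical's mean k-NN distance."""
    # Assumption: a training chemical is not counted as its own neighbour.
    avg = [
        mean_knn_distance(x, train[:i] + train[i + 1:], k)
        for i, x in enumerate(train)
    ]
    return percentile(avg, q)

def in_domain(query, train, threshold, k=5):
    """True if the test chemical's mean k-NN distance does not exceed the threshold."""
    return mean_knn_distance(query, train, k) <= threshold
```

Here `train` is a list of descriptor vectors (tuples of floats). A test chemical that sits inside the training cloud yields a small mean distance and passes; one far from every training chemical fails.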
“…According to this approach a molecule is predicted only if its average distance from the first three neighbours is lower than a fixed threshold value; otherwise it is unpredicted and considered out of the applicability domain. In this way, the evaluation of the applicability domain is implicitly defined on a similarity-based approach [28][29][30] and carried out on the fly for each prediction. The introduction of the threshold is meant to identify test molecules that are dissimilar from their nearest neighbours and whose predictions are consequently supposed to be unreliable.…”
Section: Materials and Methods 21 Aquatic Toxicity Models
Citation type: mentioning
confidence: 99%
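
The on-the-fly check in the statement above (predict only when the average distance to the first three neighbours is below a fixed threshold) might look like the sketch below. The Euclidean metric, the use of the neighbours' mean response as the prediction, and the name `predict_or_abstain` are assumptions for illustration; the quoted passage does not specify them.

```python
import math

def predict_or_abstain(query, train_x, train_y, threshold, k=3):
    """Return a k-NN prediction, or None if the query is out of the domain."""
    # Pair each training response with its distance to the query, nearest first.
    pairs = sorted(
        ((math.dist(query, x), y) for x, y in zip(train_x, train_y)),
        key=lambda pair: pair[0],
    )
    nearest = pairs[:k]
    avg_dist = sum(d for d, _ in nearest) / k
    # Out of domain: the molecule is too dissimilar from its nearest neighbours.
    if avg_dist >= threshold:
        return None
    # Assumption: the prediction is the mean response of the k neighbours.
    return sum(y for _, y in nearest) / k
```

Returning `None` instead of a number keeps unpredicted molecules explicit, so a caller cannot mistake an out-of-domain result for a valid prediction.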