2007
DOI: 10.1021/ci600476r

Stochastic versus Stepwise Strategies for Quantitative Structure−Activity Relationship Generation: How Much Effort May the Mining for Successful QSAR Models Take?

Abstract: Descriptor selection in QSAR typically relies on a set of upfront working hypotheses in order to boil down the initial descriptor set to a tractable size. Stepwise regression, computationally cheap and therefore widely used in spite of its potential caveats, is most aggressive in reducing the effectively explored problem space by adopting a greedy variable pick strategy. This work explores an antipodal approach, incarnated by an original Genetic Algorithm (GA)-based Stochastic QSAR Sampler (SQS) that favors un…
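
The greedy variable pick that the abstract attributes to stepwise regression can be sketched as follows. This is a minimal illustration only, assuming ordinary least squares with an intercept and the residual sum of squares as the selection criterion; the function names (`solve_linear`, `rss`, `forward_stepwise`) and the toy data are hypothetical and do not reproduce the paper's stepwise setup or its SQS.

```python
from typing import List, Sequence


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def rss(X: Sequence[Sequence[float]], y: Sequence[float], cols: List[int]) -> float:
    """Residual sum of squares of an OLS fit (with intercept) on the chosen columns."""
    rows = [[1.0] + [row[c] for c in cols] for row in X]
    k = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    try:
        beta = solve_linear(xtx, xty)
    except ValueError:
        return float("inf")  # collinear candidate: never preferred
    return sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
               for r, yi in zip(rows, y))


def forward_stepwise(X: Sequence[Sequence[float]], y: Sequence[float],
                     max_terms: int) -> List[int]:
    """Greedily add, one at a time, the descriptor that lowers the RSS most."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_terms:
        best = min(remaining, key=lambda c: rss(X, y, selected + [c]))
        selected.append(best)   # once picked, a descriptor is never revisited
        remaining.remove(best)
    return selected


# Toy data (hypothetical): three descriptors, activity driven by columns 1 and 0.
X_demo = [[1, 5, 2], [2, 3, 1], [3, 8, 5], [4, 1, 3], [5, 9, 9], [6, 2, 4]]
y_demo = [3 * row[1] + 0.1 * row[0] for row in X_demo]
print(forward_stepwise(X_demo, y_demo, max_terms=2))
```

Each pass keeps only the single best addition and never reconsiders earlier picks, which is why the explored fraction of the descriptor-subset space stays small compared with a stochastic sampler.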

citations
Cited by 34 publications
(61 citation statements)
references
References 39 publications
“…Nonlinear approaches occupy the top position of the best validating model in eight out of 13 cases, an observation reinforcing the already reported [25] trend of improving model validation propensity upon allowing SQS to employ predefined nonlinear transformations.…”
Section: D-fpt-based Models Compare Favorably With Respect To Publis
Citation type: supporting
confidence: 78%
“…In principle, due to the stochastic nature of the model builder, it is risky to extrapolate the intrinsic quality of descriptors from validation score differences of single models. However, the analysis of the 39 duplicated SQS simulations for all 13 data sets, performed with D, D-R, and D-S descriptors, respectively, showed that duplicate simulations generate significantly diverging representative sets [25]
[Table footnote fused into the excerpt] a: 1 - Validation criteria for the globally optimal, best validating 2D-FPT models: RMSPE (plain text), validation set R²_V (italics), and percentage of correctly classified inactives (bold) in an additional inactive validation set, except for (a), reporting the overall correct classification rate of both validation set actives and inactives. 2 - 2D-FPT setup and nonlinearity policy (linear, nonlinear) leading to results (1).…”
Section: D-fpt Setup-dependence Of the Validation Performance
Citation type: mentioning
confidence: 98%