2003
DOI: 10.1002/qua.10591

Synergistic interactions among QSAR descriptors

Abstract: Quantitative structure-activity relationships (QSARs) and quantitative structure-property relationships (QSPRs) rely on regression equations containing numerical descriptors of molecular structure. In constructing these models, highly correlated descriptors are sometimes excluded from the regression equations. Although this exclusion seems reasonable, in fact it can lead investigators to overlook significant descriptor combinations, because the small differences between highly correlated descriptors sometimes …
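The point made in the abstract can be illustrated with a small synthetic sketch. Everything below is an assumption for illustration only: the data are randomly generated, not taken from the paper, and the names `x1`, `x2`, `r_squared`, and `pearson_r` are hypothetical. Two descriptors are built to be almost perfectly correlated, while the modeled property depends on their small difference, so each descriptor alone fits poorly and the pair fits well.

```python
import random

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / (va * vb) ** 0.5

def r_squared(columns, y):
    """R^2 of an ordinary least-squares fit of y on the columns plus an intercept."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Augmented normal equations: (X^T X | X^T y)
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
         + [sum(X[i][p] * y[i] for i in range(n))] for p in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(k):
        piv = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[p][k] / A[p][p] for p in range(k)]
    y_hat = [sum(b * x for b, x in zip(beta, row)) for row in X]
    y_mean = sum(y) / n
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot

random.seed(0)
n = 200
# Synthetic descriptors (assumption): x2 is x1 plus a small perturbation.
x1 = [random.gauss(0.0, 1.0) for _ in range(n)]
x2 = [a + 0.1 * random.gauss(0.0, 1.0) for a in x1]
# Synthetic property (assumption): driven by the small difference x2 - x1.
y = [10.0 * (b - a) + 0.1 * random.gauss(0.0, 1.0) for a, b in zip(x1, x2)]

r12 = pearson_r(x1, x2)
r2_x1 = r_squared([x1], y)
r2_x2 = r_squared([x2], y)
r2_both = r_squared([x1, x2], y)

print(f"r(x1, x2)      = {r12:.3f}")
print(f"R^2, x1 alone  = {r2_x1:.3f}")
print(f"R^2, x2 alone  = {r2_x2:.3f}")
print(f"R^2, x1 and x2 = {r2_both:.3f}")
```

In this constructed case a screening rule that drops one of the two descriptors because of their high intercorrelation would discard nearly all of the explanatory power. Real descriptor sets need not behave this way; the sketch only shows that the effect is possible.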

Cited by 35 publications (26 citation statements)
References 17 publications
“…PCA has produced model results that are statistically similar to the variable ranking approach as considered by random forests, yet, PCA still requires the computation of all 1485 descriptors for its application which is a relevant shortcoming. The fact that the results produced by PCA and variable ranking approach as considered by random forests are similar is an evidence, as also argued by some authors [51], that the effects of correlation between descriptors mostly affects the interpretation of the model, with only slight effect on its predictive power. Thus the random forest based variable ranking approach is the natural choice for a final model, which, for the present problem, is able to reach robust models using only 89 molecular descriptors.…”
Section: Results (supporting)
confidence: 67%
“…The model of the experimental lattice and binding enthalpies has in the $K_p$-(pp-odd)-${}^0\psi_I$ index its best single descriptor, and in the $K_p$-(pp-odd) couple of indices, $\{{}^0\psi_I, {}^1\psi_I\}$, the best two‐index descriptors. These two indices have a correlation of r = 0.976, which, at the light of research on the subject should not be considered dramatic [33–36]. The good X term, which describes these properties, includes the ${}^0\chi^v$ index together with the $D^v$ index.…”
Section: Discussion (mentioning)
confidence: 99%
“…Moreover, since the computation of all the quantum‐mechanical descriptors depends on the orbital energies, a correlation between them is already expected. Therefore, any other criterion, for example, widely used variance inflation factor also becomes redundant here but only for the models with high predictivity as has also been highlighted by Peterangelo and Seybold …”
Section: Methods (mentioning)
confidence: 99%
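
For orientation, the variance inflation factor mentioned in the last statement can be related to the intercorrelation r = 0.976 quoted in the Discussion statement above. The following is a worked sketch using the standard textbook definition, under the simplifying assumption that only the two descriptors are present in the model (so that $R_j^2 = r^2$); it is not a calculation taken from either citing paper.

```latex
\mathrm{VIF}_j = \frac{1}{1 - R_j^2},
\qquad
R_j^2 = r^2 = 0.976^2 = 0.952576
\;\Longrightarrow\;
\mathrm{VIF}_j = \frac{1}{1 - 0.952576} = \frac{1}{0.047424} \approx 21.1
```

Common rules of thumb flag values above 5 or 10, so a VIF-based screen would likely reject this descriptor pair, which is the kind of exclusion the citing authors argue against.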