1997
DOI: 10.1080/10629369708039124

New Developments in QSPR/QSAR Modeling Based on Topological Indices

Abstract: An efficient algorithm for deriving QSPR/QSAR models with nonorthogonal and ordered orthogonal descriptors, based on orthogonalization of topological indices, is presented. It is applied to structure-boiling point modeling of nonanes as the test case. The selection of the best descriptors from multivariate linear regression modeling is carried out using descriptors which are first orthogonalized. It is shown that such an algorithm is applicable for the selection of the best descriptors in a multivariate linear…
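
The abstract does not spell out the orthogonalization procedure. As a minimal sketch, assuming it amounts to sequential (Gram-Schmidt-style) residualization of mean-centered descriptors taken in a chosen order, the idea could look like the following. The function names and the numeric values are hypothetical and are not taken from the paper.

```python
def center(values):
    """Subtract the mean so that each descriptor has zero average."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(descriptors):
    """Orthogonalize descriptor columns in the order given.

    Assumption: each descriptor is replaced by its residual after
    removing the parts explained by all previously processed descriptors.
    """
    basis = []
    for descriptor in descriptors:
        residual = center(descriptor)
        for b in basis:
            norm_sq = dot(b, b)
            if norm_sq > 1e-12:  # skip directions that are numerically zero
                coef = dot(residual, b) / norm_sq
                residual = [r - coef * x for r, x in zip(residual, b)]
        basis.append(residual)
    return basis


# Made-up descriptor values for five molecules (not real topological indices).
example = [
    [1, 2, 3, 4, 5],
    [1, 4, 9, 16, 25],
    [2, 1, 4, 3, 6],
]
ordered_orthogonal = orthogonalize(example)
```

Because each later descriptor is reduced to what the earlier ones do not explain, the result depends on the order in which the descriptors are supplied, which is one plausible reading of "ordered orthogonal descriptors".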

Cited by 21 publications (14 citation statements)
References 41 publications
“…Thus, by the use of CROMRsel.f we can select the best model with six descriptors among 10^9 possible models (it takes about 10 h on Hewlett-Packard 9000/E55 computer, which is configured as a server) or the best model with five out of 104 descriptors (∼10^8 models, what takes 28 CPU min). Therefore, if we wish to express a certain physical or chemical property, or biological activity of a group of molecules as linear combination of descriptors, the problem we face is the selection of a set of I descriptors (I = 1, ..., N) from the set of N descriptors [11,26] which best approximate a given property or activity. This problem was considered by a number of authors [6] but perhaps most consistently by Randić; however, they gave no instructions (algorithm, computer program) how to solve any problem of real complexity (selection of descriptors in a large descriptor space).…”
Section: Methods (mentioning)
confidence: 99%
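
The selection problem described in this statement, choosing the I descriptors out of N that best approximate a property, can be illustrated by brute force. The following is a minimal sketch, assuming ordinary least squares with an intercept and the residual sum of squares as the quality criterion. It is not the CROMRsel.f program, and all function names are hypothetical.

```python
from itertools import combinations


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting.

    Raises ZeroDivisionError if the system is singular (collinear columns).
    """
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def rss(columns, y):
    """Residual sum of squares of the fit y ~ intercept + columns."""
    n = len(y)
    cols = [[1.0] * n] + [list(c) for c in columns]
    xtx = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(b * c[k] for b, c in zip(beta, cols)) for k in range(n)]
    return sum((yk - fk) ** 2 for yk, fk in zip(y, fitted))


def best_subset(descriptors, y, size):
    """Exhaustively test every subset of the given size; return (rss, indices)."""
    return min(
        (rss([descriptors[i] for i in idx], y), idx)
        for idx in combinations(range(len(descriptors)), size)
    )
```

The cost of this approach grows with the number of subsets, which is why the run times quoted above are measured in CPU minutes and hours once N exceeds a few dozen descriptors.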
“…Therefore, the quality of the MR method was usually misjudged since the critical opinion was reached by considering models which were not the best possible MR models that could be obtained (except for very small sets of descriptors). We have shown in our previous reports that by selecting the best possible descriptors to be used in MR modeling one obtains better models than those obtained using the usual approximate procedures for choosing descriptors. Additionally, the selection of the best possible descriptors increases the stability of the coefficients in the MR model and thus the accuracy of the model also increases. In addition, all methods (NN, PLS, PCA) other than MR (except pure GA with a MR-like model) are not easy to relate from equation to equation, because relationships between the chemical structure and the activity of molecules are much more complex than in the case of MR (expressed by latent variables which vary from a model to model) …”
Section: Introduction (mentioning)
confidence: 99%
“…A major limitation of the CODESSA modeling subroutine, in common with other approaches for variable selection, is, for example, the impossibility of selecting the best possible two or three descriptors for MR models from a data set containing, for example, 200 descriptors, for which case the numbers of possible two- and three-descriptor models are 4950 and 161 700, respectively. This problem has been solved rigorously in several cases but, until now, only for relatively small data sets. …”
Section: Introduction (mentioning)
confidence: 99%
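
The model counts quoted in these statements can be checked with binomial coefficients. This is a minimal sketch, assuming that the number of k-descriptor models equals the number of unordered k-subsets of the descriptor set; the helper name `n_models` is hypothetical.

```python
from math import comb


def n_models(n_descriptors, k):
    """Number of distinct k-descriptor subsets of n_descriptors (order ignored)."""
    return comb(n_descriptors, k)


print(n_models(100, 2), n_models(100, 3))  # 4950 161700
print(n_models(200, 2), n_models(200, 3))  # 19900 1313400
print(n_models(104, 5))                    # 91962520
```

Under this assumption, the figures 4950 and 161 700 match a pool of 100 descriptors, whereas 200 descriptors would give 19 900 and 1 313 400, so the quoted descriptor count and model counts may not refer to the same pool. The five-of-104 case gives about 9.2 × 10^7 subsets, consistent with the "∼10^8 models" quoted earlier.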
“…Recent years have seen the publication of a plethora of QSPR methods for the prediction of boiling point, and it is impracticable to cover all of these in a review of this nature. Table 1 lists those from 1996 onwards [28–83]. Notice that many of the studies deal with specific classes of compounds, especially alkanes.…”
Section: Boiling Point (mentioning)
confidence: 99%