2010
DOI: 10.1002/minf.201000001

Application of Random Forest and Multiple Linear Regression Techniques to QSPR Prediction of an Aqueous Solubility for Military Compounds

Abstract: The relationship between the aqueous solubility of more than two thousand eight hundred organic compounds and their structures was investigated using a QSPR approach based on Simplex Representation of Molecular Structure (SiRMS). The dataset consists of 2537 diverse organic compounds. Multiple Linear Regression (MLR) and Random Forest (RF) methods were used for statistical modeling at the 2D level of representation of molecular structure. Statistical characteristics of the best models are quite good (MLR metho…

Cited by 46 publications (36 citation statements)
References 46 publications
“…Recent reviews provide the state‐of‐art analysis of QSPR applicability in terms of aqueous solubility description and prediction. It was highlighted that current tendency in QSPR models development is to create models capable of describing and predicting solubility of large sets of structurally diverse compounds.…”
Section: Methods
Citation type: mentioning
confidence: 99%
“…Recently, we have investigated aqueous solubility of military‐relevant compounds using QSPR approach . In addition, QSPR analysis of aqueous solubility of more than 2500 organic compounds which belong to different classes and the influence of salinity on solubility was the subject of other publications . However, all models available in the literature for aqueous solubility prediction at QSPR level suffer from a serious limitation.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…This eminent technique and potent method can diminish the variance and improve the prediction accuracy. RF has been utilized as a classification and regression method in several biological studies . Both the RF classification and RF regression procedures can identify important features useful in either elucidating the variation in the outputs of interest or classifying the results in to different groups.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…Any compounds whose bioavailability is strongly affected by the dose and formulation was excluded from the dataset. Random forest method [61,62] was used for the development of QSPR classification models. All compounds were divided into three (high, medium and low bioavailability, Table 14.1) or two (high and acceptable, Table 14.2) classes.…”
Section: Qspr Prediction Of the Drugs Bioavailability On The Base Of
Citation type: mentioning
confidence: 99%