2012
DOI: 10.3390/molecules171214937

QSPR Models for Predicting Log Pliver Values for Volatile Organic Compounds Combining Statistical Methods and Domain Knowledge

Abstract: Volatile organic compounds (VOCs) are contained in a variety of chemicals that can be found in household products and may have undesirable effects on health. Therefore, it is important to model blood-to-liver partition coefficients (log Pliver) for VOCs in a fast and inexpensive way. In this paper, we present two new quantitative structure-property relationship (QSPR) models for the prediction of log Pliver, where we also propose a hybrid approach for the selection of the descriptors. This hybrid methodology com…

Cited by 14 publications (14 citation statements)
References 31 publications
“…This process encompasses several aspects, such as analyzing descriptor co-occurrence in the different candidate models, avoiding redundant descriptor sets, and analyzing descriptor–target relationships. This analysis has been traditionally carried out by the expert combining her expertise with the design of plots and tables in an ad hoc manner [10, 11].…”
Section: Methods (mentioning)
confidence: 99%
“…At this point, the modeler can have a better understanding of the contribution of each descriptor to the modeling of the target property and the type of relationship (linear, quadratic, cubic, etc.). For example, in previous works [11, 26], the best models were obtained by combining descriptors that cover different subregions of the chemical domain. In other words, the exploration of this visualization allows assessing the contribution of the different descriptors to the model.…”
Section: Methods (mentioning)
confidence: 99%
“…A DT is composed of a hierarchical arrangement of nodes and branches in which the nodes represent the molecular descriptors, whereas the branches refer to decision rules to categorize compounds as actives and inactives. DT has been successfully applied in the analysis of various types of compounds, such as aromatase inhibitors [26], volatile organic compounds [31], and cytochrome P450-interacting compounds [32]. A DT was constructed with WEKA, version 3.6 [33], using the J48 algorithm (a Java implementation of the C4.5 algorithm).…”
Section: Methods (mentioning)
confidence: 99%
“…Assessing each of these meta-models with different random seed numbers slightly decreased the performance for one of them and increased it for the other (mean balanced accuracies for five repeated runs with different seed numbers were 73.56% and 74.27%, respectively; standard deviations 0.47% and 0.44%). The inclusion of dose among the predictors in the meta-models only slightly (if at all) increased the performance compared with the meta-models built without the dose, but we preferred to include it on the basis of domain knowledge [22,23]. Meta-models built similarly with support vector machines (SVM), k-nearest neighbors (knn and its Rweka implementation, IBk) and naïve Bayes algorithms had a slightly lower performance in terms of both balanced accuracy and sensitivity than those built with random forests.…”
Section: Performances Of Models (mentioning)
confidence: 99%
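
The last statement reports mean balanced accuracies and standard deviations over five runs with different random seeds. As a rough illustration of how such a summary can be computed, here is a minimal Python sketch. It assumes binary labels with 1 as the positive class; the function names `balanced_accuracy` and `summarize_runs` are hypothetical and are not taken from the citing paper, which does not describe its implementation in the quoted text.

```python
from statistics import mean, stdev


def balanced_accuracy(y_true, y_pred):
    """Average of sensitivity and specificity for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p != 1)
    tn = sum(1 for t, p in pairs if t != 1 and p != 1)
    fp = sum(1 for t, p in pairs if t != 1 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        # Balanced accuracy is undefined when one class is absent.
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def summarize_runs(scores):
    """Return (mean, sample standard deviation) of per-seed scores."""
    return mean(scores), stdev(scores)
```

In use, one would call `balanced_accuracy` once per seed on that run's predictions and pass the resulting list of scores to `summarize_runs`. The sample standard deviation is used here as an assumption; the quoted text does not say which estimator was reported.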