2017
DOI: 10.1002/cem.2922

Ensemble partial least squares regression for descriptor selection, outlier detection, applicability domain assessment, and ensemble modeling in QSAR/QSPR modeling

Abstract: In QSAR/QSPR modeling, building an accurate partial least squares (PLS) model usually involves descriptor selection, outlier detection, applicability domain assessment, nonlinear relationship, and model stability problems. In the present study, we presented an ensemble PLS (EnPLS) method for solving these modeling tasks under a unified methodology framework. EnPLS aims at developing a consistent algorithmic framework by means of the idea of ensemble learning and statistical distribution. The approach exploits …

Citations: cited by 28 publications (18 citation statements)
References: 80 publications
“…The “enpls” argument in the enpls package was used to conduct partial least squares regression on 500 Monte Carlo experiments with a sampling ratio of 0.8. As the determined effects of independent variables to any model are dependent upon the cases included in analyses, ensemble learning methods may improve prediction accuracy and stability of regression models by exploiting the statistical distribution of variable coefficients and prediction errors. K‐fold cross‐validation was used to examine the stability of the ensemble partial least squares model using the “cv.enpls()” argument …”
Section: Methods (mentioning)
Confidence: 99%
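
The Monte Carlo procedure described in the statement above can be sketched in a few lines. This is only an illustration under stated assumptions, not the enpls package's implementation: the 500 experiments and the 0.8 sampling ratio come from the quoted text, while the function names (`fit_pls1_one_component`, `fit_ensemble`, `predict_ensemble`), the restriction to a single PLS component, and the toy data are hypothetical choices made to keep the example dependency-free.

```python
import math
import random


def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model; return (coefficients, intercept).

    Assumption: a single latent component is used for brevity. A real
    analysis would normally choose the number of components by validation.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:  # y carries no linear signal in this subsample
        return [0.0] * p, y_mean
    w = [v / norm for v in w]
    # Scores t = Xc w, then regress y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(yc[i] * t[i] for i in range(n)) / sum(v * v for v in t)
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


def fit_ensemble(X, y, n_models=500, ratio=0.8, seed=0):
    """Fit one model per Monte Carlo subsample drawn without replacement."""
    rng = random.Random(seed)
    n = len(X)
    n_sub = max(2, int(round(ratio * n)))
    models = []
    for _ in range(n_models):
        idx = rng.sample(range(n), n_sub)
        models.append(
            fit_pls1_one_component([X[i] for i in idx], [y[i] for i in idx])
        )
    return models


def predict_ensemble(models, x):
    """Average the predictions of all sub-models for one sample x."""
    preds = [b0 + sum(c * v for c, v in zip(coef, x)) for coef, b0 in models]
    return sum(preds) / len(preds)


if __name__ == "__main__":
    # Toy data (hypothetical): two predictors on a small grid.
    X = [[float(a), float(b)] for a in range(-2, 3) for b in range(-2, 3)]
    y = [2.0 * a - b for a, b in X]
    models = fit_ensemble(X, y)  # 500 experiments, sampling ratio 0.8
    print(predict_ensemble(models, [1.0, 1.0]))
```

Each sub-model sees a different 80% of the cases, so the collection of fitted coefficients and predictions forms an empirical distribution rather than a single estimate, which is the property the quoted statement relies on.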
“…Variable coefficients and importance scores were calculated using the “fs.enpls()” function in the enpls package. The vector of regression coefficients across multiple Monte Carlo experiments were used to indicate predictor importance because it is a single measure of association between the predictor variables and the response.…”
Section: Methods (mentioning)
Confidence: 99%
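
One way to turn a set of per-experiment coefficient vectors into a single importance score per predictor is sketched below. The scoring rule used here, the absolute mean of a coefficient divided by its standard deviation across experiments, is an assumption for illustration; the quoted statement only says that the coefficients across Monte Carlo experiments indicate importance. The function name `coefficient_importance` and the toy coefficients are hypothetical.

```python
import statistics


def coefficient_importance(coef_matrix):
    """Score each predictor from its coefficients across experiments.

    coef_matrix: list of coefficient vectors, one per Monte Carlo experiment.
    Assumed rule: |mean| / standard deviation, so a predictor scores high
    when its coefficient is large and stable across subsamples.
    """
    n_vars = len(coef_matrix[0])
    scores = []
    for j in range(n_vars):
        column = [row[j] for row in coef_matrix]
        mean = statistics.mean(column)
        sd = statistics.stdev(column)  # needs at least two experiments
        if sd == 0.0:
            # Perfectly stable coefficient: unbounded score unless it is zero.
            scores.append(0.0 if mean == 0.0 else float("inf"))
        else:
            scores.append(abs(mean) / sd)
    return scores


if __name__ == "__main__":
    # Toy coefficients (hypothetical): predictor 0 is stable, predictor 1
    # changes sign between experiments.
    coefs = [[1.0, 0.5], [1.2, -0.5], [0.8, 0.1]]
    print(coefficient_importance(coefs))
```

A predictor whose coefficient flips sign between subsamples gets a low score under this rule even if individual estimates are large, which matches the intuition that importance should not depend on which cases happened to be sampled.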
“…The advantage of this method is that while the independent variables may be collinear the derived x-components used in regression will be independent of one another (Mevik & Wehrens, 2007). The ensemble learning method is also advantageous as it exploits the distribution in prediction errors and regression coefficients within a dataset to improve prediction accuracy and stability of regression (Cao et al, 2017).…”
Section: Discussion (mentioning)
Confidence: 99%