A variable informative criterion based on weighted voting strategy combined with LASSO for variable selection in multivariate calibration

Zhang, Ruoqiu; Zhang, Feiyu; Chen, Wanchao; Xiong, Qin; Chen, Zengkai; Yao, Heming; Ge, Jiong; Hu, Yun; Du, Yukou

doi:10.1016/j.chemolab.2018.11.015

Cited by 15 publications

(3 citation statements)

References 53 publications


“…Despite the numerous proposed weight assignment methods, finding the suitable weight configuration remains a challenging task. At present, the most common practice is to assign weights according to the prediction accuracy of the predictor [36, 37]. However, when the prediction accuracy gap between the predictors is too large, this method cannot guarantee the integrated results better than the results of a single predictor.…”

Section: Introduction (mentioning)

confidence: 99%

Combination prediction method of students’ performance based on ant colony algorithm

Xu, Kim. 2024. PLoS ONE

Students’ performance is an important factor in evaluating teaching quality in colleges. Predicting and analyzing students’ performance can guide students’ learning in time. To address the low accuracy of single models in predicting students’ performance, a combination prediction method based on an ant colony algorithm is put forward. First, considering the characteristics of students’ learning behavior and of the models, decision tree (DT), support vector regression (SVR) and BP neural network (BP) are selected to establish three prediction models. Then, an ant colony algorithm (ACO) is proposed to calculate the weight of each model in the combination prediction model. The combination prediction method was compared with the single machine learning (ML) models and other methods in terms of accuracy and running time. The combination prediction model, with a mean square error (MSE) of 0.0089, performs better than DT with MSE of 0.0326, SVR with MSE of 0.0229 and BP with MSE of 0.0148. To investigate the efficacy of the combination prediction model, other prediction models are used for a comparative study. The combination prediction model with MSE of 0.0089 performs better than GS-XGBoost with MSE of 0.0131, PSO-SVR with MSE of 0.0117 and IDA-SVR with MSE of 0.0092. The combination prediction model also runs faster than these three methods.
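As a rough illustration of what weighting models by accuracy can look like, the sketch below normalizes inverse MSE values into combination weights. This is an assumption for illustration only: it is a simple accuracy-based scheme of the kind the citation statement describes as common practice, not the ant colony weight search from the paper. The function names and the example predictions are hypothetical; only the three MSE values come from the abstract.

```python
def inverse_error_weights(errors):
    """Return weights proportional to 1/error, normalized to sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine(predictions, weights):
    """Weighted average of one prediction per model."""
    return sum(p * w for p, w in zip(predictions, weights))


# MSE values reported in the abstract for DT, SVR and BP.
mse = {"DT": 0.0326, "SVR": 0.0229, "BP": 0.0148}
weights = inverse_error_weights(list(mse.values()))

# Hypothetical predictions from the three models for one student.
example_predictions = [0.70, 0.74, 0.78]
combined = combine(example_predictions, weights)
```

Under this scheme the model with the lowest MSE (BP) receives the largest weight, and the combined value always lies between the smallest and largest individual predictions.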


“…In addition, the joint use of different methods has been applied in variable selection due to the complementarity among different algorithms. For instance, a weighted voting strategy combined with the least absolute shrinkage and selection operator (WV-LASSO), stabilized bootstrapping soft shrinkage approach (SBOSS), two-step hybrid methods (e.g., CARS-SPA), and three-step hybrid methods (e.g., iPLS-VIP-GA) have been designed, and the prediction ability of the joint strategies is better than that of the single variable selection method. In our previous works, several variable selection methods were also proposed, such as influential variables (IV), locally linear embedding (LLE), combination of heuristic optimal partner bands (CVB), and C value, etc.…”

Section: Introduction (mentioning)

confidence: 99%

Interpretable Perturbator for Variable Selection in near-Infrared Spectral Analysis

Duan, Liu, Cai et al. 2023. J. Chem. Inf. Model.

A perturbator was developed for variable selection in near-infrared (NIR) spectral analysis based on the perturbation strategy in deep learning for developing interpretation methods. A deep learning predictor was first constructed to predict the targets from the spectra in the training set. Then, taking the output of the predictor as a reference, the perturbator was trained to derive the perturbation-positive (P+) and perturbation-negative (P−) features from the spectra. Therefore, the weight (σ) of the perturbator layer can be a criterion to evaluate the importance of the variables in the spectra. Ranking the spectral variables by the criterion, the number of the variables used in the quantitative model can be obtained through cross-validation. Three NIR data sets were used to evaluate the proposed method. The root mean squared error was found to be comparable with or superior to that obtained by the commonly used methods. Moreover, the selected spectral variables are interpretable in identifying the key spectral features related to the prediction target. Therefore, the proposed method provides not only an effective tool for optimizing quantitative models, but also an efficient way of explaining spectra of multicomponent samples.
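The rank-then-cross-validate step mentioned in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: `rank_variables`, `select_by_cv`, and the `cv_error` callable are hypothetical names, and the scores and error function are made-up toy values standing in for the perturbator weights and a real cross-validated model error.

```python
def rank_variables(scores):
    """Indices of variables sorted from most to least important."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def select_by_cv(scores, cv_error, max_k=None):
    """Keep the top-k ranked variables, with k chosen to minimize cv_error.

    cv_error is a hypothetical callable: it takes a list of variable indices
    and returns the cross-validated error of a model built on them.
    """
    ranked = rank_variables(scores)
    max_k = len(ranked) if max_k is None else min(max_k, len(ranked))
    best_k = min(range(1, max_k + 1), key=lambda k: cv_error(ranked[:k]))
    return ranked[:best_k]


# Toy illustration: made-up importance scores for five variables.
toy_scores = [0.10, 0.90, 0.05, 0.60, 0.20]


def toy_cv_error(indices):
    # Made-up error: variables 1 and 3 are informative, and every
    # additional variable adds a small cost.
    missing = len({1, 3} - set(indices))
    return 1.0 * missing + 0.01 * len(indices)


selected = select_by_cv(toy_scores, toy_cv_error)
```

In practice `cv_error` would fit a calibration model on the chosen columns of the spectra and return its cross-validated error; the toy function only mimics that interface.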


“…However, most of these applications focus on feature selection; e.g. Lasso has been analyzed as an alternative to conventional feature selection methods for PLS based soft sensor models [22,38,83]. Similarly, prediction potential of RVM, along with its ability to estimate uncertainty in predictions [50], is yet to be exploited in applications to processes.…”

Section: Introduction (mentioning)

confidence: 99%

An Exploratory Analysis of Biased Learners in Soft-Sensing Frames

Urhan, Alakent. 2019. Preprint

Data-driven soft sensor design has recently gained immense popularity, due to advances in sensory devices, and a growing interest in data mining. While partial least squares (PLS) is traditionally used in the process literature for designing soft sensors, the statistical literature has focused on sparse learners, such as Lasso and relevance vector machine (RVM), to solve the high dimensional data problem. In the current study, predictive performances of three regression techniques, PLS, Lasso and RVM, were assessed and compared under various offline and online soft sensing scenarios applied to datasets from five real industrial plants and a simulated process. In offline learning, predictions of RVM and Lasso were found to be superior to those of PLS when a large number of time-lagged predictors were used. Online prediction results gave a slightly more complicated picture. It was found that the minimum prediction error achieved by PLS under a moving window (MW) or just-in-time learning scheme was reduced by up to ∼5-10% using Lasso or RVM. However, when a small MW size was used, or the optimum number of PLS components was as low as ∼1, prediction performance of PLS surpassed RVM, which was found to yield occasional unstable predictions. PLS and Lasso models constructed via online parameter tuning generally did not yield better predictions compared to those constructed via offline tuning. We present evidence to suggest that retaining a large portion of the available process measurement data in the predictor matrix, instead of preselecting variables, would be more advantageous for sparse learners in increasing prediction accuracy. As a result, Lasso is recommended as a better substitute for PLS in soft sensors, while performance of RVM should be validated before online application.
