2019
DOI: 10.1109/access.2019.2954115

An Variable Selection Method of the Significance Multivariate Correlation Competitive Population Analysis for Near-Infrared Spectroscopy in Chemical Modeling

Abstract: The high dimensionality of spectral datasets makes it difficult to select the optimal subset of variables. This paper presents a new method for variable selection called the significant multivariate competitive population analysis (SMCPA), which combines ideas of significant multivariate correlation (SMC) and model population analysis, and employs weighted bootstrap sampling (WBS) and exponential decline function (EDF) competition methods. In this study, the values of SMC distributions are used as an index for…
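The two mechanisms named in the abstract, WBS and EDF competition, can be illustrated with a minimal sketch. The truncated abstract does not give the exact EDF used in SMCPA, so the form below is an assumption borrowed from competitive adaptive reweighted sampling (CARS), where the retained ratio falls from 1 to 2/p over the run. The function names and the fixed `scores` list are hypothetical; in an actual run the importance values (for example, an SMC-type statistic) would presumably be recomputed from sub-models at each iteration.

```python
import math
import random


def edf_ratio(i, n_iter, n_vars):
    """Fraction of variables retained at iteration i (1-based).

    Assumed CARS-style exponential decline: 1.0 at i = 1, 2 / n_vars at i = n_iter.
    Requires n_iter > 1 and n_vars > 2.
    """
    k = math.log(n_vars / 2) / (n_iter - 1)
    a = (n_vars / 2) ** (1 / (n_iter - 1))
    return a * math.exp(-k * i)


def weighted_bootstrap(indices, weights, n_draws, rng):
    """Draw indices with replacement, with probability proportional to weight,
    and return the distinct indices that were drawn at least once."""
    drawn = rng.choices(indices, weights=weights, k=n_draws)
    return sorted(set(drawn))


def shrink(scores, n_iter, rng):
    """Iteratively shrink the variable set.

    scores: hypothetical positive importance value per variable (held fixed
    here purely for illustration).
    """
    n_vars = len(scores)
    kept = list(range(n_vars))
    for i in range(1, n_iter + 1):
        # Forced elimination: keep only the top-scoring share set by the EDF.
        n_keep = max(2, round(n_vars * edf_ratio(i, n_iter, n_vars)))
        kept = sorted(kept, key=lambda j: scores[j], reverse=True)[:n_keep]
        # Competition: variables with larger weights are more likely to survive.
        kept = weighted_bootstrap(kept, [scores[j] for j in kept], len(kept), rng)
    return kept
```

Calling `shrink(scores, n_iter=10, rng=random.Random(0))` on a list of positive scores returns the indices that survive both steps. The hard cut and the weighted draw are kept as separate steps so that each can be replaced independently.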

Cited by 8 publications (3 citation statements)
References 51 publications
“…The value of RMSEC for BOSS-IRVS with six added variables has low RMSEC and high RMSEP compared with the BOSS-IRVS with three added variables. The variables around 1104-1400nm can be selected by all methods which indicate the importance of this region which corresponds to the first overtone of the O-H stretch bond vibration [7]. The VCPA-IRIV select other variables in intervals around 1800 and between 2200 and 2400.…”
Section: Wheat Protein Dataset (mentioning)
confidence: 99%
“…Second, fisher optimal subspace shrinkage (FOSS) [6] that splits variables into some intervals by the information from regression coefficients PLS model, then the weighted block bootstrap sampling (WBBS) is used to select intervals, and the mean of the absolute values of regression coefficients of the corresponding interval determines the weights of sub-intervals. Third, significant multivariate competitive population analysis (SMCPA) that combines the ideas of substantial multivariate correlation (SMC) and MPA, and employs WBS is an improved version of bootstrap sampling with different weights on sampling objects and exponential decline function (EDF) competition method used to force the elimination of uninformative or redundancy variables [7]. For corn and wheat protein datasets, both methods select informative intervals including the BOSS.…”
Section: Introduction (mentioning)
confidence: 99%
“…In recent years, model population analysis (MPA) has become very popular in the study of variable selection methods. 13,14 Various variable selection approaches based on the MPA algorithm have been developed, including bootstrapping soft shrinkage (BOSS), 15 stabilized bootstrapping soft shrinkage approach (SBOSS), 16 BOSS and interval random variable selection(BOSS-IRVS), 17 variable iterative space shrinkage approach (VISSA), 18 interval combination optimization (ICO), 19 significant multivariate competitive population analysis (SMCPA), 20 etc. These methods construct multiple models by using different subsets of variables and select the best combination of variables by comparing the performance of the models.…”
Section: Introduction (mentioning)
confidence: 99%
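
The last statement summarizes the general MPA idea: build many sub-models on different variable subsets and compare their performance. A minimal sketch of that idea follows, assuming a caller-supplied `evaluate` function that returns an error for a given subset (for instance, a cross-validated RMSE from a PLS fit). The function name, its parameters, and the toy error function in the usage note are hypothetical and do not come from any of the cited methods.

```python
import random
from collections import Counter


def mpa_variable_frequency(n_vars, evaluate, n_models=1000, subset_size=5,
                           best_frac=0.02, rng=None):
    """Sample random variable subsets, score each sub-model, and report how
    often each variable appears among the best-scoring sub-models."""
    rng = rng or random.Random()
    models = []
    for _ in range(n_models):
        subset = tuple(sorted(rng.sample(range(n_vars), subset_size)))
        models.append((evaluate(subset), subset))
    # Lower error is better; keep the best fraction of sub-models.
    models.sort(key=lambda m: m[0])
    n_best = max(1, int(n_models * best_frac))
    counts = Counter(j for _, subset in models[:n_best] for j in subset)
    return {j: counts[j] / n_best for j in range(n_vars)}
```

As a toy usage example, `evaluate = lambda s: len({0, 1, 2} - set(s))` treats variables 0, 1 and 2 as the informative ones, so they should dominate the returned frequencies. Variables with a high frequency among the best sub-models are the candidates an MPA-style method would retain or weight more heavily in the next round.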