A jack-knife based method for variable selection in partial least squares regression is presented. The method is based on significance tests of model parameters, applied in this paper to the regression coefficients. The method is tested on a near infrared (NIR) spectral data set recorded on beer samples and correlated to extract concentration, and it is compared to other methods with known merit. The results show that jack-knife based variable selection performs as well as or better than the other variable selection methods. Furthermore, the results show that the method is robust to the choice of cross-validation scheme (the number of segments and how they are chosen).
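The idea of jack-knife significance testing of regression coefficients can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single response (PLS1 fitted with a plain NIPALS loop), random cross-validation segments, a jack-knife variance of the form sum((b - b_m)^2) * (M - 1) / M over the M segment models, and a fixed cut-off on |t| in place of a proper t-test. The function names, the synthetic data, and the threshold of 2.0 are all hypothetical choices made for illustration.

```python
import numpy as np


def pls1_coefficients(X, y, n_components):
    """Fit PLS1 with NIPALS on mean-centered data; return coefficients."""
    Xk = X - X.mean(axis=0)
    yk = y - y.mean()
    p = Xk.shape[1]
    W = np.zeros((p, n_components))  # weight vectors
    P = np.zeros((p, n_components))  # X loadings
    q = np.zeros(n_components)       # y loadings
    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w
        tt = t @ t
        p_a = Xk.T @ t / tt
        q_a = yk @ t / tt
        # Deflate X and y before extracting the next component.
        Xk = Xk - np.outer(t, p_a)
        yk = yk - q_a * t
        W[:, a], P[:, a], q[a] = w, p_a, q_a
    # b = W (P^T W)^{-1} q
    return W @ np.linalg.solve(P.T @ W, q)


def jackknife_select(X, y, n_components=3, n_segments=10,
                     t_threshold=2.0, seed=0):
    """Flag variables whose jack-knife |t| exceeds an assumed cut-off."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    segments = np.array_split(rng.permutation(n), n_segments)
    b_full = pls1_coefficients(X, y, n_components)
    sq_dev = np.zeros_like(b_full)
    for seg in segments:
        keep = np.ones(n, dtype=bool)
        keep[seg] = False
        # Refit with one segment left out and accumulate squared deviations.
        b_m = pls1_coefficients(X[keep], y[keep], n_components)
        sq_dev += (b_full - b_m) ** 2
    se = np.sqrt(sq_dev * (n_segments - 1) / n_segments)
    se = np.maximum(se, np.finfo(float).tiny)  # guard against division by zero
    t_values = b_full / se
    return np.abs(t_values) > t_threshold, t_values


# Synthetic stand-in for spectral data: only the first three variables matter.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(200, 10))
y_demo = (3.0 * X_demo[:, 0] - 2.0 * X_demo[:, 1] + 2.0 * X_demo[:, 2]
          + 0.1 * rng.normal(size=200))
selected, t_values = jackknife_select(X_demo, y_demo)
```

Changing `n_segments` or `seed` in this sketch is a simple way to probe how sensitive the selection is to the cross-validation scheme, which is the robustness property the abstract refers to.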
In this paper, we present an approach for incorporating chemical band assignment information in regression models between spectra and constituents. It is shown how the matrices in this L-shaped data structure can be combined to give direct information about the relationships between theoretical chemical band assignments, spectral wavelengths and the responses. The chosen application is NIR spectroscopic measurements of canola seeds. Variable selection based on partial least squares regression using jack-knifing within a cross-model validation (CMV) framework is applied to remove non-relevant spectral regions. Extended multiplicative scatter correction is applied as a spectral pre-treatment to remove physical scatter effects in the spectra. The results show a high degree of correspondence between the wavelength bands found objectively by CMV and the chemical interpretation reported in the literature.