A new graphically oriented local modeling procedure called interval partial least-squares (iPLS) is presented for use on spectral data. The iPLS method is compared to full-spectrum partial least-squares and the variable selection methods principal variables (PV), forward stepwise selection (FSS), and recursively weighted regression (RWR). The methods are tested on a near-infrared (NIR) spectral data set recorded on 60 beer samples correlated to original extract concentration. The error of the full-spectrum correlation model between NIR and original extract concentration was reduced by a factor of 4 with the use of iPLS (r = 0.998, and root mean square error of prediction equal to 0.17% Plato), and the graphic output contributed to the interpretation of the chemical system under observation. The other methods tested gave a comparable reduction in the prediction error but lacked the interpretation advantage of the graphic interface. The intervals chosen by iPLS cover the variables found by FSS and all possible combinations, as well as the variables found by PV and RWR, and iPLS is still able to utilize the first-order advantage.
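The interval idea described above can be sketched as follows: split the wavelength axis into equal-width intervals, fit a local PLS model on each, and compare each interval's cross-validated error with that of the full-spectrum model. This is a minimal illustration, not the authors' implementation. The function names (`pls1_fit`, `rmsecv`, `ipls`), the interleaved fold scheme, and the fixed number of components are all assumptions made for the example, and cross-validated error is used here in place of the prediction error quoted in the abstract.

```python
import numpy as np

def pls1_fit(X, y, n_comp):
    """Fit single-response PLS by NIPALS; return (coefficients, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(min(n_comp, X.shape[1])):
        w = Xr.T @ yr
        norm = np.linalg.norm(w)
        if norm < 1e-12:              # no covariance left to model
            break
        w = w / norm                  # weight vector
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c = (yr @ t) / tt             # y loading
        Xr = Xr - np.outer(t, p)      # deflate X
        yr = yr - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def rmsecv(X, y, n_comp, n_folds=5):
    """Root mean square error of cross-validation with interleaved folds."""
    n = len(y)
    folds = np.arange(n) % n_folds
    residuals = np.empty(n)
    for f in range(n_folds):
        test = folds == f
        beta, xm, ym = pls1_fit(X[~test], y[~test], n_comp)
        residuals[test] = y[test] - ((X[test] - xm) @ beta + ym)
    return float(np.sqrt(np.mean(residuals ** 2)))

def ipls(X, y, n_intervals, n_comp=2, n_folds=5):
    """Return (intervals, per-interval RMSECV, full-spectrum RMSECV)."""
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    errors = [rmsecv(X[:, idx], y, n_comp, n_folds) for idx in intervals]
    full = rmsecv(X, y, n_comp, n_folds)
    return intervals, errors, full
```

The graphic output mentioned in the abstract would correspond to plotting `errors` as one bar per interval with `full` drawn as a horizontal reference line; plotting is left out to keep the sketch short.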
It is nowadays widely accepted that genetic algorithms (GAs) are powerful tools in variable selection and that, after suitable modifications, they can also be powerful in detecting the most relevant spectral regions for multivariate calibration. One of the main limitations of GAs is that when spectral intensities are measured at a very large number of wavelengths, the search domain increases correspondingly, and the detection of the relevant regions therefore becomes much more difficult. A modification of interval partial least squares (iPLS), designated backward interval PLS (biPLS), is developed and studied such that it can detect and remove the least relevant regions, thereby reducing the search domain to a size that GAs can handle easily. In this paper the application to two different spectroscopic data sets will be shown: infrared spectroscopic analysis of polymer film additives and determination of the contents of erucic acid and total fatty acids in Brassica seeds by near-infrared spectroscopy. The developed method is compared with model performances based on expert selection of variables as well as with results from application of the previously developed GA-PLS method. The sequential application of biPLS and GA-PLS has proven successful, and comparable or better results have been obtained, introducing a more automatic region selection procedure and a substantial decrease in computation time.
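The backward elimination step described above can be sketched as a greedy search: starting from all intervals, repeatedly drop the interval whose removal gives the lowest cross-validated error. This is only an illustration of the search strategy, not the published biPLS code. The PLS fitting and cross-validation are hidden behind a caller-supplied function, `cv_error`, which is a hypothetical name, and the tie-breaking rule (lowest interval index) is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

def backward_interval_elimination(
    n_intervals: int,
    cv_error: Callable[[Sequence[int]], float],
) -> List[Tuple[List[int], float]]:
    """Return the elimination path as (remaining intervals, error) pairs.

    cv_error is assumed to fit a model on the listed intervals and
    return its cross-validated error (for example, a PLS RMSECV).
    """
    remaining = list(range(n_intervals))
    path = [(remaining.copy(), cv_error(remaining))]
    while len(remaining) > 1:
        # Try leaving out each interval in turn and keep the removal
        # that yields the lowest error; ties go to the lowest index.
        candidates = []
        for k in remaining:
            subset = [i for i in remaining if i != k]
            candidates.append((cv_error(subset), k))
        best_err, drop = min(candidates)
        remaining.remove(drop)
        path.append((remaining.copy(), best_err))
    return path

def best_subset(path: List[Tuple[List[int], float]]) -> Tuple[List[int], float]:
    """Pick the point on the elimination path with the lowest error."""
    return min(path, key=lambda step: step[1])
```

In a sequential workflow of the kind the abstract describes, the variables in the intervals returned by `best_subset` would form the reduced search domain handed to the genetic algorithm.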
In this study, near-infrared (NIR) transmittance and Raman spectroscopy chemometric calibrations of the active substance content of a pharmaceutical tablet were developed using partial least-squares regression (PLS). Although the active substance contained the strongly Raman active C≡N functional group, the best results were obtained with NIR transmittance, which highlights the difference between (microscopic) surface sampling and whole tablet diffuse transmittance sampling. The tablets exist in four dosages with only two different concentrations of active substance (5 mg at 5.6% w/w, and 10, 15, and 20 mg at 8.0% w/w active substance per tablet). A calibration on all four dosages resulted in a prediction error, expressed as the root mean squared error of cross-validation (RMSECV), of 0.30% w/w for the NIR transmittance calibration. The corresponding error when using Raman spectra was 0.56% w/w. Specially prepared calibration batches covering the range 85–115% of the nominal content for each dosage were added to the first sample set, and NIR transmittance calibrations on this set, which contained coated as well as uncoated tablets, gave a further reduction in prediction errors to 0.21–0.289% w/w. This corresponds to relative prediction errors (RMSECV/y_nom) of 2.6–3.7%. This is a reasonably low error when compared to the error of the chromatographic reference method, which was estimated at 3.5%.
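The two error measures used above can be written out in a few lines: RMSECV from reference values and cross-validated predictions, and the relative prediction error as RMSECV divided by the nominal content. The function names are illustrative assumptions, and the abstract does not state which error is paired with which nominal content, so only one pairing is shown.

```python
import math

def rmsecv(y_ref, y_cv):
    """Root mean squared error between reference values and CV predictions."""
    if len(y_ref) != len(y_cv) or not y_ref:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((r - p) ** 2 for r, p in zip(y_ref, y_cv))
    return math.sqrt(squared / len(y_ref))

def relative_error_percent(rmsecv_value, y_nom):
    """Relative prediction error in percent of the nominal content."""
    return 100.0 * rmsecv_value / y_nom

# One pairing from the abstract: 0.21% w/w against 8.0% w/w nominal content.
low_end = relative_error_percent(0.21, 8.0)
```

Here `low_end` evaluates to about 2.6, in line with the lower end of the reported relative error range.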