In the present work, sparse-based methods are applied to the analysis of hyperspectral images in order to assess their suitability for variable selection in a classification framework. The key aspect of sparse methods is that they perform variable selection by forcing the model coefficients of irrelevant variables to zero. In particular, two sparse classification approaches, sPCA+kNN and sPLS-DA, were compared with the corresponding classical methods (PCA+kNN and PLS-DA) for the classification of Arabica and Robusta coffee species. Green coffee samples were analyzed using near-infrared hyperspectral imaging, and the average spectrum of each hyperspectral image was used to build the training and test sets; furthermore, a test image was used to evaluate the performance of the considered methods at the pixel level. In our case, sparse methods gave results similar to those of the classical methods, with the advantage of yielding more interpretable and parsimonious models. An important result is that the variable selection performed with the two different sparse classification approaches converged on the same spectral regions, which implies that those regions are chemically relevant to the discrimination of Arabica and Robusta coffee species.
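As a rough illustration of how a sparse method can drive the coefficients of irrelevant variables to exactly zero, the sketch below applies soft thresholding to a vector of weights. This is an assumption for illustration only: the abstract does not state which penalty or algorithm was used, and soft thresholding is simply a common mechanism behind L1-penalized models. The function names, the weight values, and the penalty are all hypothetical.

```python
def soft_threshold(coefficients, penalty):
    """Shrink each coefficient toward zero by `penalty`.

    Coefficients whose absolute value does not exceed `penalty`
    become exactly 0.0, which is what removes a variable from the model.
    """
    shrunk = []
    for c in coefficients:
        magnitude = abs(c) - penalty
        if magnitude <= 0:
            shrunk.append(0.0)
        else:
            shrunk.append(magnitude if c > 0 else -magnitude)
    return shrunk


def selected_variables(coefficients):
    """Return the indices of variables whose coefficient is non-zero."""
    return [i for i, c in enumerate(coefficients) if c != 0.0]


# Hypothetical weights for six spectral variables (illustrative values only).
weights = [0.05, -0.40, 0.02, 0.75, -0.08, 0.30]
sparse_weights = soft_threshold(weights, penalty=0.10)
kept = selected_variables(sparse_weights)
print(kept)
```

In this toy case only the variables with the largest absolute weights remain in `kept`; a larger penalty would keep fewer variables, which is the trade-off between parsimony and fit that a sparse model has to tune.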
Hyperspectral imaging makes it easy to acquire tens of thousands of spectra from a single sample in a few seconds; though valuable, this data richness poses many problems, because it is difficult to handle a representative number of samples together. For this reason, we recently proposed an approach based on reducing each image to a one-dimensional signal, named the hyperspectrogram, which accounts for both spatial and spectral information. In this manner, a dataset of hyperspectral images can be easily and quickly converted into a set of signals (a 2D data matrix), which in turn can be analyzed using classical chemometric techniques. In this work, the hyperspectrograms obtained from a dataset of 800 NIR hyperspectral images of two different apple varieties were used to discriminate bruised from sound apples, using iPLS-DA as the variable selection algorithm, which allowed the presence of bruises to be detected efficiently. Moreover, reconstructing the selected variables as images confirmed that the automated procedure led to the exact identification of the spatial features related to the onset of the bruise defect and to its subsequent evolution over time.
Hyperspectral sensors are a powerful tool for the chemical mapping of solid-state samples, since they provide spectral information localized in the image domain in very short times and without the need for sample pretreatment. However, because of the large size of each hyperspectral image, data dimensionality reduction (DR) is necessary in order to develop hyperspectral sensors for real-time monitoring of large sets of samples with different characteristics. In this work, we focused on DR methods that convert the three-dimensional data array corresponding to each hyperspectral image into a one-dimensional signal (1D-DR) that retains spectral and/or spatial information. In this way, large datasets of hyperspectral images can be converted into matrices of signals, which in turn can be easily processed using suitable multivariate statistical methods. Different 1D-DR methods highlight different aspects of the hyperspectral image dataset. Therefore, in order to investigate their advantages and disadvantages, we compared three 1D-DR methods: average spectrum (AS), single space hyperspectrogram (SSH) and common space hyperspectrogram (CSH). We considered 370 NIR hyperspectral images of a set of green coffee samples, and the three 1D-DR methods were tested for their effectiveness in sensor fault detection, data structure exploration and sample classification according to coffee variety and coffee processing method. Principal component analysis and partial least squares-discriminant analysis were used to compare the three DR methods. Furthermore, low-level and mid-level data fusion were also employed to test the advantages of using AS, SSH and CSH together.
Graphical abstract: key steps in hyperspectral data dimensionality reduction.
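The simplest of the three 1D-DR methods, the average spectrum (AS), can be sketched as follows: each image, stored as a rows x columns x bands array, is collapsed to one mean spectrum, and the spectra of all images are stacked into a matrix of signals. This is a minimal sketch under stated assumptions rather than the authors' implementation: the nested-list layout, the function names, and the tiny example cubes are hypothetical, and real images would normally need background removal and preprocessing that are not shown here. The SSH and CSH hyperspectrograms are not sketched, because the abstract does not describe how they are built.

```python
def average_spectrum(cube):
    """Collapse a rows x cols x bands hypercube into one mean spectrum."""
    # Flatten the spatial dimensions: every pixel is a list of band values.
    pixels = [pixel for row in cube for pixel in row]
    n_bands = len(pixels[0])
    return [sum(p[b] for p in pixels) / len(pixels) for b in range(n_bands)]


def build_signal_matrix(cubes):
    """Stack one average spectrum per image into an images x bands matrix."""
    return [average_spectrum(cube) for cube in cubes]


# Two hypothetical images with three spectral bands each (illustrative values only).
image_a = [
    [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
    [[5.0, 6.0, 7.0], [7.0, 8.0, 9.0]],
]
image_b = [
    [[0.0, 1.0, 2.0], [2.0, 3.0, 4.0]],
]

signals = build_signal_matrix([image_a, image_b])
print(signals)
```

Images of different spatial size yield rows of the same length, which is what allows the resulting matrix to be passed to methods such as PCA or PLS-DA. The cost is that AS discards all spatial information, which is the gap the hyperspectrogram approaches aim to fill.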