Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate methods for spectral data analysis. It extracts latent variables and uses them to predict responses, which makes it well suited to high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., a multimodal distribution of samples within a class. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA), which addresses multimodality and nonlinearity explicitly and improves the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide the samples into clusters and calculates the centre of every cluster. For a given query point, only the clusters whose centres are nearest to that query point are used for PLS-DA. The method thus provides a simple and effective tool for separating multimodal and nonlinear classes into clusters that are locally linear and unimodal. Experimental results on 17 datasets, comprising 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time.
With the organic food market on the rise, organic food fraud has become a concern for consumers, producers and the market. Traditional methods of food quality determination are time-consuming and require expert laboratory analysis. Recent studies have shown the potential of spectroscopic analysis for non-destructive food analysis. This paper explores the use of low-cost Near Infrared Spectroscopy (NIRS) combined with a pattern recognition approach for differentiating organic from non-organic apples. The spectra of organic and non-organic Gala apples are measured using a low-cost, portable NIR spectrometer. A pattern recognition pipeline is proposed in which the spectral data are pre-processed and then classified as organic or non-organic. Baseline correction and normalization are used in pre-processing, and Partial Least Squares Discriminant Analysis (PLS-DA) is used for classification. The experimental results show that the apple samples can be classified as organic or non-organic with accuracies of over 96%. These results, together with the fact that the NIR spectrometer used was low-cost and portable, suggest that this is potentially a cost-effective solution for detecting organic food fraud.