Several variable selection algorithms in multivariate calibration can be accelerated using Graphics Processing Units (GPUs). Among these algorithms, the Firefly Algorithm (FA) is a recently proposed metaheuristic that may be used for variable selection. This paper presents a GPU-based FA (FA-MLR) with a multiobjective formulation for variable selection in multivariate calibration problems and compares it with some traditional sequential algorithms in the literature. The advantage of the proposed implementation is demonstrated in an example involving a relatively large number of variables. The results showed that the FA-MLR, in comparison with the traditional algorithms, is a more suitable choice and a relevant contribution for the variable selection problem. Additionally, the results demonstrated that the FA-MLR run on a GPU can be five times faster than its sequential implementation.
Wheat is the third most produced grain in the world after maize and rice. Determining the protein concentration in wheat grain is one of the major challenges in measuring its industrial quality. Wheat samples can be measured using a spectrophotometer. The challenge is to associate the energy absorbed by the device with the protein concentration in wheat. The device measures hundreds of variable intensities that can be related to the physicochemical properties. The selection of a subset of uncorrelated variables has been shown to be fundamental for establishing correct correlations and reducing prediction error. A new formulation of a compact genetic algorithm that uses only a mutation operator is proposed. The results produced by the proposed approach are compared with traditional techniques for spectroscopy variable selection, such as the successive projections algorithm, partial least squares, and classical formulations of genetic algorithms. For near-infrared spectral analysis of the protein concentration in wheat, the prediction errors decreased from 0.28 to 0.10 on average, a reduction of 63%.
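As a rough illustration of what a mutation operator does in variable selection, the sketch below flips entries of a boolean mask over the spectral variables. It is a generic bit-flip mutation, not the compact genetic algorithm formulation proposed in the abstract; the function name `mutate`, the `rate` parameter, and the five-variable mask are all assumptions made for the example.

```python
import random
from typing import List


def mutate(mask: List[bool], rate: float, rng: random.Random) -> List[bool]:
    """Return a copy of mask in which each entry is flipped with probability rate."""
    return [(not bit) if rng.random() < rate else bit for bit in mask]


# Hypothetical mask over five variables: True means the variable is selected.
rng = random.Random(0)
parent = [True, False, False, True, False]
child = mutate(parent, rate=0.2, rng=rng)

# Indices of the variables the child would pass to the calibration model.
selected = [i for i, keep in enumerate(child) if keep]
```

In a full search loop, the child would replace the parent only if its calibration model gave a lower prediction error on validation samples.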
This paper proposes a multi-objective genetic algorithm for the problem of variable selection in multivariate calibration. We consider the problem of classifying biodiesel samples to detect adulteration using a Linear Discriminant Analysis classifier. The goal of the multi-objective algorithm is to reduce the dimensionality of the original set of variables; thus, the classification model can be less sensitive, providing a better generalization capacity. In particular, in this paper we adopt a version of the Non-dominated Sorting Genetic Algorithm (NSGA-II) and compare it to a mono-objective Genetic Algorithm (GA) in terms of sensitivity in the presence of noise. Results show that the mono-objective formulation selects 20 variables on average and presents an error rate of 14%. On the other hand, the multi-objective formulation selects 7 variables and has an error rate of 11%. Consequently, we show that the multi-objective formulation provides classification models with lower sensitivity to instrumental noise when compared to the mono-objective formulation.
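The comparison above can be read in terms of Pareto dominance, the relation that NSGA-II uses to rank candidate solutions. The sketch below is a minimal illustration under the assumption that the two objectives are the number of selected variables and the error rate, both minimized; it is not the paper's implementation. The first two candidates use the averages reported in the abstract, and the third is a made-up point added only to show a trade-off.

```python
from typing import List, Tuple

# (number of selected variables, error rate); both are minimized here (assumption).
Candidate = Tuple[int, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    no_worse = a[0] <= b[0] and a[1] <= b[1]
    strictly_better = a[0] < b[0] or a[1] < b[1]
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


ga_average = (20, 0.14)    # mono-objective GA, as reported in the abstract
nsga_average = (7, 0.11)   # multi-objective NSGA-II, as reported in the abstract
hypothetical = (5, 0.13)   # made-up candidate, for illustration only

front = pareto_front([ga_average, nsga_average, hypothetical])
```

Here the NSGA-II average dominates the GA average, since it uses fewer variables and has a lower error rate. The made-up candidate stays on the front because it trades a higher error rate for fewer variables.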