Curve fitting is one of the most important methods for extracting not only chemical but also biological information from data obtained under perturbation. [1] By assuming a proper model function, curve fitting enables one to describe in detail the changes caused by a perturbation. However, it is sometimes difficult to obtain a desirable fit between the observed data and the model function. One of the main reasons for this problem is that all fitting parameters are determined by a least squares (LS) estimator. An LS estimator determines all parameters of a model function by minimizing the sum of squared residuals between the observed and predicted values. Such an estimator is strongly influenced by outliers in a data set. For example, the LS estimator has a breakdown value of 0%; that is, even a single outlier in the data can strongly affect it. [2] It is therefore not adequate to apply such curve fitting to a data set with strong noise. When fitting curves to such noisy data, it is more desirable to obtain the fitting parameters with a robust estimator that is less influenced by outliers. Therefore, we introduce a new curve-fitting technique based on a robust estimator, least median squares (LMedS). The LMedS estimator is defined as the parameter set that minimizes the median of the squared residuals between the observed and predicted values. The LMedS estimator is remarkably tolerant of outliers, withstanding up to 50% of outliers in a data set. [3,4] Despite this theoretical advantage, it is computationally more demanding than the LS estimator. Generally, a Monte Carlo type technique is used to obtain a satisfactory approximation of the LMedS estimator because, so far, there is no straightforward formula for it.
Such a technique [6] needs many iterations, and its computational cost increases with the number of parameters in the model function. In this study, particle swarm optimization (PSO) is introduced to obtain the LMedS estimator for our curve fits. PSO, developed by Eberhart and Kennedy in 1995, is a stochastic search method inspired by the social behavior of bird flocking. Similar to the genetic algorithm (GA), PSO is a population-based optimization tool that searches for optima by updating generations. [5][6][7][8][9] However, unlike GA, [10] PSO includes no evolution operators such as crossover and mutation. Compared with GA, a remarkable advantage of PSO is that its algorithm is conceptually very simple, its computational cost is low, and few adjustable parameters are needed.

The proposed robust curve-fitting method is demonstrated with simulated data and temperature-dependent near-infrared (NIR) spectra of oleic acid (OA). The results show that the proposed method is more effective than a standard curve fit based on the LS estimator when the data set has a high level of noise.
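As an illustration of the idea outlined above, the following minimal Python sketch uses a basic global-best PSO to approximate the LMedS fit of a straight line and compares it with an ordinary LS fit on data containing gross outliers. It is not the implementation used in this study: the straight-line model, the synthetic data, the function names, the swarm size, the iteration count, the search bounds, and the inertia and acceleration coefficients (w, c1, c2) are all assumptions chosen for illustration.

```python
import numpy as np


def median_sq_residual(params, x, y):
    """LMedS objective: median of squared residuals for the line y = a*x + b."""
    a, b = params
    return np.median((y - (a * x + b)) ** 2)


def pso_minimize(objective, bounds, n_particles=60, n_iter=300,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Basic global-best PSO. All hyperparameter values are assumed, not tuned."""
    rng = np.random.default_rng(seed)
    lo, hi = np.asarray(bounds, dtype=float).T  # lower and upper limits per parameter
    pos = rng.uniform(lo, hi, size=(n_particles, lo.size))
    vel = np.zeros_like(pos)
    pbest = pos.copy()                                   # best position of each particle
    pbest_val = np.array([objective(p) for p in pos])
    g = np.argmin(pbest_val)
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]     # best position of the swarm
    for _ in range(n_iter):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal best + pull toward swarm best.
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lo, hi)                 # keep particles inside the bounds
        val = np.array([objective(p) for p in pos])
        improved = val < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = val[improved]
        g = np.argmin(pbest_val)
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, gbest_val


def demo(seed=1):
    """Fit a noisy line (true slope 2, intercept 1) with 30% gross outliers."""
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 10.0, 50)
    y = 2.0 * x + 1.0 + rng.normal(0.0, 0.1, x.size)
    y[:15] += rng.uniform(10.0, 40.0, 15)                # outliers at the low-x end
    ls_slope, ls_intercept = np.polyfit(x, y, 1)         # ordinary LS fit
    (a, b), _ = pso_minimize(lambda p: median_sq_residual(p, x, y),
                             bounds=[(-10.0, 10.0), (-50.0, 50.0)])
    return (ls_slope, ls_intercept), (a, b)


if __name__ == "__main__":
    ls_fit, lmeds_fit = demo()
    print("LS fit (slope, intercept):   ", ls_fit)
    print("LMedS fit (slope, intercept):", lmeds_fit)
```

In this sketch the only change between the two fits is the objective: the LS fit minimizes the sum of squared residuals, whereas the PSO search minimizes their median, so the outlying points have little influence on the latter as long as they remain a minority of the data.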
Theory
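LMedS estimator
We are given a set of n observations in the plane, (xi, yi) for i = 1, ..., n, and consider a linear regression model. The LMedS estimator is defined as the set of model parameters that minimizes the median, taken over i = 1, ..., n, of the squared residuals between the observed and predicted values.

A curve fitting technique for optical spectra based on a robust estimator, least ...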
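The definition of the LMedS estimator given in this section can be written out explicitly as follows. This is a sketch under the usual simple-regression notation rather than a reproduction of the original equations: the symbols a (slope), b (intercept), e_i (error term) and r_i (residual) are assumptions introduced here. The LS criterion is included only for comparison.

```latex
\begin{align}
  y_i &= a x_i + b + e_i, \qquad i = 1, \dots, n \\
  r_i(a, b) &= y_i - (a x_i + b) \\
  (\hat{a}, \hat{b})_{\mathrm{LS}} &= \arg\min_{a,\, b} \sum_{i=1}^{n} r_i^2(a, b) \\
  (\hat{a}, \hat{b})_{\mathrm{LMedS}} &= \arg\min_{a,\, b} \; \operatorname*{med}_{i = 1, \dots, n} \; r_i^2(a, b)
\end{align}
```

Replacing the sum by the median is what makes the criterion insensitive to a minority of large residuals, at the price of an objective that has no closed-form minimizer.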