Chemometric methods are broadly used in the chemical and biochemical
sectors. Typically, derivation of a regression model follows data
preprocessing in a sequential manner. However, preprocessing can significantly
influence the regression model and ultimately its predictive ability.
In this work, we investigate the coupling of preprocessing and model
parameter estimation by incorporating them simultaneously in an optimization
step. Common model selection techniques rely almost exclusively on
an accuracy metric, yet a quantitative metric for model robustness
can prolong model up-time. Our approach
is applied to optimize for model accuracy and robustness. This requires
the introduction of a novel mathematical definition for robustness.
We test our method in a simulated setup and on industrial case
studies from multivariate calibration. The results highlight the importance
of both accuracy and robustness properties and illustrate the potential
of the proposed optimization approach toward automating the generation
of efficient chemometric models.