Estimation of unknown dynamics is what system identification is about, and it is a core problem in adaptive control and adaptive signal processing. It has long been known that regularization can be quite beneficial for general inverse problems, of which system identification is an example. But only recently, partly under the influence of machine learning, has the use of well-tuned regularization for estimating linear dynamical systems been investigated more seriously. In this presentation we review these new results and discuss what they may mean for the theory and practice of dynamical model estimation in general.

Estimation is about information in data. It is the question of squeezing out all relevant information in the data, but not more. In addition to the relevant information, all measured data also contain irrelevant information, or misinformation. In engineering we talk about signal and noise. To handle the information without getting fooled by the misinformation, it is necessary to meet the data with some kind of prejudice.

Noise - Model Structure

A typical prejudice used when building a model from data from a system is that "nature is simple": it should be possible to describe the system with a model of some simple "structure"; the model should belong to a set of models with restricted complexity, or have smooth responses in some sense. This should put a restriction on how much unstructured noise may affect the model.
Variance - Bias

It is important to realize that the error in an estimated model has two sources: (1) we may have used too many constraints and restrictions, "too simple model sets", which gives rise to a bias error or systematic error; (2) the data are corrupted by noise, which gives rise to a variance error or random error. Minimizing the mean-square error (MSE) is a trade-off in constraining the model: a flexible model gives small bias (it is easier to describe complex behaviour) and large variance (with a flexible model it is easier to get fooled by the noise), and vice versa. This trade-off is at the heart of all estimation problems.

Data Fit - Regularization

So, we should keep a keen eye both on how well the model is capable of reproducing the measured data and on the complexity of the model:

Criterion = Fit to data + Penalty for model flexibility        (2)

This codifies the basic elements in estimation. A very common way to handle the flexibility constraint is to simply restrict the model class. If an explicit penalty is added, this is known as regularization.
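To make the fit-plus-penalty idea in (2) concrete, here is a minimal numerical sketch (not from the presentation; the FIR model order, noise level, and regularization weights are illustrative assumptions). It estimates an impulse response by ridge-regularized least squares: with no penalty the flexible model fits the noise (large variance), while a heavy penalty shrinks the estimate toward zero (large bias).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" system: a smoothly decaying impulse response of length 20
g_true = 0.8 ** np.arange(1, 21)
N = 200
u = rng.standard_normal(N)                                       # white-noise input
g_full = np.concatenate([[0.0], g_true])                         # one-sample delay
y = np.convolve(u, g_full)[:N] + 0.5 * rng.standard_normal(N)    # noisy output

# Regressor matrix for a deliberately over-flexible FIR model of order n:
# column k-1 holds u(t-k), so y(t) is modelled as Phi[t, :] @ theta
n = 50
Phi = np.column_stack(
    [np.concatenate([np.zeros(k), u[: N - k]]) for k in range(1, n + 1)]
)

def fir_estimate(lam):
    """Minimize ||y - Phi theta||^2 + lam ||theta||^2 (ridge regularization)."""
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(n), Phi.T @ y)

for lam in [0.0, 1.0, 100.0]:
    theta = fir_estimate(lam)
    # Error of the estimated impulse response against the true one
    err = np.sum((theta[:20] - g_true) ** 2) + np.sum(theta[20:] ** 2)
    print(f"lambda = {lam:6.1f}   squared impulse-response error = {err:.3f}")
```

The quadratic penalty lam*||theta||^2 is only the simplest explicit regularizer; the well-tuned regularization referred to in the introduction replaces the identity weighting with a structured matrix encoding prior knowledge such as smoothness and exponential decay of the impulse response.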
A FORMAL CRITERION

We shall here give a more specific version of (2). A model structure M is a parameterized collection of models that describes the relations between the input and output signals of the system. The parameters are denoted by θ, so M(θ) is a particular model. The model set then is

M = { M(θ) | θ ∈ D_M }        (3)

That model gives a rule to predict (one step ahead) the output at time t, i.e. y(t), based on observations of previous input-output data up to time t − 1 (denoted by Z^{t−1}):

ŷ(t|θ) = g(t, θ, Z^{t−1})        (4)

It is natural to compare the model-predicted values (4) with the actual outputs and form the prediction errors ε(t, ...
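As a concrete, purely hypothetical instance of the predictor (4) and the prediction errors it defines, the sketch below uses a first-order ARX model structure, y(t) = −a·y(t−1) + b·u(t−1) + e(t) with θ = (a, b); none of the numbers are from the presentation, and the standard definition ε(t, θ) = y(t) − ŷ(t|θ) is assumed.

```python
import numpy as np

def predict_one_step(theta, y, u):
    """One-step-ahead predictor yhat(t|theta) = -a*y(t-1) + b*u(t-1)."""
    a, b = theta
    yhat = np.zeros_like(y)
    yhat[1:] = -a * y[:-1] + b * u[:-1]    # depends only on past data Z^{t-1}
    return yhat

def prediction_errors(theta, y, u):
    """eps(t, theta) = y(t) - yhat(t|theta)."""
    return y - predict_one_step(theta, y, u)

# Illustrative data from a first-order system, then errors for a candidate model M(theta)
rng = np.random.default_rng(1)
N = 100
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.7 * y[t - 1] + 1.0 * u[t - 1] + 0.1 * rng.standard_normal()

theta = (-0.7, 1.0)                        # a particular parameter value in D_M
eps = prediction_errors(theta, y, u)
print("mean squared prediction error:", np.mean(eps[1:] ** 2))
```

In a prediction-error framework, these errors measure the data fit that the criterion (2) weighs against the flexibility penalty.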