Estimation of unknown dynamics is what system identification is about, and it is a core problem in adaptive control and adaptive signal processing. It has long been known that regularization can be quite beneficial for general inverse problems, of which system identification is an example. But only recently, partly under the influence of machine learning, has the use of well-tuned regularization for estimating linear dynamical systems been investigated more seriously. In this presentation we review these new results and discuss what they may mean for the theory and practice of dynamical model estimation in general.

Estimation is about information in data. It is the question of squeezing out all relevant information in the data, but not more. In addition to the relevant information, all measured data also contain irrelevant information, or misinformation. In engineering we talk about signal and noise. To handle the information without getting fooled by the misinformation, it is necessary to meet the data with some kind of prejudice.

Noise - Model Structure

A typical prejudice used when building a model from data from a system is that "nature is simple": it should be possible to describe the system with a model of some simple "structure"; the model should belong to a set of models with restricted complexity, or have smooth responses in some sense. This should put a restriction on how much unstructured noise may affect the model.
Variance - Bias

It is important to realize that the error in an estimated model has two sources: (1) we may have used too many constraints and restrictions, "too simple model sets", which gives rise to a bias error or systematic error; (2) the data are corrupted by noise, which gives rise to a variance error or random error. Minimizing the mean-square error (MSE) is a trade-off in constraining the model: a flexible model gives small bias (it is easier to describe complex behaviour) and large variance (with a flexible model it is easier to get fooled by the noise), and vice versa. This trade-off is at the heart of all estimation problems.

Data Fit - Regularization

So, we should keep a keen eye both on how well the model is capable of reproducing the measured data and on the complexity of the model:

Criterion = Fit to data + Penalty for model flexibility        (2)

This codifies the basic elements in estimation. A very common way to handle the flexibility constraint is to simply restrict the model class. If an explicit penalty is added, this is known as regularization.
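To make the fit-plus-penalty idea in (2) concrete, here is a minimal numerical sketch (not from the presentation; the FIR model order, noise level, and regularization weights are illustrative assumptions). It estimates an impulse response by ridge-regularized least squares: with no penalty the flexible model fits the noise (large variance), while a heavy penalty shrinks the estimate toward zero (large bias).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" system: a smoothly decaying impulse response of length 20
g_true = 0.8 ** np.arange(1, 21)
N = 200
u = rng.standard_normal(N)                                       # white-noise input
g_full = np.concatenate([[0.0], g_true])                         # one-sample delay
y = np.convolve(u, g_full)[:N] + 0.5 * rng.standard_normal(N)    # noisy output

# Regressor matrix for a deliberately over-flexible FIR model of order n:
# column k-1 holds u(t-k), so y(t) is modelled as Phi[t, :] @ theta
n = 50
Phi = np.column_stack(
    [np.concatenate([np.zeros(k), u[: N - k]]) for k in range(1, n + 1)]
)

def fir_estimate(lam):
    """Minimize ||y - Phi theta||^2 + lam ||theta||^2 (ridge regularization)."""
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(n), Phi.T @ y)

for lam in [0.0, 1.0, 100.0]:
    theta = fir_estimate(lam)
    # Error of the estimated impulse response against the true one
    err = np.sum((theta[:20] - g_true) ** 2) + np.sum(theta[20:] ** 2)
    print(f"lambda = {lam:6.1f}   squared impulse-response error = {err:.3f}")
```

The quadratic penalty lam*||theta||^2 is only the simplest explicit regularizer; the well-tuned regularization referred to in the introduction replaces the identity weighting with a structured matrix encoding prior knowledge such as smoothness and exponential decay of the impulse response.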
A FORMAL CRITERION

We shall here give a more specific version of (2). A model structure M is a parameterized collection of models that describes the relations between the input and output signals of the system. The parameters are denoted by θ, so M(θ) is a particular model. The model set then is

M = { M(θ) | θ ∈ D_M }        (3)

That model gives a rule to predict (one step ahead) the output at time t, i.e. y(t), based on observations of previous input-output data up to time t − 1 (denoted by Z^{t−1}):

ŷ(t|θ) = g(t, θ, Z^{t−1})        (4)

It is natural to compare the model-predicted values (4) with the actual outputs and form the prediction errors ε(t, ...
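As a concrete, purely hypothetical instance of the predictor (4) and the prediction errors it defines, the sketch below uses a first-order ARX model structure, y(t) = −a·y(t−1) + b·u(t−1) + e(t) with θ = (a, b); none of the numbers are from the presentation, and the standard definition ε(t, θ) = y(t) − ŷ(t|θ) is assumed.

```python
import numpy as np

def predict_one_step(theta, y, u):
    """One-step-ahead predictor yhat(t|theta) = -a*y(t-1) + b*u(t-1)."""
    a, b = theta
    yhat = np.zeros_like(y)
    yhat[1:] = -a * y[:-1] + b * u[:-1]    # depends only on past data Z^{t-1}
    return yhat

def prediction_errors(theta, y, u):
    """eps(t, theta) = y(t) - yhat(t|theta)."""
    return y - predict_one_step(theta, y, u)

# Illustrative data from a first-order system, then errors for a candidate model M(theta)
rng = np.random.default_rng(1)
N = 100
u = rng.standard_normal(N)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.7 * y[t - 1] + 1.0 * u[t - 1] + 0.1 * rng.standard_normal()

theta = (-0.7, 1.0)                        # a particular parameter value in D_M
eps = prediction_errors(theta, y, u)
print("mean squared prediction error:", np.mean(eps[1:] ** 2))
```

In a prediction-error framework, these errors measure the data fit that the criterion (2) weighs against the flexibility penalty.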