In this short note we consider the differentiable evaluation of the objective function of a sampling design optimization problem based on the inverse of the Fisher information matrix, where the integer design variables have been converted into real variables using a relaxation technique. To ensure differentiability over the full range of the variables, and thus improve the convergence behavior of derivative-based optimization algorithms, we propose applying a Cholesky decomposition to the Fisher information matrix, using higher-precision floating-point arithmetic to ensure stability. While each evaluation of the functional becomes slower, the algorithm is much simpler, amenable to direct use with automatic differentiation, and can be shown to be very stable. For many practical situations, this is a valuable trade-off.
Problem formulation

Suppose you want to estimate the parameters $p$ of a time-dependent differential equation model of some phenomenon by carrying out an experiment. Depending on the situation, measurements might be expensive, and thus there is value in choosing, from a predetermined set of candidate measurement times, an optimal subset from which to obtain reliable estimates. An approach that works in practice is: assume that you know the correct parameters (this will be the initial guess $p_0$ for the estimation). Then, perform simulated measurements at the candidate time points $t_0 \le t_1 < t_2 < \dots < t_N \le t_f$, and set up the expected least-squares estimation problem
$$\min_p \; \frac{1}{2} \sum_{i=1}^{N} w_i \left( h(y(t_i; p)) - \eta_i \right)^2,$$
where $h$ is the measurement function (for space and simplicity reasons assumed scalar in this note), $\eta_i$ are the simulated measurements, and the variance of the measurement error is assumed to be 1 for simplicity. The $w_i$ are weights we give different hypothetical measurements, and are the degrees of freedom subject to optimization in the classical experiment design problem [1]. Ideally, the $w_i$ should be in $\{0, 1\}$; we consider them in $[0, 1]$ to obtain the so-called relaxed formulation.

The variance-covariance matrix associated to the least-squares problem is, for a vector of weights $w \in \mathbb{R}^N$, given by $M(w) = F(w)^{-1} = (J(w)^T J(w))^{-1}$, where the $i$-th row of $J(w)$ is $\sqrt{w_i}\,\partial_p h(y(t_i; p))$. A typical sampling design problem then looks as follows, where we use the trace of the variance-covariance matrix as a quality measure (the so-called A-criterion):
$$\min_{w \in [0,1]^N} \; \operatorname{Tr}(M(w)) \quad \text{s.t.} \quad \sum_{i=1}^{N} w_i \le w_{\max},$$
where $w_{\max}$ denotes the maximum number of measurements to be taken. Such problems can be solved reliably and efficiently using SQP methods [2], and lead to more complex problems if there are external controls on the differential equation [3,4]. The issue we will address in what follows is the numerical evaluation of the functional $\Psi(w) := \operatorname{Tr}(M(w))$ and its derivatives.
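As an illustration, $\Psi(w)$ can be evaluated directly from its definition. The following sketch (Python with NumPy; the function name, the toy Jacobian, and the weights are illustrative assumptions, not part of the note) scales the rows of a precomputed sensitivity matrix by $\sqrt{w_i}$, forms the Fisher matrix, and takes the trace of its inverse:

```python
import numpy as np

def a_criterion(w, J):
    """Psi(w) = Tr(M(w)) with M(w) = (J(w)^T J(w))^{-1},
    where row i of J(w) is sqrt(w_i) times row i of J."""
    Jw = np.sqrt(w)[:, None] * J           # scale each row by sqrt(w_i)
    F = Jw.T @ Jw                          # Fisher information matrix
    return np.trace(np.linalg.inv(F))      # A-criterion

# toy example: N = 5 candidate times, 2 parameters
rng = np.random.default_rng(0)
J = rng.standard_normal((5, 2))
w = np.array([1.0, 0.5, 0.0, 1.0, 0.25])
print(a_criterion(w, J))
```

Note that the $\sqrt{w_i}$ scaling is where the differentiability issue at $w_i = 0$ enters, which motivates the discussion below.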
Numerical evaluation of the sampling design objective function

One immediately realizes that the presence of the squared matrix $J(w)^T J(w)$ induces stability issues: its condition number is the square of that of $J(w)$. The usual way of computing $M(w)$ is to compute the thin QR decomposition of $J(w)$ and set $M(w) = R(w)^{-1} R(w)^{-T}$, a formula which uses the fact that $Q(w)^T Q(w) = I$. This is the f...
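The QR-based evaluation just described can be sketched as follows (Python/NumPy; a hypothetical helper, not the note's implementation). It never forms $J(w)^T J(w)$ explicitly, and uses the identity $\operatorname{Tr}(R^{-1} R^{-T}) = \|R^{-1}\|_F^2$:

```python
import numpy as np

def a_criterion_qr(w, J):
    """Psi(w) = Tr(M(w)) via the thin QR factorization J(w) = Q R,
    so that M(w) = (R^T Q^T Q R)^{-1} = R^{-1} R^{-T}."""
    Jw = np.sqrt(w)[:, None] * J                   # row i scaled by sqrt(w_i)
    _, R = np.linalg.qr(Jw, mode="reduced")        # thin QR; R is p x p
    Rinv = np.linalg.solve(R, np.eye(R.shape[1]))  # invert the small factor
    return np.sum(Rinv**2)                         # Tr(R^{-1} R^{-T}) = ||R^{-1}||_F^2
```

Because only $R$ is inverted, the conditioning of this route is governed by $\kappa(J(w))$ rather than $\kappa(J(w))^2$, which is the standard argument for preferring QR over the normal equations.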