Background
Estimating the global optimum of multiple model parameters is valuable for precisely extracting the parameters that characterize a physical environment. This is especially useful in imaging, where it helps form reliable, physically meaningful images with good reproducibility. However, it is difficult to avoid local minima when the objective function is nonconvex. The global search over multiple parameters was formulated as a k-D move in the parameter space, and the parameter-updating scheme was converted into a state-action decision-making problem.
Methods
We proposed a novel Deep Q-learning of Model Parameters (DQMP) method for global optimization, which updated the parameter configuration through actions that maximized the Q-value and employed a Deep Reward Network (DRN) designed to learn global reward values from both visible fitting errors and hidden parameter errors. The DRN was constructed with Long Short-Term Memory (LSTM) layers followed by fully connected layers and a rectified linear unit (ReLU) nonlinearity. The depth of the DRN depended on the number of parameters. Through DQMP, the k-D parameter search in each step resembled selecting an action from 3^k configurations in a k-D board game.
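To make the 3^k action space concrete, the following minimal Python sketch enumerates the candidate moves and applies the one with the highest Q-value. It rests on assumptions not stated in the abstract: each parameter is taken to step down, stay, or step up by a fixed step size, and `toy_q`, `TARGET`, and `STEPS` are hypothetical stand-ins for a learned Q-function and a real problem. It illustrates only the greedy action selection, not the DRN or the training procedure.

```python
from itertools import product


def enumerate_actions(k):
    """Return all 3^k moves; each parameter steps down (-1), stays (0), or steps up (+1).

    The three-way choice per parameter is an assumption made for illustration.
    """
    return list(product((-1, 0, 1), repeat=k))


def greedy_step(params, step_sizes, q_value):
    """Apply the action with the highest Q-value to the current parameters."""
    best = max(enumerate_actions(len(params)), key=lambda a: q_value(params, a))
    return [p + a * s for p, a, s in zip(params, best, step_sizes)]


# Hypothetical toy problem: the Q-value is the negative squared distance
# to a known target after the move. A learned network would replace this.
TARGET = [1.0, -2.0]
STEPS = [0.5, 0.5]


def toy_q(params, action):
    moved = [p + a * s for p, a, s in zip(params, action, STEPS)]
    return -sum((m - t) ** 2 for m, t in zip(moved, TARGET))


params = [0.0, 0.0]
for _ in range(6):
    params = greedy_step(params, STEPS, toy_q)

print(len(enumerate_actions(2)), params)
```

Because the number of actions grows as 3^k, exhaustive enumeration of this kind is practical only for small k; the sketch is meant to show the structure of the decision, not a scalable search.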
Results
The DQMP method was evaluated on widely used general functions that can express a variety of experimental data and was further validated on imaging applications. In the convergence evaluation of the proposed DRN, the loss values for all six general functions converged after 12 epochs. The parameters estimated by the DQMP method had relative errors of less than 4% in all cases, whereas the relative errors achieved by Q-learning (QL) and the Least Squares Method (LSM) were 17% and 21%, respectively. Furthermore, in the imaging experiments, the images formed from the parameters estimated by the proposed DQMP method were the closest to the ground-truth simulation images among the methods compared.
Conclusions
The proposed DQMP method was able to reach global optima and thus yielded accurate model parameter estimates. DQMP is promising for estimating multiple high-dimensional parameters and can be generalized to the global optimization of many other complex nonconvex functions and to the imaging of physical parameters.