In this paper, we describe a robust technique based on the quasi-Newton method (QN) that uses an adaptive momentum term to train neural networks. Microwave circuit models exhibit strong nonlinearities and therefore require a robust training algorithm for their neural network models. Robustness here means that practical solutions can be obtained regardless of the initial values. QN-based algorithms are commonly used for this purpose. Nesterov's accelerated quasi-Newton method (NAQ) accelerates QN by means of a fixed momentum coefficient. In this research, we verify the effectiveness of NAQ for microwave circuit modeling with high nonlinearities and propose a robust QN-based training algorithm with an adaptive momentum coefficient. The proposed algorithm is demonstrated on a function approximation problem and two microwave circuit modeling problems. Here, $d_p$, $o_p$, and $w \in \mathbb{R}^n$ are the $p$th desired vector, the $p$th output vector, and the weight vector, respectively, and $T_r$ denotes the training data set.
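The symbol definitions trailing this abstract evidently belong to the paper's training objective, which did not survive extraction. A plausible reconstruction is the standard sum-of-squared-errors loss; the normalization over $|T_r|$ is an assumption, not taken from the source:

```latex
% Standard neural-network training objective (hedged reconstruction;
% the 1/|T_r| normalization is an assumption, not from the source).
E(w) = \frac{1}{|T_r|} \sum_{p \in T_r} \left\lVert d_p - o_p \right\rVert^2
```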
Recent studies incorporate Nesterov's accelerated gradient method to speed up gradient-based training. The Nesterov's Accelerated Quasi-Newton (NAQ) method has been shown to drastically improve convergence speed compared with the conventional quasi-Newton method. This paper implements NAQ for non-convex optimization on TensorFlow. Two modifications to the original NAQ algorithm are proposed to ensure global convergence and to eliminate the line search. The performance of the proposed algorithm, mNAQ, is evaluated on standard non-convex function approximation benchmark problems and microwave circuit modeling problems. The results show that the improved algorithm converges faster and more reliably than first-order optimizers such as AdaGrad, RMSProp, and Adam, and than second-order methods such as the conventional quasi-Newton method.
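For orientation, the core NAQ iteration these abstracts refer to can be sketched as follows. This is a minimal NumPy illustration, assuming the update form reported in the NAQ literature; the fixed step size and the frozen inverse-Hessian estimate are simplifying assumptions, not mNAQ's actual implementation (which updates the curvature estimate each step and, per the abstract, dispenses with the line search):

```python
import numpy as np

def naq_step(w, v, H, grad_fn, mu=0.8, alpha=0.2):
    """One NAQ-style step: quasi-Newton direction at the momentum
    look-ahead point w + mu*v (Nesterov's trick).

    w       : current weight vector
    v       : previous update (momentum) vector
    H       : inverse-Hessian approximation (kept fixed here; a real
              implementation would apply a BFGS-type update)
    grad_fn : callable returning the gradient of the loss
    mu      : momentum coefficient (fixed here; mNAQ adapts it)
    alpha   : step size (stands in for the line search / step rule)
    """
    g = grad_fn(w + mu * v)          # gradient at the look-ahead point
    v_new = mu * v - alpha * (H @ g) # momentum plus scaled QN direction
    w_new = w + v_new
    return w_new, v_new

# Toy usage: minimize the quadratic f(w) = 0.5 * w^T A w - b^T w.
A = np.array([[3.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -2.0])
grad = lambda w: A @ w - b

w, v, H = np.zeros(2), np.zeros(2), np.eye(2)
for _ in range(50):
    w, v = naq_step(w, v, H, grad)
print(w, np.linalg.solve(A, b))  # iterate vs. exact minimizer
```

The only difference from a plain quasi-Newton step here is that the gradient is evaluated at the look-ahead point `w + mu*v` rather than at `w`.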
This paper describes a momentum acceleration technique for quasi-Newton (QN) based neural network training and evaluates its performance and computational complexity. Recently, Nesterov's accelerated quasi-Newton method (NAQ) was introduced, showing that a momentum term obtained by incorporating Nesterov's accelerated gradient into QN is effective in reducing both the number of iterations and the total training time. However, NAQ requires the gradient to be computed twice per iteration, which increases the cost of a training loop compared with the conventional QN. The proposed technique improves on NAQ by approximating Nesterov's accelerated gradient as a linear combination of the current and previous gradients, so that the gradient is computed only once per iteration, as in QN. The performance of the proposed algorithm is evaluated against conventional algorithms on two types of neural network training problems: function approximation problems with high nonlinearity and classification problems. The results show a significant reduction in computation time without degrading the quality of the solution compared with conventional training algorithms.
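The "linear combination of the current and previous gradients" has a natural first-order reading, sketched below; this reconstruction follows from a Taylor expansion plus a secant approximation of the Hessian-vector product, and the paper's exact form may differ in detail:

```latex
% Hedged reconstruction of the gradient approximation: a first-order
% Taylor expansion of the Nesterov gradient, with the Hessian action
% replaced by a secant (gradient-difference) estimate.
\nabla E(w_k + \mu v_k)
  \approx \nabla E(w_k) + \mu \, \nabla^2 E(w_k) \, v_k
  \approx (1+\mu)\,\nabla E(w_k) - \mu\,\nabla E(w_{k-1}),
\qquad v_k = w_k - w_{k-1}.
```

Since $\nabla E(w_{k-1})$ is already available from the previous iteration, only $\nabla E(w_k)$ must be computed, which is how the method recovers QN's one-gradient-per-iteration cost.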