The design of stabilizing controllers for uncertain nonlinear systems with control constraints is a challenging problem. The input constraints, coupled with the inability to identify the uncertainties accurately, motivate the design of stabilizing controllers based on reinforcement-learning (RL) methods. In this paper, a novel RL-based robust adaptive control algorithm is developed for a class of continuous-time uncertain nonlinear systems subject to input constraints. The robust control problem is converted into a constrained optimal control problem by appropriately selecting the value function for the nominal system. Distinct from the typical action-critic dual networks employed in RL, only one critic neural network (NN) is constructed to derive the approximate optimal control. Meanwhile, unlike the initial stabilizing control that is often indispensable in RL, no special requirement is imposed on the initial control. By utilizing Lyapunov's direct method, the closed-loop optimal control system and the estimated weights of the critic NN are proved to be uniformly ultimately bounded. In addition, the derived approximate optimal control is verified to guarantee that the uncertain nonlinear system is stable in the sense of uniform ultimate boundedness. Two simulation examples are provided to illustrate the effectiveness and applicability of the present approach.
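The following is a minimal sketch of the kind of single-critic, input-constrained controller described above. The plant dynamics, basis functions, cost terms, and gains are illustrative assumptions rather than the authors' specific design; the critic weights approximate the value function, and the control is saturated through a tanh law so that the input bound is respected without a separate actor network or an initial stabilizing control.

```python
import numpy as np

# Sketch: single critic NN with a bounded (tanh-saturated) control law.
# All numerical choices below (plant, basis, gains, cost) are assumptions.

lam = 1.0          # control bound |u| <= lam (assumed)
alpha = 0.5        # critic learning rate (assumed)
dt = 0.01          # integration step

def f(x):          # assumed drift dynamics of a 2-D nominal system
    return np.array([-x[0] + x[1], -0.5 * (x[0] + x[1])])

def g(x):          # assumed input gain
    return np.array([0.0, 1.0])

def phi(x):        # polynomial critic basis: V(x) ~ W^T phi(x)
    return np.array([x[0]**2, x[0] * x[1], x[1]**2])

def dphi(x):       # Jacobian of the basis, shape (3, 2)
    return np.array([[2 * x[0], 0.0],
                     [x[1],     x[0]],
                     [0.0,      2 * x[1]]])

def control(x, W):
    # Saturated control derived from the critic: u = -lam * tanh(g^T dV / (2*lam))
    dV = dphi(x).T @ W
    return -lam * np.tanh(g(x) @ dV / (2.0 * lam))

W = np.zeros(3)                    # no initial stabilizing control is assumed
x = np.array([1.0, -1.0])
for k in range(5000):
    u = control(x, W)
    xdot = f(x) + g(x) * u
    # HJB residual for the current weights (simple quadratic cost assumed here)
    sigma = dphi(x) @ xdot
    e = W @ sigma + x @ x + u**2
    # Normalized gradient-descent critic update driven by the residual
    W -= alpha * dt * sigma * e / (1.0 + sigma @ sigma)**2
    x = x + dt * xdot              # Euler integration of the closed loop

print("final state:", x, "critic weights:", W)
```

In the papers this sketch is loosely based on, the quadratic input cost is typically replaced by a nonquadratic penalty so that the tanh-saturated law is exactly optimal; the simple cost above is kept only to keep the example short.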
In this study, a novel online adaptive dynamic programming (ADP)-based algorithm is developed for solving the optimal control problem of affine non-linear continuous-time systems with unknown internal dynamics. The algorithm employs an observer-critic architecture to approximately solve the Hamilton-Jacobi-Bellman equation. Two neural networks (NNs) are used in this architecture: an NN state observer is constructed to estimate the unknown system dynamics, and a critic NN is designed to derive the optimal control, instead of the typical action-critic dual networks employed in traditional ADP algorithms. Based on the developed architecture, the observer NN and the critic NN are tuned simultaneously. Meanwhile, unlike existing tuning laws for the critic, the newly developed critic update rule not only ensures convergence of the critic to the optimal control but also guarantees stability of the closed-loop system. No initial stabilising control is required, and by using recorded and instantaneous data simultaneously for the adaptation of the critic, the restrictive persistence-of-excitation condition is relaxed. In addition, the Lyapunov direct method is utilised to demonstrate the uniform ultimate boundedness of the weights of the observer NN and the critic NN. Finally, an example is provided to verify the effectiveness of the present approach.
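The sketch below illustrates the observer-critic idea in that abstract: an NN observer estimates the unknown drift from output-injection error, while the critic is updated with both the instantaneous HJB residual and residuals evaluated on recorded data, which is what relaxes the persistence-of-excitation requirement. The plant, observer gain, basis functions, recording schedule, and learning rates are illustrative assumptions, not the paper's exact tuning laws.

```python
import numpy as np

# Sketch: observer-critic loop with a concurrent-learning critic update.
# All numerical choices (plant, gains, bases, schedule) are assumptions.

dt, T = 0.01, 20.0
K_obs = 5.0                     # observer output-injection gain (assumed)
alpha_o, alpha_c = 2.0, 0.5     # observer / critic learning rates (assumed)

def f_true(x):                  # unknown internal dynamics (used only to simulate)
    return np.array([x[1], -x[0] - x[1]])

def g(x):                       # known input gain (assumed)
    return np.array([0.0, 1.0])

def sigma_o(x):                 # observer NN basis
    return np.tanh(x)

def phi_grad(x):                # gradient of critic basis [x1^2, x1*x2, x2^2]
    return np.array([[2 * x[0], 0.0], [x[1], x[0]], [0.0, 2 * x[1]]])

x  = np.array([1.0, 0.0])       # true state
xh = np.zeros(2)                # observer state estimate
Wo = np.zeros((2, 2))           # observer NN weights (2 basis terms -> 2 states)
Wc = np.zeros(3)                # critic weights
memory = []                     # recorded (regressor, cost) samples

for k in range(int(T / dt)):
    dV = phi_grad(xh).T @ Wc
    u  = -0.5 * g(xh) @ dV                       # control derived from the critic
    # --- observer: NN estimate of the unknown drift plus output injection
    xh_dot = Wo.T @ sigma_o(xh) + g(xh) * u + K_obs * (x - xh)
    Wo += dt * alpha_o * np.outer(sigma_o(xh), x - xh)
    # --- critic: HJB residual evaluated along the observer trajectory
    sig  = phi_grad(xh) @ xh_dot
    cost = xh @ xh + u**2
    if k % 200 == 0:
        memory.append((sig.copy(), cost))        # record data online
    dWc = -sig * (Wc @ sig + cost) / (1.0 + sig @ sig)**2
    for sj, cj in memory:                        # recorded-data terms relax PE
        dWc -= sj * (Wc @ sj + cj) / (1.0 + sj @ sj)**2
    Wc += dt * alpha_c * dWc
    # --- integrate the true plant and the observer
    x  += dt * (f_true(x) + g(x) * u)
    xh += dt * xh_dot

print("state:", x, "observer weights:", Wo.ravel(), "critic weights:", Wc)
```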