This study investigates the online adaptive optimal control problem for a class of continuous-time Markov jump linear systems (MJLSs) based on a novel policy iteration algorithm. Using a new decoupling technique named subsystems transformation, the authors reconstruct the MJLSs and obtain a set of new coupled systems composed of N subsystems. The online policy iteration algorithm is used to solve the coupled algebraic Riccati equations with only partial knowledge of the system dynamics, and the corresponding optimal controllers for the investigated MJLSs are designed. Moreover, the convergence of the novel policy iteration algorithm is established. Finally, a simulation example is given to illustrate the effectiveness and applicability of the proposed approach.
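For orientation, the model-based form of this policy iteration alternates between solving a set of coupled Lyapunov equations (policy evaluation) and updating the mode-dependent gains (policy improvement). The sketch below illustrates that iteration under assumed per-mode matrices A, B, Q, R, a transition-rate matrix PI_rates, and initial stabilising gains K0; the online algorithm in the paper replaces the model-based evaluation step with measured data, so this is only a schematic reference implementation, not the paper's method.

```python
# Minimal offline sketch of policy iteration for the coupled AREs of an MJLS.
# All names (A, B, Q, R, PI_rates, K0) and the inner fixed-point loop used to
# handle the mode-coupling term are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def mjls_policy_iteration(A, B, Q, R, PI_rates, K0, n_outer=30, n_inner=50):
    """A, B, Q, R: lists of per-mode matrices; PI_rates: transition-rate matrix;
    K0: list of initial stabilising gains (assumed available)."""
    N = len(A)
    K = [k.copy() for k in K0]
    P = [np.zeros_like(Q[i]) for i in range(N)]
    for _ in range(n_outer):
        # Policy evaluation: coupled Lyapunov equations, solved by freezing the
        # cross-mode coupling term and iterating to a fixed point.
        for _ in range(n_inner):
            P_new = []
            for i in range(N):
                Ac = A[i] - B[i] @ K[i] + 0.5 * PI_rates[i, i] * np.eye(A[i].shape[0])
                W = Q[i] + K[i].T @ R[i] @ K[i] \
                    + sum(PI_rates[i, j] * P[j] for j in range(N) if j != i)
                # Solve Ac^T P_i + P_i Ac + W = 0 for the i-th mode.
                P_new.append(solve_continuous_lyapunov(Ac.T, -W))
            P = P_new
        # Policy improvement: K_i = R_i^{-1} B_i^T P_i
        K = [np.linalg.solve(R[i], B[i].T @ P[i]) for i in range(N)]
    return P, K
```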
Introduction

Over the past few decades, reinforcement learning (RL) and approximate/adaptive dynamic programming (ADP) have been widely applied to solving optimal control problems for linear/non-linear systems with unknown or uncertain parametric models [1]. RL refers to finding an admissible control policy, that is, learning the parameters of a controller mapping the system states to the control signal, so as to maximise a numerical reward signal [2]. It is noted that the integral over time of the reward signal can be viewed as the value/cost function to be maximised/minimised in an optimal control framework. From the viewpoint of control engineering, RL algorithms can be viewed as a class of adaptive controllers that solve the optimal control problem based on reward information describing the performance of a given controller [3,4]. One class of RL algorithms is policy iteration (PI), which was first introduced in the computational intelligence community in the framework of stochastic decision theory [5]. PI algorithms iterate between two steps: 'policy evaluation', in which the cost (i.e. a predefined 'value function') associated with an admissible control policy is evaluated, and 'policy improvement', in which the policy is updated so that it has a lower associated cost. The two steps are repeated until the policy improvement step no longer changes the current policy, at which point convergence to the optimal controller is achieved [6,7].

Recently, a new RL method, namely integral reinforcement learning (IRL), was proposed to learn the solution to the optimal control problem online without requiring knowledge of the system drift dynamics. In [8], the authors employed this approach to solve the optimal control problem for continuous-time linear time-invariant systems using partial knowledge of the system dynamics. The approach was then extended to similar systems with completely unknown system dynamics in [9]. For an overview of contributions for linear systems, see [10,11]. The extension to the non-linear setting was given in [12-16]. In [16], the authors studied an online algorithm based on PI for learning the continuous-time optimal control solution...
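The IRL policy-evaluation step mentioned above can be viewed as a least-squares identification of the quadratic value function from trajectory data, so the drift matrix A never appears; only the input matrix B is needed for the improvement step. The sketch below illustrates this idea; the data arrays (x_t, x_tT, cost_int) and helper names are assumptions for illustration, not code from the cited works.

```python
# Minimal sketch of the IRL policy-evaluation step for a linear system
# x_dot = A x + B u: the value x'Px is identified from data, so A is unused.
import numpy as np

def sym_basis(x):
    """Quadratic basis: independent entries of x x^T (for symmetric P)."""
    n = len(x)
    return np.array([x[i] * x[j] * (1.0 if i == j else 2.0)
                     for i in range(n) for j in range(i, n)])

def irl_policy_evaluation(x_t, x_tT, cost_int):
    """x_t, x_tT: states at the start/end of each sampling interval (rows);
    cost_int: integral of x'(Q + K'RK)x over each interval.
    Solves  x_t'P x_t - x_tT'P x_tT = cost_int  for P in least squares."""
    Phi = np.array([sym_basis(a) - sym_basis(b) for a, b in zip(x_t, x_tT)])
    p_vec, *_ = np.linalg.lstsq(Phi, cost_int, rcond=None)
    # Rebuild the symmetric matrix P from its upper-triangular parametrisation.
    n = x_t.shape[1]
    P = np.zeros((n, n))
    idx = 0
    for i in range(n):
        for j in range(i, n):
            P[i, j] = P[j, i] = p_vec[idx]
            idx += 1
    return P

# Policy improvement then uses B but not A:  K_next = inv(R) @ B.T @ P
```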
Abstract: In this paper, the online adaptive optimal control problem for a class of continuous-time Markov jump linear systems (MJLSs) is investigated using a parallel reinforcement learning (RL) algorithm with completely unknown dynamics. Before the state and input information of the subsystems is collected and learned, exploration noise is first added to form the actual control input. Then, a novel parallel RL algorithm is used to compute the corresponding N coupled algebraic Riccati equations (AREs) in parallel by online learning. With this algorithm, no knowledge of the dynamics of the MJLSs is needed. The convergence of the proposed algorithm is also proved. Finally, the effectiveness and applicability of this novel algorithm are illustrated by two simulation examples.

Index Terms: Markov jump linear systems (MJLSs); adaptive optimal control; online; reinforcement learning (RL); coupled algebraic Riccati equations (AREs).
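As a small illustration of the data-collection phase described in this abstract, the exploration noise is typically a persistently exciting probing signal added to the feedback control during learning. The sum-of-sinusoids form, frequencies, and amplitude below are illustrative assumptions rather than the paper's specific choice.

```python
# Sketch of an exploration/probing signal e(t) used during data collection,
# applied as u = -K x + e(t).  Frequencies and amplitude are assumed values.
import numpy as np

rng = np.random.default_rng(0)
freqs = rng.uniform(0.1, 10.0, size=20)       # rad/s, assumed excitation band
phases = rng.uniform(0.0, 2 * np.pi, size=20)

def exploration_noise(t, amplitude=0.5):
    """Persistently exciting sum of sinusoids evaluated at time t."""
    return amplitude * np.sum(np.sin(freqs * t + phases)) / len(freqs)
```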
This article is devoted to the problem of attitude control for rigid spacecraft subject to multiple disturbances. Due to limited communication and storage resources, an event-triggered anti-disturbance attitude control approach is proposed for the attitude control system (ACS) with multiple disturbances. Different from some existing event-triggered attitude control methods, the multiple disturbances are considered and divided into an uncertain modeled disturbance and a norm-bounded equivalent disturbance. A disturbance observer is designed to estimate the modeled disturbance. Based on the framework of disturbance observer-based control, the proposed event-triggered anti-disturbance controller guarantees that the ACS converges to a small invariant set while avoiding continuous communication and the Zeno phenomenon. Finally, simulation results are given to demonstrate the effectiveness of the proposed approach.
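For context, an event-triggered controller of the kind described above only updates or transmits the control when a measurement-error condition is violated; a constant offset in the threshold is one common way to exclude Zeno behaviour. The rule and parameter values below are a generic illustrative sketch, not the paper's specific triggering condition.

```python
# Generic event-triggering test: update the control only when the error between
# the current state and the last transmitted state exceeds a state-dependent
# threshold plus a constant offset (sigma and eps are assumed design parameters).
import numpy as np

def should_trigger(x_now, x_last_sent, sigma=0.1, eps=1e-3):
    """Return True when a new control update/transmission is required."""
    err = np.linalg.norm(x_now - x_last_sent)
    return err >= sigma * np.linalg.norm(x_now) + eps
```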