2017 IEEE International Symposium on Robotics and Intelligent Sensors (IRIS) 2017
DOI: 10.1109/iris.2017.8250107

Multi-agent reinforcement learning approach based on reduced value function approximations

Abstract: This paper introduces a novel online adaptive Reinforcement Learning approach based on Policy Iteration for multi-agent systems interacting on graphs. The approach uses reduced value functions to solve the coupled Bellman and Hamilton-Jacobi-Bellman equations for multi-agent systems. This is done using only partial knowledge of the agents' dynamics. The convergence of the approach is shown to depend on the properties of the communication graph. The Policy Iteration approach is implemented in real-time using neur…
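As a rough orientation for readers unfamiliar with the method family, the generic evaluate/improve loop that Policy Iteration alternates can be sketched in tabular form. This is a minimal illustration on a made-up finite MDP, not the paper's graph-coupled, partially-model-free algorithm; all numbers and names below are our own assumptions.

```python
import numpy as np

# Hypothetical tiny MDP: 3 states, 2 actions (values chosen arbitrarily
# for illustration; they do not come from the cited paper).
n_states, n_actions, gamma = 3, 2, 0.9
P = np.zeros((n_actions, n_states, n_states))   # P[a, s, s'] transition probs
P[0] = [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
P[1] = [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.1, 0.0, 0.9]]
R = np.array([[0.0, 1.0], [0.5, 0.0], [1.0, 2.0]])  # R[s, a] rewards

policy = np.zeros(n_states, dtype=int)
while True:
    # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
    P_pi = P[policy, np.arange(n_states)]
    R_pi = R[np.arange(n_states), policy]
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
    # Policy improvement: act greedily w.r.t. the one-step lookahead.
    Q = R.T + gamma * P @ V        # Q[a, s]
    new_policy = Q.argmax(axis=0)
    if np.array_equal(new_policy, policy):
        break                      # policy is greedy w.r.t. its own value
    policy = new_policy
```

The paper's contribution lies in making this loop work online for coupled agents on a communication graph with reduced value functions; the sketch only shows the underlying iteration scheme.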

Cited by 16 publications (19 citation statements). References 18 publications.
“…Typical optimal control methods tend to solve the underlying Hamilton-Jacobi-Bellman (HJB) equation of the dynamical system by applying the optimality principles [22,23]. An optimal control problem is usually formulated as an optimization problem with a cost function that identifies the optimization objectives and a mathematical process to find the respective optimal strategies [6,7,18,[22][23][24][25][26][27][28]. To implement the optimal control solutions stemming from the ADP approaches, numerous solving frameworks are considered based on combinations of Reinforcement Learning (RL) and adaptive critics [1,5,18,25,27].…”
Section: Introduction
confidence: 99%
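For reference, the stationary Hamilton-Jacobi-Bellman (HJB) equation this statement refers to, written in its standard continuous-time form (the symbols follow the usual optimal-control conventions, not this paper's notation): for dynamics $\dot{x} = f(x,u)$ and running cost $r(x,u)$,

```latex
0 = \min_{u}\Big[\, r(x,u) + \nabla V^{*}(x)^{\top} f(x,u) \,\Big],
```

where $V^{*}$ is the optimal value function and the minimizing $u$ gives the optimal feedback policy.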
“…The sequence of these coupled steps can be implemented with either value or policy iteration method [18]. RL has also been proposed to solve problems with multi-agent structures and objectives [29] as well as cooperative control problems using dynamic graphical games [21,26,30]. Action Dependent Dual Heuristic Dynamic Programming (ADDHP) depends on the system's dynamic model [7,26,28].…”
Section: Introduction
confidence: 99%
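The statement above contrasts value and policy iteration as alternative implementations of the coupled evaluate/improve steps. For completeness, the value-iteration variant applies the Bellman optimality backup directly, without solving for a fixed policy's value in between. A minimal tabular sketch on an invented two-state MDP (all constants are our own, not from the cited papers):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP for illustration only.
gamma = 0.9
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.0, 1.0]]])      # P[a, s, s'] transition probs
R = np.array([[0.0, 1.0], [2.0, 0.0]])        # R[s, a] rewards

V = np.zeros(2)
for _ in range(500):
    # Bellman optimality backup: V(s) <- max_a [R(s,a) + gamma * E[V(s')]]
    V_new = (R.T + gamma * P @ V).max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
```

Policy iteration typically converges in fewer (but more expensive) iterations, since each evaluation step solves a linear system exactly; value iteration trades that for cheap repeated backups.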
“…The optimal control problem finds the necessary optimality conditions and hence the optimal strategies [15]. Reinforcement Learning is used to solve the synchronization control problem online in [16]-[18].…”
Section: Introduction
confidence: 99%