2017 IEEE International Symposium on Robotics and Intelligent Sensors (IRIS) 2017
DOI: 10.1109/iris.2017.8250107

Multi-agent reinforcement learning approach based on reduced value function approximations

Abstract: This paper introduces a novel online adaptive Reinforcement Learning approach based on Policy Iteration for multi-agent systems interacting on graphs. The approach uses reduced value functions to solve the coupled Bellman and Hamilton-Jacobi-Bellman equations for multi-agent systems. This is done using only partial knowledge of the agents' dynamics. The convergence of the approach is shown to depend on the properties of the communication graph. The Policy Iteration approach is implemented in real-time using neur…
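As a rough orientation for readers unfamiliar with the method family, the generic evaluate/improve loop that Policy Iteration alternates can be sketched in tabular form. This is a minimal illustration on a made-up finite MDP, not the paper's graph-coupled, partially-model-free algorithm; all numbers and names below are our own assumptions.

```python
import numpy as np

# Hypothetical tiny MDP: 3 states, 2 actions (values chosen arbitrarily
# for illustration; they do not come from the cited paper).
n_states, n_actions, gamma = 3, 2, 0.9
P = np.zeros((n_actions, n_states, n_states))   # P[a, s, s'] transition probs
P[0] = [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]]
P[1] = [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.1, 0.0, 0.9]]
R = np.array([[0.0, 1.0], [0.5, 0.0], [1.0, 2.0]])  # R[s, a] rewards

policy = np.zeros(n_states, dtype=int)
while True:
    # Policy evaluation: solve (I - gamma * P_pi) V = R_pi exactly.
    P_pi = P[policy, np.arange(n_states)]
    R_pi = R[np.arange(n_states), policy]
    V = np.linalg.solve(np.eye(n_states) - gamma * P_pi, R_pi)
    # Policy improvement: act greedily w.r.t. the one-step lookahead.
    Q = R.T + gamma * P @ V        # Q[a, s]
    new_policy = Q.argmax(axis=0)
    if np.array_equal(new_policy, policy):
        break                      # policy is greedy w.r.t. its own value
    policy = new_policy
```

The paper's contribution lies in making this loop work online for coupled agents on a communication graph with reduced value functions; the sketch only shows the underlying iteration scheme.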

Cited by 16 publications (19 citation statements). References 18 publications.
“…Typical optimal control methods tend to solve the underlying Hamilton-Jacobi-Bellman (HJB) equation of the dynamical system by applying the optimality principles [22,23]. An optimal control problem is usually formulated as an optimization problem with a cost function that identifies the optimization objectives and a mathematical process to find the respective optimal strategies [6,7,18,[22][23][24][25][26][27][28]. To implement the optimal control solutions stemming from the ADP approaches, numerous solving frameworks are considered based on combinations of Reinforcement Learning (RL) and adaptive critics [1,5,18,25,27].…”
Section: Introduction
confidence: 99%
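For reference, the stationary Hamilton-Jacobi-Bellman (HJB) equation this statement refers to, written in its standard continuous-time form (the symbols follow the usual optimal-control conventions, not this paper's notation): for dynamics $\dot{x} = f(x,u)$ and running cost $r(x,u)$,

```latex
0 = \min_{u}\Big[\, r(x,u) + \nabla V^{*}(x)^{\top} f(x,u) \,\Big],
```

where $V^{*}$ is the optimal value function and the minimizing $u$ gives the optimal feedback policy.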
“…The sequence of these coupled steps can be implemented with either value or policy iteration method [18]. RL has also been proposed to solve problems with multi-agent structures and objectives [29] as well as cooperative control problems using dynamic graphical games [21,26,30]. Action Dependent Dual Heuristic Dynamic Programming (ADDHP) depends on the system's dynamic model [7,26,28].…”
Section: Introduction
confidence: 99%
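The statement above contrasts value and policy iteration as alternative implementations of the coupled evaluate/improve steps. For completeness, the value-iteration variant applies the Bellman optimality backup directly, without solving for a fixed policy's value in between. A minimal tabular sketch on an invented two-state MDP (all constants are our own, not from the cited papers):

```python
import numpy as np

# Hypothetical 2-state, 2-action MDP for illustration only.
gamma = 0.9
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.0, 1.0]]])      # P[a, s, s'] transition probs
R = np.array([[0.0, 1.0], [2.0, 0.0]])        # R[s, a] rewards

V = np.zeros(2)
for _ in range(500):
    # Bellman optimality backup: V(s) <- max_a [R(s,a) + gamma * E[V(s')]]
    V_new = (R.T + gamma * P @ V).max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-10:
        break
    V = V_new
```

Policy iteration typically converges in fewer (but more expensive) iterations, since each evaluation step solves a linear system exactly; value iteration trades that for cheap repeated backups.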
“…The optimal control problem finds the necessary optimality conditions and hence the optimal strategies [15]. Reinforcement Learning is used to solve the synchronization control problem online in [16]-[18].…”
Section: Introduction
confidence: 99%