2019
DOI: 10.1007/s10458-019-09421-1
A survey and critique of multiagent deep reinforcement learning

Abstract: Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) scenarios. Initial results report successes in complex multiagent domains, although there are several challenges to be addressed. The primary goal of this article is to provide a clear overview of current multiagent deep reinforcement learni…

Cited by 437 publications (270 citation statements)
References 198 publications (397 reference statements)
“…However, on the higher levels, where the vehicle is placed in complex situations, like racing, passing intersections, merging, or driving in traffic, the other participants' reactions strongly affect the available choices and possible outcomes. This leads to the area of Multiagent Systems (MAS) [24], which if handled with RL techniques are called Multiagent (Deep) Reinforcement Learning (MARL or MDRL in different sources) [25]. One modeling approach to MARL is the generalization of the original POMDP, by extending it with multiple actions and observation sets for each agent, or even various rewards in case different agents have different goals.…”
Section: Multiagent Reinforcement Learningmentioning
confidence: 99%
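The excerpt above describes generalizing the single-agent POMDP by giving each agent its own action set, observation set, and (when goals differ) its own reward. A minimal, illustrative sketch of that tuple as a data structure follows; all names and the driving scenario are hypothetical, not from the cited papers.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative multiagent POMDP container: per-agent action sets A_i,
# per-agent observation sets O_i, and optional per-agent rewards for
# the case where agents pursue different goals.
@dataclass
class MultiagentPOMDP:
    agents: List[str]
    states: List[str]
    actions: Dict[str, List[str]]       # agent -> its action set A_i
    observations: Dict[str, List[str]]  # agent -> its observation set O_i
    rewards: Dict[str, float] = field(default_factory=dict)  # optional per-agent rewards

# A toy two-vehicle merging scenario (hypothetical example).
env = MultiagentPOMDP(
    agents=["car_1", "car_2"],
    states=["merging", "cruising"],
    actions={"car_1": ["accelerate", "yield"], "car_2": ["accelerate", "yield"]},
    observations={"car_1": ["gap_ahead"], "car_2": ["gap_behind"]},
)
```

With identical goals the `rewards` field can hold one shared reward; populating it per agent models the mixed-objective case mentioned in the excerpt.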
“…However, on the higher levels, where the vehicle is placed in complex situations, like racing, passing intersections, merging, or driving in traffic, the other participants' reactions strongly affect the available choices and possible outcomes. This leads to the area of Multiagent Systems (MAS) [24], which if handled with RL techniques are called Multiagent (Deep) Reinforcement Learning (MARL or MDRL in different sources) [25]. One modeling approach to MARL is the generalization of the original POMDP, by extending it with multiple actions and observation sets for each agent, or even various rewards in case different agents have different goals.…”
Section: Multiagent Reinforcement Learningmentioning
confidence: 99%
“…As a function approximator, DNN can be applied to address the above limitations by approximating the state-action function with the parameters of a neural network (NN). Combining the DNN and the RL algorithm has two advantages: ① the strong feature extraction ability of DNN avoids the manual feature design process, and control decisions can be derived directly from raw inputs through an end-to-end learning procedure; ② DNN helps RL generalize to problems with a large state space [24]. Despite these benefits, there are also some challenges, i.e., the training data of DNN are typically assumed to be independent and identically distributed [25].…”
Section: B Drlmentioning
confidence: 99%
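The excerpt describes a network that maps a raw state vector directly to action values, replacing hand-designed features. A minimal NumPy-only sketch of such a Q-function approximator (a two-layer network; all dimensions and names are illustrative, not from the cited work):

```python
import numpy as np

# Tiny two-layer network approximating Q(s, ·) end-to-end from a raw
# state vector: no manual feature engineering, one Q-value per action.
rng = np.random.default_rng(0)
STATE_DIM, HIDDEN, N_ACTIONS = 4, 16, 2
W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))

def q_values(state):
    h = np.maximum(0.0, state @ W1)  # ReLU hidden layer
    return h @ W2                    # vector of Q-values, one per action

# Greedy control decision derived directly from the raw input.
s = rng.normal(size=STATE_DIM)
greedy_action = int(np.argmax(q_values(s)))
```

In practice the weights would be trained by gradient descent on a temporal-difference loss; the sketch only shows the forward pass that turns raw state into a decision.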
“…Besides, a slight update of the Q parameters may cause a large oscillation in the strategy, which in turn changes the distribution of training samples. Experience replay and the target network mechanism are developed to solve these issues [31]. In particular, a replay buffer stores the state transition samples (s, a, r, s′) generated at each episode, which can be randomly sampled for learning.…”
Section: Volume 8 2020mentioning
confidence: 99%
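The two stabilization mechanisms in the excerpt above can be sketched in a few lines: a replay buffer that stores (s, a, r, s′) transitions and is sampled at random (breaking temporal correlation in the training data), plus a target parameter set that is only synchronized periodically. This is an illustrative skeleton with stand-in updates, not the cited paper's implementation.

```python
import random
from collections import deque

# Replay buffer: a bounded FIFO of (s, a, r, s_next) transitions.
buffer = deque(maxlen=10_000)

def store(s, a, r, s_next):
    buffer.append((s, a, r, s_next))

def sample_batch(batch_size):
    # Uniform random sampling decorrelates consecutive transitions.
    return random.sample(buffer, batch_size)

# Target network mechanism: a frozen copy of the online parameters,
# refreshed only every TARGET_SYNC steps to keep learning targets stable.
online_params = {"w": 0.0}
target_params = dict(online_params)
TARGET_SYNC = 50

for step in range(100):
    store(step, 0, 1.0, step + 1)      # toy transition
    online_params["w"] += 0.01         # stand-in for a gradient update
    if step % TARGET_SYNC == 0:
        target_params = dict(online_params)  # periodic target sync

batch = sample_batch(32)
```

Between syncs the target parameters lag behind the online ones, which is exactly what damps the oscillation described in the excerpt.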