2019
DOI: 10.1007/s10458-019-09421-1
A survey and critique of multiagent deep reinforcement learning

Abstract: Deep reinforcement learning (RL) has achieved outstanding results in recent years. This has led to a dramatic increase in the number of applications and methods. Recent works have explored learning beyond single-agent scenarios and have considered multiagent learning (MAL) scenarios. Initial results report successes in complex multiagent domains, although there are several challenges to be addressed. The primary goal of this article is to provide a clear overview of current multiagent deep reinforcement learni…

Cited by 437 publications (270 citation statements)
References 198 publications (397 reference statements)
“…However, on the higher levels, where the vehicle is placed in complex situations, like racing, passing intersections, merging, or driving in traffic, the other participants' reactions strongly affect the available choices and possible outcomes. This leads to the area of Multiagent Systems (MAS) [24], which if handled with RL techniques are called Multiagent (Deep) Reinforcement Learning (MARL or MDRL in different sources) [25]. One modeling approach to MARL is the generalization of the original POMDP, by extending it with multiple actions and observation sets for each agent, or even various rewards in case different agents have different goals.…”
Section: Multiagent Reinforcement Learningmentioning
confidence: 99%
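The excerpt above describes generalizing the single-agent POMDP by giving each agent its own action set, observation set, and (when goals differ) its own reward. A minimal, illustrative sketch of that tuple as a data structure follows; all names and the driving scenario are hypothetical, not from the cited papers.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Illustrative multiagent POMDP container: per-agent action sets A_i,
# per-agent observation sets O_i, and optional per-agent rewards for
# the case where agents pursue different goals.
@dataclass
class MultiagentPOMDP:
    agents: List[str]
    states: List[str]
    actions: Dict[str, List[str]]       # agent -> its action set A_i
    observations: Dict[str, List[str]]  # agent -> its observation set O_i
    rewards: Dict[str, float] = field(default_factory=dict)  # optional per-agent rewards

# A toy two-vehicle merging scenario (hypothetical example).
env = MultiagentPOMDP(
    agents=["car_1", "car_2"],
    states=["merging", "cruising"],
    actions={"car_1": ["accelerate", "yield"], "car_2": ["accelerate", "yield"]},
    observations={"car_1": ["gap_ahead"], "car_2": ["gap_behind"]},
)
```

With identical goals the `rewards` field can hold one shared reward; populating it per agent models the mixed-objective case mentioned in the excerpt.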
“…However, on the higher levels, where the vehicle is placed in complex situations, like racing, passing intersections, merging, or driving in traffic, the other participants' reactions strongly affect the available choices and possible outcomes. This leads to the area of Multiagent Systems (MAS) [24], which if handled with RL techniques are called Multiagent (Deep) Reinforcement Learning (MARL or MDRL in different sources) [25]. One modeling approach to MARL is the generalization of the original POMDP, by extending it with multiple actions and observation sets for each agent, or even various rewards in case different agents have different goals.…”
Section: Multiagent Reinforcement Learningmentioning
confidence: 99%
“…As a function approximator, DNN can be applied to address the above limitations by approximating the state-action function with the parameters of a neural network (NN). Combining the DNN and the RL algorithm has two advantages: ① the strong feature extraction ability of DNN avoids the manual feature design process, and control decisions can be derived directly from raw inputs through an end-to-end learning procedure; ② DNN helps RL generalize to problems with a large state space [24]. Despite these benefits, there are also some challenges, i.e., the training data of DNN are typically assumed to be independent and identically distributed [25].…”
Section: B Drlmentioning
confidence: 99%
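The excerpt describes a network that maps a raw state vector directly to action values, replacing hand-designed features. A minimal NumPy-only sketch of such a Q-function approximator (a two-layer network; all dimensions and names are illustrative, not from the cited work):

```python
import numpy as np

# Tiny two-layer network approximating Q(s, ·) end-to-end from a raw
# state vector: no manual feature engineering, one Q-value per action.
rng = np.random.default_rng(0)
STATE_DIM, HIDDEN, N_ACTIONS = 4, 16, 2
W1 = rng.normal(0.0, 0.1, (STATE_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.1, (HIDDEN, N_ACTIONS))

def q_values(state):
    h = np.maximum(0.0, state @ W1)  # ReLU hidden layer
    return h @ W2                    # vector of Q-values, one per action

# Greedy control decision derived directly from the raw input.
s = rng.normal(size=STATE_DIM)
greedy_action = int(np.argmax(q_values(s)))
```

In practice the weights would be trained by gradient descent on a temporal-difference loss; the sketch only shows the forward pass that turns raw state into a decision.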
“…Besides, a slight update of the Q parameters may cause a large oscillation in the strategy, which in turn changes the distribution of training samples. Experience replay and the target network mechanism are developed to solve these issues [31]. In particular, a replay buffer stores the state transition samples (s, a, r, s′) generated at each episode, which can be randomly sampled for learning.…”
Section: Volume 8 2020mentioning
confidence: 99%
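The two stabilization mechanisms in the excerpt above can be sketched in a few lines: a replay buffer that stores (s, a, r, s′) transitions and is sampled at random (breaking temporal correlation in the training data), plus a target parameter set that is only synchronized periodically. This is an illustrative skeleton with stand-in updates, not the cited paper's implementation.

```python
import random
from collections import deque

# Replay buffer: a bounded FIFO of (s, a, r, s_next) transitions.
buffer = deque(maxlen=10_000)

def store(s, a, r, s_next):
    buffer.append((s, a, r, s_next))

def sample_batch(batch_size):
    # Uniform random sampling decorrelates consecutive transitions.
    return random.sample(buffer, batch_size)

# Target network mechanism: a frozen copy of the online parameters,
# refreshed only every TARGET_SYNC steps to keep learning targets stable.
online_params = {"w": 0.0}
target_params = dict(online_params)
TARGET_SYNC = 50

for step in range(100):
    store(step, 0, 1.0, step + 1)      # toy transition
    online_params["w"] += 0.01         # stand-in for a gradient update
    if step % TARGET_SYNC == 0:
        target_params = dict(online_params)  # periodic target sync

batch = sample_batch(32)
```

Between syncs the target parameters lag behind the online ones, which is exactly what damps the oscillation described in the excerpt.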