2017
DOI: 10.1016/j.neucom.2016.08.108
Path planning of multi-agent systems in unknown environment with neural kernel smoothing and reinforcement learning

Cited by 57 publications (10 citation statements)
References 29 publications
“…Table 1 summarizes the DRL multi-robot path planning methods together with the advantages and limitations of each. From the information in Table 1, it can be concluded that shared-parameter algorithms such as MADDPG and ME-MADDPG suit dynamic and complex environments [1][2][3][4]; decentralized architectures such as DQN and DDQN can be considered in stable environments [5][6][7]; and large robotic systems facing many dynamic obstacles can use algorithms such as A2C, A3C and TDueling [8][9][10][11]. Validity, however, has been demonstrated on only a few small teams of agents.…”
Section: DRL Multi-Robot Path Planning Methods
confidence: 99%
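
To make the architectural contrast in that statement concrete, below is a minimal PyTorch sketch, written for this page rather than taken from the cited survey, of the two patterns it names: a single shared Q-network serving every agent versus an independent Q-network per agent. The observation and action dimensions, network shape, and agent count are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Illustrative contrast (assumed dimensions, not from the cited survey):
# parameter sharing vs. a fully decentralized per-agent design.
OBS_DIM, N_ACTIONS, N_AGENTS = 8, 4, 3

def make_q_net():
    return nn.Sequential(
        nn.Linear(OBS_DIM, 64), nn.ReLU(),
        nn.Linear(64, N_ACTIONS),
    )

obs = torch.randn(N_AGENTS, OBS_DIM)   # one observation row per agent

# Shared-parameter pattern: all agents act through one set of weights.
shared_q = make_q_net()
shared_actions = shared_q(obs).argmax(dim=1)

# Decentralized pattern (independent DQN-style): one network per agent.
agent_qs = [make_q_net() for _ in range(N_AGENTS)]
indep_actions = [q(o).argmax().item() for q, o in zip(agent_qs, obs)]
```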
“…Cruz and Yu [35] proposed a method that combines kernel smoothing with the WoLF-Policy Hill Climbing reinforcement-learning algorithm to overcome the difficulty traditional reinforcement learning faces when planning paths in an unfamiliar environment. Without prior knowledge, kernel smoothing approximates the MARL state over a discrete action space, thereby shrinking the state space of the Q-table.…”
Section: Path Planning Approach Based on RL
confidence: 99%
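
The kernel-smoothing idea in that statement can be sketched as a Nadaraya-Watson estimate of Q-values over previously visited states, so that nearby continuous states share value information instead of each needing its own Q-table row. The class, bandwidth, and update rule below are assumptions for illustration, not Cruz and Yu's actual implementation.

```python
import numpy as np

def gaussian_kernel(u):
    """Gaussian weight for a bandwidth-scaled distance u."""
    return np.exp(-0.5 * u ** 2)

class KernelSmoothedQ:
    """Hypothetical kernel-smoothed Q estimator over continuous states."""

    def __init__(self, n_actions, bandwidth=0.5):
        self.n_actions = n_actions
        self.h = bandwidth
        self.states = []     # visited state vectors
        self.q_values = []   # length-n_actions Q estimates per state

    def estimate(self, state):
        """Nadaraya-Watson estimate of Q(state, .) from stored samples."""
        if not self.states:
            return np.zeros(self.n_actions)
        S = np.asarray(self.states)
        Q = np.asarray(self.q_values)
        w = gaussian_kernel(np.linalg.norm(S - state, axis=1) / self.h)
        if w.sum() < 1e-12:  # query state far from every stored sample
            return np.zeros(self.n_actions)
        return (w[:, None] * Q).sum(axis=0) / w.sum()

    def update(self, state, action, target, lr=0.1):
        """TD-style update: nudge the smoothed estimate toward `target`."""
        q = self.estimate(state)
        q[action] += lr * (target - q[action])
        self.states.append(np.asarray(state, dtype=float))
        self.q_values.append(q)
```

Because the estimate is a weighted average over stored samples, states that were never visited still receive sensible Q-values from their neighbors, which is the state-space reduction the citing paper describes.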
“…The idea of using Q-learning with obstacle awareness to generate the shortest path from source to destination in a grid-divided sub-region was proposed by Aleksandr et al. [18], Amit et al. [19] and Soong et al. [20]. David et al. [21] extended this strategy to multiple robot agents. Yuan et al. [22] used a gated recurrent unit (GRU) recurrent neural network to plan an optimal path from source to destination directly.…”
Section: Introduction
confidence: 99%
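
As a companion to that statement, here is a minimal grid-world Q-learning sketch with obstacle-aware rewards, in the spirit of the cited works; the grid layout, reward values, and hyperparameters are assumptions, not taken from any of the referenced papers.

```python
import numpy as np

GRID = np.array([
    [0, 0, 0, 0],
    [0, 1, 1, 0],          # 1 = obstacle cell
    [0, 0, 0, 0],
])
START, GOAL = (0, 0), (2, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

def step(state, a):
    """Obstacle-aware transition: walls and obstacles are penalized."""
    r, c = state[0] + ACTIONS[a][0], state[1] + ACTIONS[a][1]
    if not (0 <= r < GRID.shape[0] and 0 <= c < GRID.shape[1]) or GRID[r, c]:
        return state, -5.0, False   # blocked: stay put, pay a penalty
    if (r, c) == GOAL:
        return (r, c), 10.0, True
    return (r, c), -1.0, False      # per-step cost favors shortest paths

Q = np.zeros(GRID.shape + (len(ACTIONS),))
alpha, gamma, eps = 0.1, 0.95, 0.2
rng = np.random.default_rng(0)

for episode in range(500):
    s, done = START, False
    while not done:
        a = int(rng.integers(4)) if rng.random() < eps else int(np.argmax(Q[s]))
        s2, reward, done = step(s, a)
        Q[s][a] += alpha * (reward + gamma * np.max(Q[s2]) * (not done) - Q[s][a])
        s = s2
```

After training, greedily following argmax over Q from START traces the learned shortest obstacle-free route to GOAL.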