2020
DOI: 10.1109/access.2020.3016951
Multi-Robot Flocking Control Based on Deep Reinforcement Learning

Abstract: In this paper, we apply deep reinforcement learning (DRL) to solve the flocking control problem of multi-robot systems in complex environments with dynamic obstacles. Starting from the traditional flocking model, we propose a DRL framework for implementing multi-robot flocking control, eliminating the tedious work of modeling and control designing. We adopt the multi-agent deep deterministic policy gradient (MADDPG) [1] algorithm, which additionally uses the information of multiple robots in the learning proce…

Cited by 58 publications (29 citation statements)
References 21 publications
“…The flocking behavior presents interesting characteristics that make it of high interest for the design of artificial systems, particularly in problems of localization, search and rescue. This type of behavior has been observed in birds, is similar to schooling fish and swarming insects, and is characterized by a joint movement of the group without central coordination [12], [13]. The first basic rules of this dynamic were established in 1987 as alignment, cohesion, and separation [14].…”
Section: Introduction
confidence: 86%
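The three 1987 rules named in the statement above (alignment, cohesion, separation) can be sketched as a single per-step update. This is a minimal illustration, not the cited paper's model; the neighborhood radius, rule weights, and time step are arbitrary illustrative choices.

```python
import numpy as np

def flocking_step(pos, vel, r_neigh=2.0, w_sep=1.5, w_ali=1.0, w_coh=1.0, dt=0.1):
    """One update of the three classic flocking rules for N planar agents.

    pos, vel: (N, 2) arrays of positions and velocities.
    """
    n = len(pos)
    acc = np.zeros_like(vel)
    for i in range(n):
        diff = pos - pos[i]                   # vectors from agent i to all agents
        dist = np.linalg.norm(diff, axis=1)
        mask = (dist > 0) & (dist < r_neigh)  # neighbors within the radius
        if not mask.any():
            continue
        # Separation: steer away from close neighbors (stronger when closer)
        sep = -np.sum(diff[mask] / dist[mask, None] ** 2, axis=0)
        # Alignment: match the average neighbor velocity
        ali = vel[mask].mean(axis=0) - vel[i]
        # Cohesion: steer toward the neighbors' center of mass
        coh = pos[mask].mean(axis=0) - pos[i]
        acc[i] = w_sep * sep + w_ali * ali + w_coh * coh
    vel = vel + dt * acc
    pos = pos + dt * vel
    return pos, vel
```

Note that the update uses no central coordinator: each agent's acceleration depends only on its local neighborhood, which is the property the citing authors highlight.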
“…MADDPG has been used in many applications such as Wang et al [33] that proposed a data-driven multiagent power grid control scheme using MADDPG for the large-scale energy system with more control options and operating conditions. Zhu et al [34] applied MADDPG to solve the flocking control problem of multi-robot systems in complex environments with dynamic obstacles. Lei et al [35] introduced edge computing between terminals and the cloud using MADDPG to address the drawbacks of the traditional power cloud paradigm.…”
Section: Multiagent Reinforcement Learning
confidence: 99%
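The structural idea that all of the MADDPG applications above share — decentralized actors paired with a centralized critic that conditions on every agent's observation and action during training — can be sketched minimally as below. Layer sizes and dimensions are illustrative assumptions, not details from any of the cited works.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Decentralized policy: maps one agent's own observation to its action."""
    def __init__(self, obs_dim, act_dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                                 nn.Linear(64, act_dim), nn.Tanh())

    def forward(self, obs):
        return self.net(obs)

class CentralCritic(nn.Module):
    """Centralized critic: scores the joint observation-action of all agents."""
    def __init__(self, n_agents, obs_dim, act_dim):
        super().__init__()
        in_dim = n_agents * (obs_dim + act_dim)
        self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                                 nn.Linear(64, 1))

    def forward(self, all_obs, all_acts):
        # all_obs: (batch, n_agents, obs_dim); all_acts: (batch, n_agents, act_dim)
        x = torch.cat([all_obs.flatten(1), all_acts.flatten(1)], dim=1)
        return self.net(x)
```

The centralized critic is only needed at training time; at execution each agent acts from its own observation alone, which is what makes the scheme usable for decentralized systems such as robot flocks or power grids.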
“…In [6], although the authors realized their work based on LiDAR and an odometer, they only considered the ε-greedy policy of the DQN with different parameters to update the neural network. In [7], the method was implemented based on virtual robots. Namely, rather than using a simulated model, they directly assumed the effectiveness of the simulation properties of a virtual robot, e.g., the gyration radius and mass, the maximum speed and the maximum acceleration, and so on.…”
Section: Introduction
confidence: 99%
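For reference, the ε-greedy exploration policy mentioned in the statement above is the standard rule of picking a uniformly random action with probability ε and the greedy (highest-Q) action otherwise. A minimal sketch, independent of any particular DQN implementation:

```python
import random

def epsilon_greedy(q_values, epsilon):
    """Return a random action index with probability epsilon,
    otherwise the index of the highest Q-value (greedy action)."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Varying ε over training (high early for exploration, low late for exploitation) is the "different parameters" knob the cited work tuned.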