2021
DOI: 10.48550/arxiv.2109.07735
Preprint

Decentralized Control of Quadrotor Swarms with End-to-end Deep Reinforcement Learning

Abstract: We demonstrate the possibility of learning drone swarm controllers that are zero-shot transferable to real quadrotors via large-scale multi-agent end-to-end reinforcement learning. We train policies parameterized by neural networks that are capable of controlling individual drones in a swarm in a fully decentralized manner. Our policies, trained in simulated environments with realistic quadrotor physics, demonstrate advanced flocking behaviors, perform aggressive maneuvers in tight formations while avoiding co…
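The abstract describes fully decentralized policies in which every drone runs the same neural network on its own local observations. A minimal sketch of what such a per-drone policy could look like is given below; the observation layout, network sizes, neighbor pooling, and 4-motor thrust action space are illustrative assumptions for this sketch, not the authors' reported architecture.

```python
# Minimal sketch (assumed architecture, not the paper's exact one): a shared,
# fully decentralized policy where each drone maps its own state plus observed
# neighbor states to normalized motor thrust commands.
import torch
import torch.nn as nn

class DecentralizedDronePolicy(nn.Module):
    def __init__(self, self_obs_dim=18, neighbor_obs_dim=6, hidden=64):
        super().__init__()
        # Encode each observed neighbor (e.g. relative position/velocity),
        # then mean-pool so the policy is invariant to neighbor ordering.
        self.neighbor_encoder = nn.Sequential(
            nn.Linear(neighbor_obs_dim, hidden), nn.Tanh())
        self.policy = nn.Sequential(
            nn.Linear(self_obs_dim + hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, 4))  # 4 normalized motor thrusts (assumption)

    def forward(self, self_obs, neighbor_obs):
        # self_obs: (batch, self_obs_dim); neighbor_obs: (batch, k, neighbor_obs_dim)
        pooled = self.neighbor_encoder(neighbor_obs).mean(dim=1)
        return torch.tanh(self.policy(torch.cat([self_obs, pooled], dim=-1)))

# Decentralized execution: every drone evaluates the same weights on its own
# local observation; training would happen in simulation with an RL algorithm.
policy = DecentralizedDronePolicy()
actions = policy(torch.randn(8, 18), torch.randn(8, 6, 6))  # 8 drones, 6 neighbors each
print(actions.shape)  # torch.Size([8, 4])
```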

Cited by 1 publication (2 citation statements)
References: 19 publications
“…A neural network is exploited to approximate the value functions with which a Nash equilibrium on the policies of agents can be found for the consensus of MAS with nonlinear dynamics without model knowledge [31]. Eight quadrotors built on an open-source platform and a very small deep learning network were experimentally shown to achieve movements with maintaining formation [32].…”
Section: Introduction (mentioning)
Confidence: 99%
“…Even though RL is an intriguing approach to achieve consensus without model knowledge, it has drawbacks such as a dependency on initial value [32] and convergence with unmeaningful policies [33]. In addition, extending RL from a single agent to multi-agents lays down some challenges such as the heterogeneity of agents, definition of a global goal, knowledge sharing, and the scalability of the number of agents [33].…”
Section: Introduction (mentioning)
Confidence: 99%