2021
DOI: 10.1109/lra.2021.3068952

Decentralized Multi-Agent Pursuit Using Deep Reinforcement Learning

Abstract: Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omni-directional target with multiple, homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers that is executed independently by each agent at run-time. The training benefits from curriculum learning, a sweeping-angle ordering to locally represent neighboring agents and encouraging good f…
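The two geometric ingredients named in the abstract can be illustrated with a minimal sketch. The abstract is truncated and does not define the sweeping-angle ordering, so the version here is an assumption: neighbors are sorted by their bearing in the agent's local frame, sweeping counter-clockwise from its heading. The function names `unicycle_step` and `order_neighbors_by_bearing` are hypothetical, and the motion update is the standard textbook unicycle model with a simple Euler step, not necessarily the paper's exact formulation.

```python
import math


def unicycle_step(x, y, theta, v, omega, dt):
    """One Euler step of the standard unicycle model.

    v is the forward speed, omega the turn rate. The heading is wrapped
    to [-pi, pi). This is a textbook model, assumed for illustration.
    """
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = (theta + omega * dt + math.pi) % (2 * math.pi) - math.pi
    return x_new, y_new, theta_new


def order_neighbors_by_bearing(pose, neighbors):
    """Sort neighbor positions by bearing in the agent's local frame.

    Assumption: the sweep starts at the agent's heading and runs
    counter-clockwise through [0, 2*pi).
    """
    x, y, theta = pose

    def bearing(p):
        # Angle to the neighbor relative to the agent's heading.
        angle = math.atan2(p[1] - y, p[0] - x) - theta
        return angle % (2 * math.pi)

    return sorted(neighbors, key=bearing)
```

A fixed ordering of this kind gives each agent a consistent input layout for its neighbors regardless of how they happen to be indexed, which is one plausible reason for using it with a shared policy.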

Citations: cited by 86 publications (37 citation statements)
References: 34 publications
“…Theorem 1 (PE game): For differential game (1)(2)(3)(11), the optimal strategies are given by (20) and the corresponding value function is V(x) = t_E(x), where t_E(x) is given by (19).…”
Section: PE Game (mentioning)
confidence: 99%
“…Via modeling in differential games and calculating numerically, reach-avoid method is developed for TAD-like problems [17]-[19], where the player has to come to a region while avoiding another region. Besides, reinforcement learning (RL) is also used to solve these games or HJI equation in many works [10], [20]. However, these numerical methods have obvious weakness in computation time and solution accuracy.…”
Section: Introduction (mentioning)
confidence: 99%
“…With respect to cooperative MARL research for the MVP game, the multi-agent system is modeled using Markov decision processes (MDP) [9], and a neural network can be used to approximate the complex objective function [10]. Cristino et al used the Twin Delayed Deep Deterministic Policy Gradient (TD3) to demonstrate a real-world pursuit-evasion in open environment with boundaries [11]. Timothy used the Deep Deterministic Policy Gradient (DDPG) with omnidirectional agents [12].…”
Section: Introduction (mentioning)
confidence: 99%
“…Recently, deep reinforcement learning (DRL)-based navigation techniques have made rapid progress. The DRL has been proven to be applied to various mobile robotics fields, such as collision avoidance, object transportation, multi-robot navigation, and social navigation [9][10][11]. Among them, DRL-based object transportation techniques have attracted attention from many researchers because DRL can solve tricky issues of conventional methods [12][13][14].…”
Section: Introduction (mentioning)
confidence: 99%