2021
DOI: 10.1002/rnc.5719

Optimal game theoretic solution of the pursuit‐evasion intercept problem using on‐policy reinforcement learning

Abstract: This article presents a rigorous formulation of the pursuit‐evasion (PE) game when velocity constraints are imposed on the agents (players) of the game. The game is formulated as an infinite‐horizon problem using a non‐quadratic functional, and sufficient conditions are derived to prove capture in finite time. A novel tracking Hamilton–Jacobi–Isaacs (HJI) equation associated with the non‐quadratic value function is employed, which is solved for Nash equilibrium velocity policies for each agent with arbitrary …

Citations: cited by 26 publications (13 citation statements)
References: 31 publications
“…We can obtain the controls of pursuer and evader which should be adopted in the next interval as (30):…”
Section: Policy Iteration (mentioning)
confidence: 99%
“…However, the system information about both sides of the game must be obtained completely. Kartal et al [30] used the synchronous tuning algorithm in the pursuit-evasion game of the first-order system to obtain the capture conditions of agents in the game and reached the Nash equilibrium. Zhang et al [31] and Li et al [32] determined the scheme's feasibility in distributed systems.…”
Section: Introduction (mentioning)
confidence: 99%
“…Due to the limitation of the driving force, the input of the movement mode of the pursuer and the evader is limited to the constants and . The input of the movement mode of the pursuer also satisfies a certain proportion relationship [23, 24]. Because of the above assumption, the pursuer is slower than the evader, i.e., The location of the line of defense is determined, as are the number of pursuers and evaders.…”
Section: Problem Formulation (mentioning)
confidence: 99%
“…[7][8][9][10] Likewise, perimeter-defense problems are another variant of PEGs wherein the defender team is tasked to capture the intruders before the latter breach the target perimeter. Hamilton-Jacobi-Bellman-Isaacs equation is one of the conventional tools to address perimeter-defense problems, 11,12 however, this is not suitable for team games or the type of sequential arrival games considered in this work. A perimeter defense problem in a planar conical environment is studied 13 recently where two algorithms were presented.…”
Section: Introduction (mentioning)
confidence: 99%