2020
DOI: 10.1155/2020/7180639

Improving Maneuver Strategy in Air Combat by Alternate Freeze Games with a Deep Reinforcement Learning Algorithm

Abstract: In a one-on-one air combat game, the opponent’s maneuver strategy is usually not deterministic, which leads us to consider a variety of opponent strategies when designing our own maneuver strategy. In this paper, an alternate freeze game framework based on deep reinforcement learning is proposed to generate the maneuver strategy in an air combat pursuit. Maneuver strategy agents for aircraft guidance on both sides are designed for a one-on-one air combat scenario at a fixed flight level with fixed velocity. Midd…
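To make the alternate freeze scheme concrete, the training loop it implies can be sketched as follows: one side learns with a deep RL algorithm while the opponent’s policy stays frozen, and the roles swap each phase. This is a minimal sketch, not the paper’s implementation; the env/agent interfaces (reset/step, act/observe/update) and the phase lengths are hypothetical placeholders.

```python
# Minimal sketch of an alternate freeze training loop. The environment and
# agent interfaces here are hypothetical placeholders, not the paper's code.

def alternate_freeze_training(env, red_agent, blue_agent,
                              n_phases=10, episodes_per_phase=500):
    """Alternately train one agent while the other's policy is frozen."""
    for phase in range(n_phases):
        # Even phases train red against a frozen blue; odd phases swap roles.
        learner, opponent = ((red_agent, blue_agent) if phase % 2 == 0
                             else (blue_agent, red_agent))
        for _ in range(episodes_per_phase):
            obs_learner, obs_opponent = env.reset()
            done = False
            while not done:
                a_l = learner.act(obs_learner, explore=True)     # learner explores
                a_o = opponent.act(obs_opponent, explore=False)  # frozen side exploits
                (obs_learner, obs_opponent), reward, done = env.step(a_l, a_o)
                learner.observe(reward, obs_learner, done)       # only learner stores
            learner.update()   # any DRL update, e.g. DQN or DDPG
    return red_agent, blue_agent
```

Freezing the opponent keeps each learning phase a stationary single-agent problem, which is the usual motivation for this kind of alternation.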

Cited by 23 publications (15 citation statements)
References 33 publications
“…5 kg, and r5 = r6 = 1 m. Because of the two constraints (76), this multimanipulator has 4 independent joints η1, η2, η3, and η5 with the desired trajectories η1d = π/12, η2d = 1.91π + 0.2 sin(t), η3d = 0.51π, and η5d = 0.191π; the initial values of these joints are η1(0) = π/6, η2(0) = 1.92π, η3(0) = 2π/3, η4(0) = −0.0565, η5(0) = π/4, and η6(0) = −0.0853. From the given multimanipulator and the above parameters, we obtain the models needed for the control design, including the motion dynamic model (7), (37) and the constraint force (43). In light of Theorem 1, the control parameters are chosen as k1 = 0.5, ηc = 5, ηa1 = 5, ηa2 = 10, and ν = 0.01. The dynamic model of the 3 manipulators with the above parameters and the ARL-based motion/force control scheme is built with an m-file script and Simulink in MATLAB.…”
Section: Simulation Results
confidence: 99%
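For readers who want to reproduce the quoted setup, the desired trajectories and initial joint values transcribe directly into code. A minimal Python sketch, assuming eta1d..eta5d transliterate the cited paper’s η1d..η5d (the dynamics and the ARL controller itself are not reproduced here):

```python
import numpy as np

# Desired trajectories for the 4 independent joints quoted above.
def desired_trajectories(t):
    eta1d = np.pi / 12
    eta2d = 1.91 * np.pi + 0.2 * np.sin(t)
    eta3d = 0.51 * np.pi
    eta5d = 0.191 * np.pi
    return np.array([eta1d, eta2d, eta3d, eta5d])

# Initial values of all six joints quoted above.
eta0 = np.array([np.pi / 6, 1.92 * np.pi, 2 * np.pi / 3,
                 -0.0565, np.pi / 4, -0.0853])
```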
“…However, it is hard to solve this HJB equation analytically because it reduces to a nonlinear partial differential equation. Among the numerical methods considered for solving the HJB equation, a notable iterative structure has been developed that solves it online, based on the reinforcement learning (RL) principle inspired by machine learning [27, 31–39]. To implement the numerical algorithm for solving the HJB equation, there are two major directions: online actor/critic methods [31, 32] and off-policy techniques using integral reinforcement learning (IRL) [38, 39].…”
Section: Introduction
confidence: 99%
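The online actor/critic iteration the excerpt mentions can be illustrated on a toy scalar linear-quadratic problem, where the critic’s temporal-difference error approximates the HJB residual along the trajectory. This is a generic illustrative sketch, not the algorithm of [31, 32] or [38, 39]; the system coefficients and learning rates are invented for the example:

```python
import numpy as np

# Toy scalar system dx/dt = a*x + b*u with running cost q*x^2 + r*u^2.
a, b, q, r, dt = 0.5, 1.0, 1.0, 1.0, 0.01
w = 0.0                        # critic weight: V(x) ~ w * x^2
k = 0.0                        # actor gain:   u = -k * x
alpha_c, alpha_a = 0.2, 0.05   # illustrative learning rates
rng = np.random.default_rng(0)

for episode in range(300):
    x = rng.uniform(-1.0, 1.0)
    for _ in range(200):
        u = -k * x + 0.05 * rng.standard_normal()   # exploration noise
        cost = (q * x**2 + r * u**2) * dt
        x_next = x + (a * x + b * u) * dt
        # TD error approximates the HJB residual along the trajectory.
        delta = cost + w * x_next**2 - w * x**2
        w += alpha_c * delta * x**2                 # critic update
        # Greedy policy from the critic: u minimizing r*u^2 + dV/dx * b * u
        # has gain b*w/r, so nudge the actor toward it.
        k += alpha_a * (b * w / r - k)
        x = x_next

print(f"learned gain k = {k:.3f} (Riccati optimum ~ 1.618 here)")
```

For these coefficients the Riccati equation gives an optimal gain of (1 + sqrt(5))/2 ≈ 1.618, which is also the fixed point of the sketch’s coupled critic/actor updates.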
“…More recently, deep reinforcement learning (RL) has been applied to this problem space [9]–[14]. For example, [12] trained an agent in a custom 3-D environment that selected from a collection of 15 discrete maneuvers and was capable of defeating a human.…”
Section: Related Work
confidence: 99%
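As an illustration of the discrete-maneuver formulation described in [12], the agent’s policy reduces to scoring a fixed library of maneuvers and picking one. A minimal epsilon-greedy selection sketch; the maneuver names and the source of the Q-values are hypothetical, not taken from [12]:

```python
import numpy as np

# Hypothetical library of 15 discrete maneuvers (names are illustrative).
MANEUVERS = ["level_flight", "break_left", "break_right", "climb", "dive",
             "barrel_roll_l", "barrel_roll_r", "immelmann", "split_s",
             "high_yoyo", "low_yoyo", "lag_roll", "scissors", "accel", "decel"]

def select_maneuver(q_values, epsilon=0.1, rng=np.random.default_rng()):
    """Epsilon-greedy choice over the 15-maneuver action space."""
    if rng.random() < epsilon:
        return int(rng.integers(len(MANEUVERS)))   # explore
    return int(np.argmax(q_values))                # exploit

# Usage: q_values would come from a Q-network evaluated on the current state.
q = np.random.default_rng(0).standard_normal(len(MANEUVERS))
print(MANEUVERS[select_maneuver(q)])
```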
“…[9] evaluated a variety of learning algorithms and scenarios in an AFSIM environment. In general, many of the deep RL approaches surveyed either leveraged low-fidelity, low-dimensional simulation environments or abstracted the action space to high-level behaviors or tactics [9]–[14].…”
Section: Related Work
confidence: 99%
“…Value-based reinforcement learning methods cannot handle continuous action spaces [12]–[15]. Lillicrap et al. combined the deterministic policy gradient algorithm [16] with the actor-critic framework and proposed the deep deterministic policy gradient (DDPG) algorithm to address problems with continuous state and action spaces [17].…”
Section: Introduction
confidence: 99%
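Since the excerpt summarizes DDPG, a compact sketch of its core update may help: a deterministic actor trained by ascending through a learned critic, with Polyak-averaged target networks for the bootstrap. Network sizes, dimensions, and learning rates below are illustrative, not from [17]:

```python
import copy
import torch
import torch.nn as nn

obs_dim, act_dim, gamma, tau = 8, 2, 0.99, 0.005   # illustrative sizes

actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                      nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_target, critic_target = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    """One DDPG gradient step on a batch of transitions."""
    # Critic: regress Q(s,a) toward the bootstrapped target.
    with torch.no_grad():
        a2 = actor_target(s2)
        y = r + gamma * (1 - done) * critic_target(torch.cat([s2, a2], dim=1))
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: deterministic policy gradient through the critic.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Polyak-average the target networks.
    for net, tgt in ((actor, actor_target), (critic, critic_target)):
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1 - tau).add_(tau * p.data)

# Usage with a random batch of 32 transitions:
B = 32
ddpg_update(torch.randn(B, obs_dim), torch.rand(B, act_dim) * 2 - 1,
            torch.randn(B, 1), torch.randn(B, obs_dim), torch.zeros(B, 1))
```

The target networks and the (1 − done) mask in the bootstrap are what keep the off-policy critic regression stable; replay-buffer sampling is omitted here for brevity.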