2020
DOI: 10.1155/2020/7180639

Improving Maneuver Strategy in Air Combat by Alternate Freeze Games with a Deep Reinforcement Learning Algorithm

Abstract: In a one-on-one air combat game, the opponent’s maneuver strategy is usually not deterministic, which leads us to consider a variety of opponent strategies when designing our own maneuver strategy. In this paper, an alternate freeze game framework based on deep reinforcement learning is proposed to generate the maneuver strategy in an air combat pursuit. Maneuver strategy agents for aircraft guidance on both sides are designed for a one-on-one air combat scenario at a fixed flight level with fixed velocity. Midd…
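To make the alternate freeze scheme concrete, the training loop it implies can be sketched as follows: one side learns with a deep RL algorithm while the opponent’s policy stays frozen, and the roles swap each phase. This is a minimal sketch, not the paper’s implementation; the env/agent interfaces (reset/step, act/observe/update) and the phase lengths are hypothetical placeholders.

```python
# Minimal sketch of an alternate freeze training loop. The environment and
# agent interfaces here are hypothetical placeholders, not the paper's code.

def alternate_freeze_training(env, red_agent, blue_agent,
                              n_phases=10, episodes_per_phase=500):
    """Alternately train one agent while the other's policy is frozen."""
    for phase in range(n_phases):
        # Even phases train red against a frozen blue; odd phases swap roles.
        learner, opponent = ((red_agent, blue_agent) if phase % 2 == 0
                             else (blue_agent, red_agent))
        for _ in range(episodes_per_phase):
            obs_learner, obs_opponent = env.reset()
            done = False
            while not done:
                a_l = learner.act(obs_learner, explore=True)     # learner explores
                a_o = opponent.act(obs_opponent, explore=False)  # frozen side exploits
                (obs_learner, obs_opponent), reward, done = env.step(a_l, a_o)
                learner.observe(reward, obs_learner, done)       # only learner stores
            learner.update()   # any DRL update, e.g. DQN or DDPG
    return red_agent, blue_agent
```

Freezing the opponent keeps each learning phase a stationary single-agent problem, which is the usual motivation for this kind of alternation.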

Cited by 23 publications (15 citation statements)
References 33 publications
“…5 kg, and r5 = r6 = 1 m. Because of the two constraints (76), this multimanipulator has 4 independent joints η1, η2, η3, and η5 with the desired trajectories η1d = π/12, η2d = 1.91π + 0.2 sin(t), η3d = 0.51π, and η5d = 0.191π; the initial values of these joints are η1(0) = π/6, η2(0) = 1.92π, η3(0) = 2π/3, η4(0) = −0.0565, η5(0) = π/4, and η6(0) = −0.0853. From the given multimanipulator and the above parameters, we obtain the models needed for the control design, including the motion dynamic model (7), (37) and the constraint force (43). In light of Theorem 1, the control parameters are chosen as k1 = 0.5, ηc = 5, ηa1 = 5, ηa2 = 10, and ν = 0.01. The dynamic model of the 3 manipulators with the above parameters and the ARL-based motion/force control scheme is built with an m-file script and Simulink in MATLAB.…”
Section: Simulation Results
confidence: 99%
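For readers who want to reproduce the quoted setup, the desired trajectories and initial joint values transcribe directly into code. A minimal Python sketch, assuming eta1d..eta5d transliterate the cited paper’s η1d..η5d (the dynamics and the ARL controller itself are not reproduced here):

```python
import numpy as np

# Desired trajectories for the 4 independent joints quoted above.
def desired_trajectories(t):
    eta1d = np.pi / 12
    eta2d = 1.91 * np.pi + 0.2 * np.sin(t)
    eta3d = 0.51 * np.pi
    eta5d = 0.191 * np.pi
    return np.array([eta1d, eta2d, eta3d, eta5d])

# Initial values of all six joints quoted above.
eta0 = np.array([np.pi / 6, 1.92 * np.pi, 2 * np.pi / 3,
                 -0.0565, np.pi / 4, -0.0853])
```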
“…However, it is hard to solve this HJB equation analytically because it reduces to a nonlinear partial differential equation. Among the numerical methods considered for solving the HJB equation, a notable iterative structure has been developed that solves it online, based on the reinforcement learning (RL) principle inspired by machine learning [27, 31–39]. To implement the numerical algorithm for solving the HJB equation, there are two major directions: online actor/critic methods [31, 32] and off-policy techniques using integral reinforcement learning (IRL) [38, 39].…”
Section: Introduction
confidence: 99%
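The online actor/critic iteration the excerpt mentions can be illustrated on a toy scalar linear-quadratic problem, where the critic’s temporal-difference error approximates the HJB residual along the trajectory. This is a generic illustrative sketch, not the algorithm of [31, 32] or [38, 39]; the system coefficients and learning rates are invented for the example:

```python
import numpy as np

# Toy scalar system dx/dt = a*x + b*u with running cost q*x^2 + r*u^2.
a, b, q, r, dt = 0.5, 1.0, 1.0, 1.0, 0.01
w = 0.0                        # critic weight: V(x) ~ w * x^2
k = 0.0                        # actor gain:   u = -k * x
alpha_c, alpha_a = 0.2, 0.05   # illustrative learning rates
rng = np.random.default_rng(0)

for episode in range(300):
    x = rng.uniform(-1.0, 1.0)
    for _ in range(200):
        u = -k * x + 0.05 * rng.standard_normal()   # exploration noise
        cost = (q * x**2 + r * u**2) * dt
        x_next = x + (a * x + b * u) * dt
        # TD error approximates the HJB residual along the trajectory.
        delta = cost + w * x_next**2 - w * x**2
        w += alpha_c * delta * x**2                 # critic update
        # Greedy policy from the critic: u minimizing r*u^2 + dV/dx * b * u
        # has gain b*w/r, so nudge the actor toward it.
        k += alpha_a * (b * w / r - k)
        x = x_next

print(f"learned gain k = {k:.3f} (Riccati optimum ~ 1.618 here)")
```

For these coefficients the Riccati equation gives an optimal gain of (1 + sqrt(5))/2 ≈ 1.618, which is also the fixed point of the sketch’s coupled critic/actor updates.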
“…More recently, deep reinforcement learning (RL) has been applied to this problem space [9]–[14]. For example, [12] trained an agent in a custom 3-D environment that selected from a collection of 15 discrete maneuvers and was capable of defeating a human.…”
Section: Related Work
confidence: 99%
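As an illustration of the discrete-maneuver formulation described in [12], the agent’s policy reduces to scoring a fixed library of maneuvers and picking one. A minimal epsilon-greedy selection sketch; the maneuver names and the source of the Q-values are hypothetical, not taken from [12]:

```python
import numpy as np

# Hypothetical library of 15 discrete maneuvers (names are illustrative).
MANEUVERS = ["level_flight", "break_left", "break_right", "climb", "dive",
             "barrel_roll_l", "barrel_roll_r", "immelmann", "split_s",
             "high_yoyo", "low_yoyo", "lag_roll", "scissors", "accel", "decel"]

def select_maneuver(q_values, epsilon=0.1, rng=np.random.default_rng()):
    """Epsilon-greedy choice over the 15-maneuver action space."""
    if rng.random() < epsilon:
        return int(rng.integers(len(MANEUVERS)))   # explore
    return int(np.argmax(q_values))                # exploit

# Usage: q_values would come from a Q-network evaluated on the current state.
q = np.random.default_rng(0).standard_normal(len(MANEUVERS))
print(MANEUVERS[select_maneuver(q)])
```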
“…[9] evaluated a variety of learning algorithms and scenarios in an AFSIM environment. In general, many of the deep RL approaches surveyed either leveraged low-fidelity, low-dimensional simulation environments or abstracted the action space to high-level behaviors or tactics [9]–[14].…”
Section: Related Work
confidence: 99%
“…Value-based reinforcement learning methods cannot handle continuous action spaces [12]–[15]. Lillicrap et al. combined the deterministic policy gradient algorithm [16] with the actor-critic framework and proposed the deep deterministic policy gradient (DDPG) algorithm to address problems with continuous state and action spaces [17].…”
Section: Introduction
confidence: 99%
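Since the excerpt summarizes DDPG, a compact sketch of its core update may help: a deterministic actor trained by ascending through a learned critic, with Polyak-averaged target networks for the bootstrap. Network sizes, dimensions, and learning rates below are illustrative, not from [17]:

```python
import copy
import torch
import torch.nn as nn

obs_dim, act_dim, gamma, tau = 8, 2, 0.99, 0.005   # illustrative sizes

actor = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(),
                      nn.Linear(64, act_dim), nn.Tanh())
critic = nn.Sequential(nn.Linear(obs_dim + act_dim, 64), nn.ReLU(),
                       nn.Linear(64, 1))
actor_target, critic_target = copy.deepcopy(actor), copy.deepcopy(critic)
actor_opt = torch.optim.Adam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.Adam(critic.parameters(), lr=1e-3)

def ddpg_update(s, a, r, s2, done):
    """One DDPG gradient step on a batch of transitions."""
    # Critic: regress Q(s,a) toward the bootstrapped target.
    with torch.no_grad():
        a2 = actor_target(s2)
        y = r + gamma * (1 - done) * critic_target(torch.cat([s2, a2], dim=1))
    q = critic(torch.cat([s, a], dim=1))
    critic_loss = nn.functional.mse_loss(q, y)
    critic_opt.zero_grad(); critic_loss.backward(); critic_opt.step()

    # Actor: deterministic policy gradient through the critic.
    actor_loss = -critic(torch.cat([s, actor(s)], dim=1)).mean()
    actor_opt.zero_grad(); actor_loss.backward(); actor_opt.step()

    # Polyak-average the target networks.
    for net, tgt in ((actor, actor_target), (critic, critic_target)):
        for p, pt in zip(net.parameters(), tgt.parameters()):
            pt.data.mul_(1 - tau).add_(tau * p.data)

# Usage with a random batch of 32 transitions:
B = 32
ddpg_update(torch.randn(B, obs_dim), torch.rand(B, act_dim) * 2 - 1,
            torch.randn(B, 1), torch.randn(B, obs_dim), torch.zeros(B, 1))
```

The target networks and the (1 − done) mask in the bootstrap are what keep the off-policy critic regression stable; replay-buffer sampling is omitted here for brevity.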