2023
DOI: 10.3390/aerospace10070642

A Reinforcement Learning Method Based on an Improved Sampling Mechanism for Unmanned Aerial Vehicle Penetration

Abstract: The penetration of unmanned aerial vehicles (UAVs) is an important aspect of UAV games. In recent years, UAV penetration has generally been solved using artificial intelligence methods such as reinforcement learning. However, the high sample demand of the reinforcement learning method poses a significant challenge specifically in the context of UAV games. To improve the sample utilization in UAV penetration, this paper innovatively proposes an improved sampling mechanism called task completion division (TCD) a…

Citations: cited by 3 publications (2 citation statements)
References: 36 publications
“…In [20], Cao et al studied autonomous maneuver decisions for UCAV air combat based on the double deep Q network algorithm (DDQN) and stochastic game theory, which further boosted the performance of the UCAV in different combat cases. To compensate for the low training efficiency caused by simple sampling mechanisms, Wang et al proposed a task completion division soft actor-critic (TCD-SAC) algorithm for UAV penetration [21]. However, these studies did not take into account the uncertainty of environmental information obtained by agents in the real world, which leads to the degradation of DRL algorithm performance.…”
Section: Related Work (citation type: mentioning)
confidence: 99%
“…In scenarios where only radar detection is considered, and the reward is the sparsest, Ma Zijie et al [23] proposed an improved deep reinforcement learning algorithm to enhance cruise missiles' penetration trajectory planning capability when facing dynamic early warning radar threats. Wang Y et al [24] combined Task Completion Division (TCD) with the Soft Actor-Critic (SAC) algorithm to form the TCD-SAC algorithm, proposing a reinforcement learning method based on an improved sampling mechanism to enhance the penetration capability of unmanned aerial vehicles in air defense systems, with the improved sampling mechanism effectively mitigating the training difficulties caused by sparse rewards.…”
Section: Introduction (citation type: mentioning)
confidence: 99%