2023
DOI: 10.3390/aerospace10050441
Online Trajectory Planning Method for Midcourse Guidance Phase Based on Deep Reinforcement Learning

Abstract: To address the problem of online trajectory planning for interceptor midcourse guidance under multiple constraints, an online midcourse guidance trajectory planning method based on deep reinforcement learning (DRL) is proposed. The Markov decision process (MDP) corresponding to the trajectory planning problem is designed, and the key reward function is composed of a final reward and a negative per-step feedback reward, which lays the foundation for training the interceptor trajectory plann…

Cited by 3 publications (4 citation statements)
References 34 publications
“…As a relatively mature algorithm in deep reinforcement learning, the DDPG algorithm has significant advantages over other deep reinforcement learning algorithms (such as deep Q network (DQN), deterministic policy gradient (DPG), etc.) in handling continuous action spaces, efficient gradient optimization, utilizing experience replay buffers, and improving stability [37]. This makes the DDPG algorithm achieve higher performance and efficiency in solving complex continuous control tasks.…”
Section: Initial Solution Trajectories' Rapid Generation Methods Design
confidence: 99%
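The excerpt above credits DDPG's performance to handling continuous actions, experience replay, and stability mechanisms. Two of those ingredients can be sketched concretely: a fixed-capacity replay buffer and the soft (Polyak) update that slowly tracks target-network weights. This is a minimal illustrative sketch, not the cited paper's implementation; all names and sizes are assumptions.

```python
import numpy as np

class ReplayBuffer:
    """Fixed-capacity FIFO buffer of transitions, sampled uniformly at random."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []
        self.pos = 0

    def add(self, transition):
        # Append until full, then overwrite the oldest entry in ring-buffer fashion.
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, rng):
        # Uniform sampling with replacement breaks temporal correlation
        # between consecutive transitions, which stabilizes training.
        idx = rng.integers(0, len(self.storage), size=batch_size)
        return [self.storage[i] for i in idx]

def soft_update(target, source, tau=0.005):
    """Polyak averaging: target <- tau * source + (1 - tau) * target."""
    return tau * source + (1.0 - tau) * target
```

A small `tau` makes the target network a slowly moving copy of the learned network, which is one of the stability mechanisms the excerpt alludes to.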
“…During the midcourse guidance phase, the interceptor flies at high altitude and high speed for a long time, and is subject to constraints on heat flux density Q, dynamic pressure p, overload n, angle, and control variables. Therefore, the following constraints should also be met [37].…”
Section: Problem Formulation
confidence: 99%
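The path quantities named in the excerpt (heat flux, dynamic pressure, overload) can be evaluated pointwise along a candidate trajectory. The sketch below uses standard hypersonic-flight forms assumed for illustration; the specific formulas, constants, and limits are not taken from the paper.

```python
import math

def path_loads(rho, v, lift, drag, mass, C_h=7.97e-5, g0=9.81):
    """Return (Q, q, n) at one trajectory point.

    rho  : air density (kg/m^3)      v    : speed (m/s)
    lift : lift force (N)            drag : drag force (N)
    mass : vehicle mass (kg)
    """
    Q = C_h * math.sqrt(rho) * v**3.15            # stagnation heat flux (illustrative form)
    q = 0.5 * rho * v**2                          # dynamic pressure (Pa)
    n = math.sqrt(lift**2 + drag**2) / (mass * g0)  # total load factor (g)
    return Q, q, n

def constraints_ok(Q, q, n, Q_max, q_max, n_max):
    """Pointwise check that the path constraints Q <= Q_max, q <= q_max, n <= n_max hold."""
    return Q <= Q_max and q <= q_max and n <= n_max
```

In a planner, such a check would be applied at every discretized point of the trajectory, rejecting or penalizing candidates that violate any limit.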
“…Over the past few decades, substantial progress has been made in motion planning methods for nonholonomic robots, including space rovers. These methods encompass the polynomial interpolation method [5,6], adaptive state lattices [7], homotopy-based methods [8], probabilistic search methods such as rapidly exploring random tree [9], informed RRT* [10], fast marching trees [11,12], reinforcement learning method [13][14][15], numerical optimization methods [16][17][18][19], and others.…”
Section: Introduction
confidence: 99%
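Among the planners the excerpt lists, the rapidly exploring random tree [9] is simple enough to sketch in full: repeatedly sample a point, steer one step from the nearest tree node toward it, and stop once the tree reaches the goal. The 2-D workspace, step size, goal bias, and goal tolerance below are illustrative assumptions, and collision checking is pointwise only.

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, max_iters=5000, seed=0):
    """Minimal 2-D RRT; returns a start-to-goal path as a list of points, or None."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sample a random point in a 10x10 workspace (goal-biased 10% of the time).
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10), rng.uniform(0, 10))
        # Find the nearest existing tree node.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Steer one fixed step from the nearest node toward the sample.
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_free(new):   # pointwise collision check only (sketch)
            continue
        parent[len(nodes)] = i_near
        nodes.append(new)
        if math.dist(new, goal) <= goal_tol:
            # Walk parent pointers back to the root to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

Variants such as informed RRT* [10] add rewiring and sampling restricted to an ellipse around the best known path, trading this simplicity for asymptotic optimality.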