2023
DOI: 10.3390/aerospace10050441
Online Trajectory Planning Method for Midcourse Guidance Phase Based on Deep Reinforcement Learning

Abstract: To address the problem of online trajectory planning for interceptor midcourse guidance under multiple constraints, an online midcourse guidance trajectory planning method based on deep reinforcement learning (DRL) is proposed. The Markov decision process (MDP) corresponding to the trajectory planning problem is designed, and the key reward function is composed of a final reward and a negative per-step feedback reward, which lays the foundation for training the interceptor trajectory plann…

Cited by 3 publications (4 citation statements)
References 34 publications
“…As a relatively mature algorithm in deep reinforcement learning, the DDPG algorithm has significant advantages over other deep reinforcement learning algorithms (such as deep Q network (DQN), deterministic policy gradient (DPG), etc.) in handling continuous action spaces, efficient gradient optimization, utilizing experience replay buffers, and improving stability [37]. This makes the DDPG algorithm achieve higher performance and efficiency in solving complex continuous control tasks.…”
Section: Initial Solution Trajectories' Rapid Generation Methods Design
confidence: 99%
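The excerpt above credits DDPG's performance to handling continuous actions, experience replay, and stability mechanisms. Two of those ingredients can be sketched concretely: a fixed-capacity replay buffer and the soft (Polyak) update that slowly tracks target-network weights. This is a minimal illustrative sketch, not the cited paper's implementation; all names and sizes are assumptions.

```python
import numpy as np

class ReplayBuffer:
    """Fixed-capacity FIFO buffer of transitions, sampled uniformly at random."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.storage = []
        self.pos = 0

    def add(self, transition):
        # Append until full, then overwrite the oldest entry in ring-buffer fashion.
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.pos] = transition
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, rng):
        # Uniform sampling with replacement breaks temporal correlation
        # between consecutive transitions, which stabilizes training.
        idx = rng.integers(0, len(self.storage), size=batch_size)
        return [self.storage[i] for i in idx]

def soft_update(target, source, tau=0.005):
    """Polyak averaging: target <- tau * source + (1 - tau) * target."""
    return tau * source + (1.0 - tau) * target
```

A small `tau` makes the target network a slowly moving copy of the learned network, which is one of the stability mechanisms the excerpt alludes to.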
“…During the midcourse guidance phase, the interceptor flies at high altitude and high speed for a long time, and is subject to constraints on heat flux density Q, dynamic pressure p, overload n, angle, and control variables. Therefore, the following constraints should also be met [37].…”
Section: Problem Formulation
confidence: 99%
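The path quantities named in the excerpt (heat flux, dynamic pressure, overload) can be evaluated pointwise along a candidate trajectory. The sketch below uses standard hypersonic-flight forms assumed for illustration; the specific formulas, constants, and limits are not taken from the paper.

```python
import math

def path_loads(rho, v, lift, drag, mass, C_h=7.97e-5, g0=9.81):
    """Return (Q, q, n) at one trajectory point.

    rho  : air density (kg/m^3)      v    : speed (m/s)
    lift : lift force (N)            drag : drag force (N)
    mass : vehicle mass (kg)
    """
    Q = C_h * math.sqrt(rho) * v**3.15            # stagnation heat flux (illustrative form)
    q = 0.5 * rho * v**2                          # dynamic pressure (Pa)
    n = math.sqrt(lift**2 + drag**2) / (mass * g0)  # total load factor (g)
    return Q, q, n

def constraints_ok(Q, q, n, Q_max, q_max, n_max):
    """Pointwise check that the path constraints Q <= Q_max, q <= q_max, n <= n_max hold."""
    return Q <= Q_max and q <= q_max and n <= n_max
```

In a planner, such a check would be applied at every discretized point of the trajectory, rejecting or penalizing candidates that violate any limit.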
“…Over the past few decades, substantial progress has been made in motion planning methods for nonholonomic robots, including space rovers. These methods encompass the polynomial interpolation method [5,6], adaptive state lattices [7], homotopy-based methods [8], probabilistic search methods such as rapidly exploring random tree [9], informed RRT* [10], fast marching trees [11,12], reinforcement learning method [13][14][15], numerical optimization methods [16][17][18][19], and others.…”
Section: Introduction
confidence: 99%
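Among the planners the excerpt lists, the rapidly exploring random tree [9] is simple enough to sketch in full: repeatedly sample a point, steer one step from the nearest tree node toward it, and stop once the tree reaches the goal. The 2-D workspace, step size, goal bias, and goal tolerance below are illustrative assumptions, and collision checking is pointwise only.

```python
import math
import random

def rrt(start, goal, is_free, step=0.5, goal_tol=0.5, max_iters=5000, seed=0):
    """Minimal 2-D RRT; returns a start-to-goal path as a list of points, or None."""
    rng = random.Random(seed)
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iters):
        # Sample a random point in a 10x10 workspace (goal-biased 10% of the time).
        sample = goal if rng.random() < 0.1 else (rng.uniform(0, 10), rng.uniform(0, 10))
        # Find the nearest existing tree node.
        i_near = min(range(len(nodes)), key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Steer one fixed step from the nearest node toward the sample.
        new = (near[0] + step * (sample[0] - near[0]) / d,
               near[1] + step * (sample[1] - near[1]) / d)
        if not is_free(new):   # pointwise collision check only (sketch)
            continue
        parent[len(nodes)] = i_near
        nodes.append(new)
        if math.dist(new, goal) <= goal_tol:
            # Walk parent pointers back to the root to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None
```

Variants such as informed RRT* [10] add rewiring and sampling restricted to an ellipse around the best known path, trading this simplicity for asymptotic optimality.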