2019
DOI: 10.1177/0036850419879024

Heuristic Q-learning based on experience replay for three-dimensional path planning of the unmanned aerial vehicle

Abstract: To address the convergence difficulty that existing reinforcement learning algorithms face in the large state space of three-dimensional path planning for unmanned aerial vehicles, this article proposes a reinforcement learning algorithm based on a heuristic function and the maximum average reward value of an experience replay mechanism. Knowledge of track performance is introduced to construct the heuristic function, which guides the unmanned aerial vehicle's action selection and re…
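The abstract describes a Q-learning variant in which a heuristic term biases action selection and past transitions are replayed from a buffer tied to an average-reward criterion. Since the paper's exact formulation is truncated above, the Python sketch below only illustrates that general idea on a small 3-D grid: the Manhattan-distance heuristic, the uniform replay sampling, and all hyperparameters are assumptions for illustration, not the authors' settings.

```python
import random
from collections import defaultdict

# Hypothetical 3-D grid world: states are (x, y, z) cells, actions are unit axis moves.
ACTIONS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
GOAL, SIZE = (4, 4, 4), 5

def step(s, a):
    # Clamp the move to the grid; reward is +10 at the goal, -1 per step otherwise.
    nxt = tuple(min(max(s[i] + a[i], 0), SIZE - 1) for i in range(3))
    done = nxt == GOAL
    return nxt, (10.0 if done else -1.0), done

def heuristic(s, a):
    # Assumed heuristic: reward actions that reduce Manhattan distance to the goal.
    nxt = tuple(min(max(s[i] + a[i], 0), SIZE - 1) for i in range(3))
    return sum(abs(g - c) for g, c in zip(GOAL, s)) - sum(abs(g - c) for g, c in zip(GOAL, nxt))

Q = defaultdict(float)                         # Q[(state, action)]
replay = []                                    # buffer of (s, a, r, s', done) transitions
alpha, gamma, eps, xi = 0.1, 0.95, 0.1, 0.5    # xi weights the heuristic bias

def select_action(s):
    # Epsilon-greedy over Q plus a heuristic bonus (heuristically accelerated Q-learning).
    if random.random() < eps:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(s, a)] + xi * heuristic(s, a))

def q_update(s, a, r, s2, done):
    # Standard one-step Q-learning backup.
    target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in ACTIONS))
    Q[(s, a)] += alpha * (target - Q[(s, a)])

for episode in range(300):
    s, done = (0, 0, 0), False
    for _ in range(200):
        a = select_action(s)
        s2, r, done = step(s, a)
        replay.append((s, a, r, s2, done))
        q_update(s, a, r, s2, done)
        s = s2
        if done:
            break
    # Replay a mini-batch of stored transitions; a reward-ranked buffer, as the paper's
    # "maximum average reward" mechanism suggests, would sort or filter here -- this
    # sketch simply samples uniformly.
    for s_, a_, r_, s2_, d_ in random.sample(replay, min(32, len(replay))):
        q_update(s_, a_, r_, s2_, d_)
```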

Cited by 17 publications (8 citation statements); References 11 publications.
“…RL algorithms for path planning are the most abundant in the state of the art. For example, Xie et al use the Q-Learning strategy for three-dimensional path planning [24]. The notion of Heuristic Q-Learning was introduced.…”
Section: Evolutionary
confidence: 99%
“…Therefore, it is required to have strong adaptive ability to such uncertainty. RL provides a better idea for this kind of problem by using historical data to obtain the nonlinear function relationship between approximate fitting state and overall performance [21][22][23][24].…”
Section: Annotation Demo Section
confidence: 99%
“…They classified these approaches into five main categories. These categories include classical methods [32,33,34], heuristics [35,36,37,38,39,40], meta-heuristics [41,42,43], machine learning [44,45,46], and hybrid algorithms [47,48,49].…”
Section: Introduction
confidence: 99%