2021
DOI: 10.1002/oca.2781

Path planning of mobile robot in unknown dynamic continuous environment using reward‐modified deep Q‐network

Abstract: The path planning problem of a mobile robot in an unknown dynamic environment (UDE) is discussed in this article by building a continuous dynamic simulation environment. To achieve a collision-free path in the UDE, reinforcement learning with a deep Q-network (DQN) is applied so that the mobile robot learns optimal decisions. A reward function is designed with a weight that balances obstacle avoidance against approach to the goal. Moreover, it is found that the relative motion between moving obstacles and robots …
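The abstract's reward design, a weighted sum balancing goal approach against obstacle avoidance, with thresholds to correct rewards distorted by the relative motion of moving obstacles, can be sketched as follows. All weights, thresholds, distances, and function names here are illustrative assumptions; the excerpt does not give the paper's actual values.

```python
# Hypothetical constants; the paper's actual weights and thresholds
# are not given in the excerpt above.
W_GOAL = 0.6               # weight on progress toward the goal
W_OBSTACLE = 0.4           # weight on keeping clear of obstacles
R_MIN, R_MAX = -1.0, 1.0   # two reward thresholds used to clip anomalies

def shaped_reward(d_goal_prev, d_goal, d_obs, safe_dist=0.5):
    """Weighted reward balancing goal approach against obstacle avoidance.

    A positive term rewards reducing the distance to the goal; a negative
    term penalizes being inside the obstacle safety radius.
    """
    goal_term = d_goal_prev - d_goal             # > 0 when the robot moved closer
    obstacle_term = min(0.0, d_obs - safe_dist)  # <= 0 inside the safety radius
    r = W_GOAL * goal_term + W_OBSTACLE * obstacle_term
    # Relative motion of moving obstacles can produce abnormally large
    # rewards; clamp them to the two thresholds, as the abstract describes.
    return max(R_MIN, min(R_MAX, r))
```

A step that closes 0.5 m toward the goal while staying clear of obstacles yields a small positive reward, while an abnormally large raw reward is clipped to the upper threshold.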

Cited by 16 publications (14 citation statements)
References 21 publications
“…[20] for robot navigation, and the RMDDQN model proposed by Huang et al. [21] for path planning in unknown dynamic environments.…”
Section: Introduction (mentioning, confidence: 99%)
“…The evaluation function of DQN was improved by a correction function to increase the evaluation accuracy of the algorithm's value function [20]. Deep reinforcement learning was applied to path planning of mobile robots in unknown dynamic environments [21]: targeting the problem of mutual collision triggered by abnormal rewards arising from the relative motion of obstacles and robots, two reward thresholds were set to modify the abnormal rewards, thus realizing optimal decision learning. The experience replay mechanism was combined with Q-learning [22] and the soft actor-critic (SAC) algorithm [23], respectively, improving sample utilization, learning efficiency, and convergence speed in three-dimensional (3D) path planning of UAVs and path planning of multi-arm robots.…”
Section: Introduction (mentioning, confidence: 99%)
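The experience replay mechanism mentioned in the excerpt above, combined with Q-learning and SAC in the cited works, can be sketched minimally as follows. This is an illustrative buffer under standard assumptions, not the cited papers' actual implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay buffer (an illustrative sketch)."""

    def __init__(self, capacity=10000):
        # Bounded deque: once full, the oldest transitions are evicted.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation of consecutive
        # transitions, which is what improves sample utilization.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

During training, the agent pushes each transition after an environment step and periodically samples a mini-batch to update the value network, reusing each stored transition many times.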
“…A mobile wireless power transfer robot was able to determine the optimal path with the proposed method for charging a large number of IoT devices. Huang et al. [27] proposed a method that determines two reward thresholds to solve the anomalous reward problem encountered in the path planning of a mobile robot in an unknown dynamic environment. The improvement in value-based DRL algorithms was experimentally demonstrated with the proposed method.…”
Section: Introduction (mentioning, confidence: 99%)