2021
DOI: 10.1155/2021/5169460

Research on Dynamic Path Planning of Mobile Robot Based on Improved DDPG Algorithm

Abstract: To address the low success rate and slow learning speed of the DDPG algorithm in path planning of a mobile robot in a dynamic environment, an improved DDPG algorithm is designed. In this article, the RAdam algorithm replaces the neural network optimizer in DDPG and is combined with a curiosity algorithm to improve the success rate and convergence speed. On top of the improved algorithm, prioritized experience replay is added and transfer learning is introduced to improve the training effect. Thro…
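The abstract's central change is swapping DDPG's usual optimizer for RAdam. The sketch below shows what that swap looks like in PyTorch; the network sizes, learning rates, and state/action dimensions are illustrative assumptions, not values taken from the paper.

```python
# Minimal sketch (assumed hyperparameters) of replacing DDPG's Adam optimizer
# with RAdam for the actor and critic networks, as the abstract describes.
import torch
import torch.nn as nn

state_dim, action_dim = 24, 2   # assumed dimensions for a mobile-robot task

actor = nn.Sequential(
    nn.Linear(state_dim, 256), nn.ReLU(),
    nn.Linear(256, action_dim), nn.Tanh(),
)
critic = nn.Sequential(
    nn.Linear(state_dim + action_dim, 256), nn.ReLU(),
    nn.Linear(256, 1),
)

# RAdam (rectified Adam) in place of the usual Adam optimizer
actor_opt = torch.optim.RAdam(actor.parameters(), lr=1e-4)
critic_opt = torch.optim.RAdam(critic.parameters(), lr=1e-3)
```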

Cited by 16 publications (15 citation statements) · References 20 publications
“…The information in the environment (obstacle location, shape, and orientation) is completely unknown before the mobile robot performs a path-planning task, and only the starting and goal points are known. Mobile robots need to plan the shortest path length from the starting point to the goal point without colliding with obstacles in the environment [30].…”
Section: Preliminaries
confidence: 99%
“… Q-learning requires a certain amount of memory to store the Q-table. When the mobile robot (MR) has m states and n actions, the constituted Q-table has dimension m∗n, and choosing the maximum Q value to determine the direction of the next move requires a total of m∗(n−1) comparisons; as the state space and action set grow more complex, the amount of computation increases exponentially, resulting in long computation times [30]. It will lead to a local optimum when the environment is complex, and it is easy to fall into a dead-end path blocked by obstacles.…”
Section: Preliminaries
confidence: 99%
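As a rough illustration of the memory and comparison costs this quote describes, the sketch below stores an m×n Q-table and selects a greedy action by scanning one row; the state and action counts are arbitrary assumptions chosen only to show the scaling.

```python
# Illustrative only: tabular Q-learning storage and greedy action selection.
# With m states and n actions the Q-table holds m*n entries, and picking the
# best action for one state takes n-1 pairwise comparisons.
import numpy as np

m, n = 10_000, 8                      # assumed state/action counts
q_table = np.zeros((m, n))            # m*n stored Q-values

def greedy_action(state_index: int) -> int:
    # argmax over one row of the table: n-1 comparisons for this state
    return int(np.argmax(q_table[state_index]))

print(q_table.size)                   # 80000 entries to keep in memory
print(greedy_action(0))               # action index in [0, n-1]
```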
“…In order to fully evaluate the performance of the prioritized experience replay algorithm proposed in this paper, it is integrated into DDPG and compared with the algorithms proposed by Cicek [23] , Xu [25] , Cao [26] , and Li [27] , respectively. In the simulation environment shown in Fig.…”
Section: Comparison With Other Algorithms
confidence: 99%
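For context, the sketch below shows the generic proportional form of prioritized experience replay that such DDPG variants build on; the priority definition and hyperparameters are the standard |TD-error|-based ones, not the specific scheme of any of the cited papers.

```python
# Generic sketch of proportional prioritized experience replay (assumed
# hyperparameters); transitions with larger |TD-error| are sampled more often.
import numpy as np

class PrioritizedReplay:
    def __init__(self, capacity, alpha=0.6):
        self.capacity, self.alpha = capacity, alpha
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.data) >= self.capacity:          # drop the oldest entry
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size):
        probs = np.array(self.priorities) / sum(self.priorities)
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        return [self.data[i] for i in idx], idx
```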
“…Cao et al. [26] integrated TD-error, Q-value, and data volume, focused on different importance indicators at different training stages of the neural network, and dynamically adjusted the weight of each indicator to achieve an adaptive estimate of experience importance. Li et al. [27] introduced an intrinsic curiosity module (ICM) to provide internal rewards for the robot's training process, which were combined with external rewards provided by environmental feedback, and then introduced prioritized experience replay and transfer learning to improve the success rate and convergence
Section: Introduction
confidence: 99%
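The two ideas in that passage, a weighted mix of importance indicators for replay priority and an intrinsic curiosity reward added to the environment reward, can be summarized with the hedged sketch below; the weights, the forward-model error, and the scale factor are placeholders, not the cited papers' formulations.

```python
# Hedged sketch of the two mechanisms described above (placeholder weights).
def adaptive_priority(td_error, q_value, data_volume, w):
    # Cao et al. [26]: mix several importance indicators, with the weight
    # vector w adjusted over the course of training.
    return w[0] * abs(td_error) + w[1] * abs(q_value) + w[2] * data_volume

def total_reward(extrinsic_r, forward_model_error, eta=0.1):
    # Li et al. [27]: an intrinsic curiosity reward (forward-model prediction
    # error) is added to the extrinsic reward from the environment.
    return extrinsic_r + eta * forward_model_error
```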
“…In this method, mixed noise along with a more reasonable reward function was used for quick training. The proposed algorithm and DDPG-based algorithms [23, 24] were experimentally compared, and the advantages of the proposed algorithm were demonstrated in a complex environment in terms of exploration efficiency, optimal path, and time. In Zhou et al. [7], an improved DQN algorithm was proposed for the path-planning problem of patrolling robots.…”
Section: Introduction
confidence: 99%
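The cited method's "mixed noise" is not spelled out in this excerpt, so the sketch below shows only one plausible reading: blending temporally correlated Ornstein-Uhlenbeck noise with uncorrelated Gaussian noise for DDPG action exploration. All parameters are assumptions.

```python
# Assumed interpretation of "mixed noise": a blend of Ornstein-Uhlenbeck and
# Gaussian exploration noise added to the DDPG policy's actions.
import numpy as np

class MixedNoise:
    def __init__(self, dim, theta=0.15, sigma_ou=0.2, sigma_gauss=0.1, mix=0.5):
        self.dim, self.theta = dim, theta
        self.sigma_ou, self.sigma_gauss, self.mix = sigma_ou, sigma_gauss, mix
        self.state = np.zeros(dim)

    def sample(self):
        # OU step: mean-reverting, temporally correlated component
        self.state += -self.theta * self.state + self.sigma_ou * np.random.randn(self.dim)
        gauss = self.sigma_gauss * np.random.randn(self.dim)  # uncorrelated component
        return self.mix * self.state + (1.0 - self.mix) * gauss
```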