Solving the optimal path planning of a mobile robot using improved Q-learning

Low, Ee Soong; Ong, Pauline; Cheah, Kah Chun

doi:10.1016/j.robot.2019.02.013

Cited by 242 publications

(123 citation statements)

References 37 publications

“…A concurrent grid-based implementation of a dynamic programming algorithm was presented in Reference [9]. In Reference [10], the flower pollination algorithm (FPA) was implemented as partially guided Q learning to solve a low convergence problem. The suggested technique implemented was a path planner for a three-wheel mobile robot.…”

Section: Introduction (mentioning)

confidence: 99%

Grid-Based Mobile Robot Path Planning Using Aging-Based Ant Colony Optimization Algorithm in Static and Dynamic Environments

Ajeil, Ibraheem, Azar, et al. 2020, Sensors

150


Planning an optimal path for a mobile robot is a complicated problem, as it allows the mobile robot to navigate autonomously by following the safest and shortest path between the starting and goal points. The present work deals with the design of intelligent path planning algorithms for a mobile robot in static and dynamic environments based on swarm intelligence optimization. A modification based on the age of the ant is introduced to standard ant colony optimization, called aging-based ant colony optimization (ABACO). The ABACO was implemented in association with grid-based modeling for the static and dynamic environments to solve the path planning problem. The simulations are run in the MATLAB environment to test the validity of the proposed algorithms. Simulations showed that the proposed path planning algorithms result in superior performance by finding the shortest and most collision-free path under various static and dynamic scenarios. Furthermore, the superiority of the proposed algorithms was demonstrated through comparisons with other traditional path planning algorithms in different static environments.

“…The results also found that after reinforcement learning is added, the convergence time of robot path planning is increased by 13.54%. Low et al used the flower pollination algorithm to properly initialize the Q-value, which could speed up the convergence of mobile robots (Low et al, 2019). The principle is similar to reinforcement learning, therefore, the research results here are also supported.…”

Section: Discussion (mentioning)

confidence: 99%

The Path Planning of Mobile Robot by Neural Networks and Hierarchical Reinforcement Learning

Liao 2020, Front. Neurorobot.


Existing mobile robots cannot complete some functions. To solve these problems, which include autonomous learning in path planning, the slow convergence of path planning, and planned paths that are not smooth, it is possible to utilize neural networks to enable the robot to perceive the environment and perform feature extraction, which enables them to have a fitness of environment to state action function. By mapping the current state of these actions through Hierarchical Reinforcement Learning (HRL), the needs of mobile robots are met. It is possible to construct a path planning model for mobile robots based on neural networks and HRL. In this article, the proposed algorithm is compared with different algorithms in path planning. It underwent a performance evaluation to obtain an optimal learning algorithm system. The optimal algorithm system was tested in different environments and scenarios to obtain optimal learning conditions, thereby verifying the effectiveness of the proposed algorithm. Deep Deterministic Policy Gradient (DDPG), a path planning algorithm for mobile robots based on neural networks and hierarchical reinforcement learning, performed better in all aspects than other algorithms. Specifically, when compared with Double Deep Q-Learning (DDQN), DDPG has a shorter path planning time and a reduced number of path steps. When introducing an influence value, this algorithm shortens the convergence time by 91% compared with the Q-learning algorithm and improves the smoothness of the planned path by 79%. The algorithm has a good generalization effect in different scenarios. These results have significance for research on the guidance, precise positioning, and path planning of mobile robots.
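
The citation statement for this entry describes initializing the Q-values before learning starts so that Q-learning converges faster. The sketch below illustrates only that general idea on a small grid world. It is not the cited method: a simple Manhattan-distance prior stands in for whatever optimizer supplies the initial values, and every name, reward, and parameter here (`step`, `init_q`, `train`, the bump penalty, the goal reward) is a hypothetical choice made for illustration.

```python
import random

# Moves on a 4-connected grid: up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]


def step(state, action, size, obstacles, goal):
    """Apply one move; return (next_state, reward, done). Rewards are illustrative."""
    nxt = (state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1])
    blocked = not (0 <= nxt[0] < size and 0 <= nxt[1] < size) or nxt in obstacles
    if blocked:
        return state, -5.0, False  # bump: stay in place and pay a penalty
    if nxt == goal:
        return nxt, 100.0, True
    return nxt, -1.0, False


def init_q(size, goal, warm_start):
    """Build the Q-table: all zeros, or a heuristic prior (assumed stand-in)."""
    q = {}
    for r in range(size):
        for c in range(size):
            for a, (dr, dc) in enumerate(ACTIONS):
                # Prior: negative Manhattan distance from the target cell to the goal.
                dist = abs(r + dr - goal[0]) + abs(c + dc - goal[1])
                q[((r, c), a)] = -float(dist) if warm_start else 0.0
    return q


def train(size, obstacles, start, goal, warm_start,
          episodes=50, alpha=0.5, gamma=0.9, epsilon=0.1, max_steps=200, seed=0):
    """Tabular Q-learning; returns the table and the step count of each episode."""
    rng = random.Random(seed)
    q = init_q(size, goal, warm_start)
    steps_per_episode = []
    for _ in range(episodes):
        state, steps = start, 0
        for _ in range(max_steps):
            # Epsilon-greedy action selection.
            if rng.random() < epsilon:
                action = rng.randrange(len(ACTIONS))
            else:
                action = max(range(len(ACTIONS)), key=lambda a: q[(state, a)])
            nxt, reward, done = step(state, action, size, obstacles, goal)
            # Standard Q-learning update toward reward + gamma * max_a' Q(s', a').
            best_next = 0.0 if done else max(q[(nxt, a)] for a in range(len(ACTIONS)))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state, steps = nxt, steps + 1
            if done:
                break
        steps_per_episode.append(steps)
    return q, steps_per_episode
```

Calling `train(4, set(), (0, 0), (3, 3), warm_start=True)` and again with `warm_start=False`, then comparing the two `steps_per_episode` lists, gives a rough picture of how much a prior shortens early episodes in this toy setting. The comparison says nothing about the results reported in the cited papers.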


“…In [24], Q-learning is used in combination with a Deep Deterministic Policy Gradients (DDPG) algorithm for a UAV to learn a landing task in simulation. In [25], the effectiveness of the Q-learning algorithm for robot path planning, is improved by using a flower pollinating algorithm to initialize the q-values of the algorithm.…”

Section: Cognitive Reasoning (mentioning)

confidence: 99%

“…are constraint averages for each of the variables in . The vector = 1 , 2 , … n , n = n V , represents the Lagrange multipliers, calculated for each variable in , using (25).…”

Section: The Set Of Variables Are Represented By (mentioning)

confidence: 99%

A particle swarm optimization approach using adaptive entropy-based fitness quantification of expert knowledge for high-level, real-time cognitive robotic control

2019


High-level, real-time mission control of semi-autonomous robots, deployed in remote and dynamic environments, remains a challenge. Control models, learnt from a knowledgebase, quickly become obsolete when the environment or the knowledgebase changes. This research study introduces a cognitive reasoning process, to select the optimal action, using the most relevant knowledge from the knowledgebase, subject to observed evidence. The approach in this study introduces an adaptive entropy-based set-based particle swarm algorithm (AE-SPSO) and a novel, adaptive entropy-based fitness quantification (AEFQ) algorithm for evidence-based optimization of the knowledge. The performance of the AE-SPSO and AEFQ algorithms is experimentally evaluated with two unmanned aerial vehicle (UAV) benchmark missions: (1) relocating the UAV to a charging station and (2) collecting and delivering a package. Performance is measured by inspecting the success and completeness of the mission and the accuracy of autonomous flight control. The results show that the AE-SPSO/AEFQ approach successfully finds the optimal state-transition for each mission task and that autonomous flight control is successfully achieved.
