2021
DOI: 10.1002/oca.2781

Path planning of mobile robot in unknown dynamic continuous environment using reward‐modified deep Q‐network

Abstract: The path planning problem of a mobile robot in an unknown dynamic environment (UDE) is discussed in this article by building a continuous dynamic simulation environment. To achieve a collision-free path in the UDE, reinforcement learning with a deep Q-network (DQN) is applied so that the mobile robot learns optimal decisions. A reward function is designed with a weight that balances obstacle avoidance against approach to the goal. Moreover, it is found that the relative motion between moving obstacles and robots …
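The abstract's reward design, a weighted sum balancing goal approach against obstacle avoidance, with thresholds to correct rewards distorted by the relative motion of moving obstacles, can be sketched as follows. All weights, thresholds, distances, and function names here are illustrative assumptions; the excerpt does not give the paper's actual values.

```python
# Hypothetical constants; the paper's actual weights and thresholds
# are not given in the excerpt above.
W_GOAL = 0.6               # weight on progress toward the goal
W_OBSTACLE = 0.4           # weight on keeping clear of obstacles
R_MIN, R_MAX = -1.0, 1.0   # two reward thresholds used to clip anomalies

def shaped_reward(d_goal_prev, d_goal, d_obs, safe_dist=0.5):
    """Weighted reward balancing goal approach against obstacle avoidance.

    A positive term rewards reducing the distance to the goal; a negative
    term penalizes being inside the obstacle safety radius.
    """
    goal_term = d_goal_prev - d_goal             # > 0 when the robot moved closer
    obstacle_term = min(0.0, d_obs - safe_dist)  # <= 0 inside the safety radius
    r = W_GOAL * goal_term + W_OBSTACLE * obstacle_term
    # Relative motion of moving obstacles can produce abnormally large
    # rewards; clamp them to the two thresholds, as the abstract describes.
    return max(R_MIN, min(R_MAX, r))
```

A step that closes 0.5 m toward the goal while staying clear of obstacles yields a small positive reward, while an abnormally large raw reward is clipped to the upper threshold.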

Cited by 16 publications (14 citation statements)
References 21 publications
“…[20] for robot navigation, and the RMDDQN model proposed by Huang et al. [21] for path planning in unknown dynamic environments.…”
Section: Introduction (mentioning, confidence: 99%)
“…The evaluation function of DQN was improved by a correction function to increase the evaluation accuracy of the algorithm's value function [20]. Deep reinforcement learning was applied to path planning of mobile robots in unknown dynamic environments [21]: targeting the problem of mutual collision triggered by abnormal rewards arising from the relative motion of obstacles and robots, two reward thresholds were set to modify the abnormal rewards, thus realizing optimal decision learning. The experience replay mechanism was combined with Q-learning [22] and the soft actor-critic (SAC) algorithm [23], respectively, improving sample utilization, learning efficiency, and convergence speed in three-dimensional (3D) path planning of UAVs and path planning of multi-arm robots.…”
Section: Introduction (mentioning, confidence: 99%)
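The experience replay mechanism mentioned in the excerpt above, combined with Q-learning and SAC in the cited works, can be sketched minimally as follows. This is an illustrative buffer under standard assumptions, not the cited papers' actual implementation.

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal experience replay buffer (an illustrative sketch)."""

    def __init__(self, capacity=10000):
        # Bounded deque: once full, the oldest transitions are evicted.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniform sampling breaks the temporal correlation of consecutive
        # transitions, which is what improves sample utilization.
        return random.sample(self.buffer, batch_size)

    def __len__(self):
        return len(self.buffer)
```

During training, the agent pushes each transition after an environment step and periodically samples a mini-batch to update the value network, reusing each stored transition many times.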
“…A mobile wireless power transfer robot was able to determine the optimal path with the proposed method for charging a large number of IoT devices. Huang et al. [27] proposed a method that determines two reward thresholds to solve the anomalous reward problem encountered in the path planning of a mobile robot in an unknown dynamic environment. The improvement in value-based DRL algorithms was experimentally demonstrated with the proposed method.…”
Section: Introduction (mentioning, confidence: 99%)