2022
DOI: 10.1088/1742-6596/2320/1/012002
The Determination of Reward Function in AGV Motion Control Based on DQN

Abstract: Motion control is a very important part of the AGV (Automated Guided Vehicle) field. A good motion control method makes the movement of an AGV more stable. Reinforcement learning network models are one approach to solving the AGV motion control problem. This paper introduces the Markov Decision Process and the role of the reward function. Besides, it studies and analyzes several classic reinforcement learning cases. DQN (Deep Q-Learning Network), which belongs to the family of deep reinforcement learning network mod…

Cited by 3 publications (2 citation statements) | References 12 publications
“…The robot can be trained by interacting with its environment and receiving feedback in the form of rewards or penalties based on the actions it takes. For example, the robot may receive a reward for successfully navigating to a particular location, while receiving a penalty for colliding with an obstacle [33][34][35][36][37]. Over time, the robot can use the feedback it receives to learn the optimal path to take in different environments.…”
Section: Related Work
confidence: 99%
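The reward-and-penalty scheme quoted above can be sketched as a small reward function. This is a minimal illustration, not the cited paper's actual formulation; the function name, thresholds, and reward magnitudes are all illustrative assumptions.

```python
import math

# Hypothetical AGV reward function sketching the scheme described above:
# a positive reward for reaching the goal, a penalty for collision, and a
# small shaping term for progress toward the target. All constants are
# illustrative, not taken from the cited paper.
def reward(position, goal, collided, prev_distance):
    if collided:
        return -10.0                      # penalty for hitting an obstacle
    distance = math.dist(position, goal)
    if distance < 0.1:
        return 10.0                       # reward for reaching the target
    return prev_distance - distance       # shaping: progress toward the goal
```

The shaping term keeps the reward signal dense, so the agent receives feedback on every step rather than only at the goal or on collision.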
“…All the parameters of the training network are assigned to the target network after training for a fixed number of steps. The algorithm uses an experience replay unit to reduce the correlation between training samples and to mitigate the instability of the action-value function approximated by the neural network [19]. A batch of samples is uniformly drawn from the experience library and mixed with the training samples to break the correlation between adjacent training samples and improve sample utilization during each training step.…”
Section: Introduction
confidence: 99%
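The two DQN mechanisms this statement describes, uniform experience replay and periodic target-network synchronization, can be sketched as follows. All names, the buffer capacity, and the sync interval are illustrative assumptions, not details from the cited paper.

```python
import random
from collections import deque

# Experience replay: a bounded buffer of (state, action, reward, next_state)
# transitions. Uniform sampling mixes old and new experience, which breaks
# the correlation between adjacent transitions mentioned above.
class ReplayBuffer:
    def __init__(self, capacity=10000):
        self.buffer = deque(maxlen=capacity)  # oldest entries drop out first

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)  # uniform, no replacement

# Target network update: after a fixed number of training steps, copy the
# training network's parameters into the target network (parameters are
# modeled here as plain dicts for illustration).
SYNC_EVERY = 100

def maybe_sync(step, train_params, target_params):
    if step % SYNC_EVERY == 0:
        target_params.update(train_params)  # assign all training parameters
    return target_params
```

Freezing the target network between syncs keeps the regression target stable, which is the instability mitigation the quoted passage refers to.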