2021 International Conference on Machine Learning and Intelligent Systems Engineering (MLISE)
DOI: 10.1109/mlise54096.2021.00049
Autonomous decision-making method of transportation process for flexible job shop scheduling problem based on reinforcement learning

Cited by 10 publications (2 citation statements)
References 13 publications
“…Aside from this application, RL was used to solve the routing problem in a bidirectional transport network for the purpose of avoiding deadlocks and obtaining collision-free trajectories [26]. The deep Q network (DQN) was used in [27] to learn a transportation strategy with breakpoint continuation and hierarchical feedback, which can calculate and further modify a transportation schedule in a short time to accommodate dynamic factors. That aside, the authors of [28] tried to teach a neural network to allocate transportation duties to AGVs and design routes for them in accordance with the rewards computed by the network.…”
Section: Literature Review
confidence: 99%
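The excerpt above notes that the cited paper trains a deep Q network (DQN) to produce and then revise a transportation schedule under dynamic conditions. The paper's actual state encoding, reward design, and its breakpoint-continuation and hierarchical-feedback mechanisms are not reproduced here; the following is only a minimal DQN sketch in which the state dimension, the number of candidate transport tasks, and the replay loop are illustrative assumptions.

```python
# Minimal DQN sketch for picking which transport task an idle AGV should take next.
# STATE_DIM, NUM_TASKS, and the replay setup are assumed for illustration only;
# they are not the formulation used in the cited paper.
import random
from collections import deque

import torch
import torch.nn as nn

STATE_DIM = 8    # toy state encoding: AGV position, load state, queue lengths, ...
NUM_TASKS = 4    # actions: candidate transport tasks the AGV may accept
GAMMA = 0.95

class QNet(nn.Module):
    """Small MLP mapping a shop-floor state vector to one Q-value per candidate task."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, NUM_TASKS),
        )

    def forward(self, x):
        return self.net(x)

policy, target = QNet(), QNet()
target.load_state_dict(policy.state_dict())
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
buffer = deque(maxlen=10_000)   # replay buffer of (s, a, r, s_next, done) tuples

def act(state, eps=0.1):
    """Epsilon-greedy choice of the next transport task for an idle AGV."""
    if random.random() < eps:
        return random.randrange(NUM_TASKS)
    with torch.no_grad():
        return int(policy(state).argmax())

def train_step(batch_size=32):
    """One DQN update: regress Q(s, a) toward r + gamma * max_a' Q_target(s', a')."""
    if len(buffer) < batch_size:
        return
    batch = random.sample(buffer, batch_size)
    s      = torch.stack([b[0] for b in batch])
    a      = torch.tensor([b[1] for b in batch])
    r      = torch.tensor([b[2] for b in batch], dtype=torch.float32)
    s_next = torch.stack([b[3] for b in batch])
    done   = torch.tensor([b[4] for b in batch], dtype=torch.float32)

    q = policy(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        q_next = target(s_next).max(1).values
    loss = nn.functional.mse_loss(q, r + GAMMA * q_next * (1.0 - done))

    opt.zero_grad()
    loss.backward()
    opt.step()
```

A full implementation would also define the shop-floor environment, push (state, action, reward, next_state, done) transitions into the buffer after each AGV decision, and periodically copy the policy network's weights into the target network.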
“…The agent learns via experience and aims to maximize the rewards it receives in certain scenarios. The primary goal of RL is to optimize the cumulative reward obtained by an agent through the evaluation and selection of actions within a dynamic environment [14-20]. The most recent development in artificial intelligence technology has allowed successful application…”
Section: Introduction
confidence: 99%
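The excerpt above states RL's objective in words: maximize the cumulative (discounted) reward the agent collects by evaluating and selecting actions over time. A minimal tabular Q-learning sketch of that idea follows; the states, actions, and constants are purely illustrative and do not correspond to any of the cited methods.

```python
# Tabular Q-learning: a minimal illustration of maximizing cumulative discounted reward.
# States, actions, and the constants below are hypothetical.
import random
from collections import defaultdict

ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # learning rate, discount factor, exploration rate
Q = defaultdict(float)              # Q[(state, action)] -> estimated return

def choose(state, actions):
    """Epsilon-greedy selection over the currently available actions."""
    if random.random() < EPS:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state, next_actions):
    """One-step Q-learning: move Q(s, a) toward reward + gamma * best future value."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```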