2022
DOI: 10.3390/pr10040760
Deep Reinforcement Learning for Dynamic Flexible Job Shop Scheduling with Random Job Arrival

Abstract: The production process of a smart factory is complex and dynamic. As the core of manufacturing management, research into the flexible job shop scheduling problem (FJSP) focuses on optimizing scheduling decisions in real time according to changes in the production environment. In this paper, deep reinforcement learning (DRL) is proposed to solve the dynamic FJSP (DFJSP) with random job arrival, with the goal of minimizing penalties for earliness and tardiness. A double deep Q-networks (DDQN) architecture…
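For context, the earliness/tardiness objective named in the abstract is conventionally written as a weighted sum over jobs; this is a generic formulation assumed here, not necessarily the paper's exact notation or weights:

\min \; \sum_{i=1}^{n} \big( \alpha_i \max(0,\, D_i - C_i) + \beta_i \max(0,\, C_i - D_i) \big)

where C_i is the completion time of job J_i, D_i its due date, and \alpha_i, \beta_i the earliness and tardiness penalty weights.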

Cited by 71 publications (31 citation statements)
References 31 publications
“…A mathematical model of the MODFJSP with random job arrival in a smart machine tool processing workshop was established. All notations used to describe the problem are summarized in Table 2, in line with the literature [18]. For each job J_i, its due date D_i can be calculated as…”
Section: Mathematical Model
confidence: 99%
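The due-date formula itself is truncated in the excerpt. For orientation only, a common assignment in the dynamic scheduling literature is the total-work-content (TWK) rule; this is an assumption about the genre, not necessarily the cited model's definition:

D_i = A_i + k \sum_{j=1}^{n_i} \bar{p}_{i,j}

where A_i is the arrival time of job J_i, \bar{p}_{i,j} the mean processing time of its j-th operation over its eligible machines, and k a due-date tightness factor.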
“…In addition, the differences among action values for s_t are often very small relative to the magnitude of Q. For example, after training with the DDQN in [18], the average gap between the Q values of the best and the second-best action across visited states is roughly 0.06, whereas the average action value across those states is about 17. Action values are therefore frequently reordered, the actions chosen by the behavior strategy change accordingly, and this introduces small amounts of noise into the updates.…”
Section: Proposed HRL, 4.1 Background of DDQNs and DDDQNs
confidence: 99%
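To make the quoted observation concrete, here is a minimal numpy sketch of the double-DQN target computation and of how Q-value gaps around 0.06 against magnitudes near 17 let tiny estimation noise reorder the greedy action. The values and action count are illustrative only; the state features, reward, and networks of [18] are not reproduced.

# Minimal double-DQN target sketch with small action-value gaps (illustrative).
import numpy as np

rng = np.random.default_rng(0)
n_actions = 6    # e.g. one action per dispatching rule (assumption)
gamma = 0.95

# Hypothetical action values for one state: magnitude ~17, gaps ~0.06.
q_online = 17.0 + 0.06 * rng.standard_normal(n_actions)
q_target = q_online + 0.05 * rng.standard_normal(n_actions)  # slightly stale copy

# Double DQN: the online network selects the action,
# the target network evaluates it.
best_a = int(np.argmax(q_online))
reward = 1.0
y = reward + gamma * q_target[best_a]

# One update's worth of estimation noise can flip the argmax entirely.
q_perturbed = q_online + 0.06 * rng.standard_normal(n_actions)
print("greedy before:", best_a, "greedy after:", int(np.argmax(q_perturbed)))

Because the gap between the best and second-best action is two orders of magnitude smaller than the values themselves, even tiny perturbations flip the argmax, which is exactly the update noise the excerpt describes.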