Stochastic parallel machine scheduling using reinforcement learning

Julaiti, Juxihong; Park, Kyu Tae; Das, Dyutimoy; Kumara, Soundar

doi:10.1002/amp2.10119

Cited by 7 publications

(1 citation statement)

References 37 publications

“…More recently, deep Q-network-based schedulers were suggested for semiconductor manufacturing applications [21,22]. A deep deterministic policy gradient (DDPG)-based scheduler was also proposed to minimize weighted tardiness in the stochastic parallel machine scheduling problem [23]. Unlike [19,20], these studies developed multi-agent approaches where each agent considers the allocation of a job on a machine, and they successfully improved performances by reducing the learning complexity.…”

Section: Introduction (mentioning)

confidence: 99%

Deep Reinforcement Learning-Based Scheduler on Parallel Dedicated Machine Scheduling Problem towards Minimizing Total Tardiness

et al. 2023

This study considers a parallel dedicated machine scheduling problem with the objective of minimizing the total tardiness of jobs allocated to machines. The problem is NP-hard. Unlike classical parallel machine scheduling, each job is processed by exactly one of the machines dedicated to its predefined job type, and a machine can process at most one job at a time. To obtain a high-quality schedule in terms of total tardiness, we propose a machine scheduler based on double deep Q-learning. In the training phase, the scheduling problem is reformulated to fit the reinforcement learning framework, with state, action, and reward defined to capture the occurrence of setups, tardiness, and the statuses of allocated job types. The proposed scheduler repeatedly refines its Q-values towards minimizing the tardiness of allocated jobs by updating the weights of a neural network. The scheduling performance of the proposed scheduler is then evaluated against conventional schedulers. The results show that the proposed scheduler outperforms them. In particular, on two datasets representing extra-large scheduling problems, our model outperforms an existing genetic algorithm by 12.32% and 29.69%.
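The two quantities named in the abstract, total tardiness and the double Q-learning target, can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the `(processing_time, due_date)` job layout, the discount factor, and the use of plain lists in place of neural network outputs are all assumptions made for illustration, and setup times are not modelled.

```python
from typing import Sequence, Tuple


def total_tardiness(jobs: Sequence[Tuple[float, float]]) -> float:
    """Total tardiness of jobs run back to back on one machine, in the given order.

    Each job is a hypothetical (processing_time, due_date) pair.
    Tardiness of a job is max(0, completion_time - due_date).
    """
    clock = 0.0
    total = 0.0
    for processing_time, due_date in jobs:
        clock += processing_time              # completion time of this job
        total += max(0.0, clock - due_date)   # early jobs contribute nothing
    return total


def double_q_target(
    reward: float,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
    gamma: float = 0.99,   # assumed discount factor
    done: bool = False,
) -> float:
    """Double Q-learning target for one transition.

    The online estimates select the next action; the target estimates
    evaluate it. Both sequences stand in for network outputs here.
    """
    if done:
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    schedule = [(3.0, 2.0), (2.0, 6.0), (4.0, 8.0)]
    print(total_tardiness(schedule))
    # A reward of negative tardiness is one possible choice, assumed here.
    print(double_q_target(-1.0, [0.2, 0.9, 0.5], [1.0, 2.0, 3.0], gamma=0.5))
```

Selecting the action with one set of estimates and evaluating it with another is what distinguishes double Q-learning from the standard target, which takes the maximum over a single set of estimates.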
