Reinforcement Learning for Logistics and Supply Chain Management: Methodologies, State of the Art, and Future Opportunities

Yan, Yimo; Chow, Ahf; Ho, Chin Pang; Kuo, Yong-Hong; Wu, Qihao; Ying, Chengshuo

doi:10.2139/ssrn.3935816

Cited by 3 publications

(1 citation statement)

References 101 publications

“…Some of them were based on the fundamental principles of Markov Decision Processes (MDP) and their use in Supply Chain Management (Giannoccaro and Pontrandolfo, 2002); others directly followed early RL formulations such as Q-learning for Business Process Management (Huang, van der Aalst, Lu, and Duan, 2011). Numerous publications describe attempts to apply different RL tools for constrained task scheduling and packing problems (Jędrzejowicz and Ratajczak-Ropel, 2013; Mao, Alizadeh, Menache, and Kandula, 2016) and logistics (Yan et al, 2021; Yuan, Li, and Ji, 2021).…”

Section: Previous Work On Reinforcement Learning Applications (mentioning)

confidence: 99%

Utilization of Deep Reinforcement Learning for Discrete Resource Allocation Problem in Project Management – a Simulation Experiment

Wójcik

2022

Informatyka Ekonomiczna

This paper tests the applicability of deep reinforcement learning (DRL) algorithms to simulated problems of constrained discrete and online resource allocation in project management. DRL has been researched extensively in various domains, although no similar case study was found at the time of writing. The hypothesis was that a carefully tuned RL agent could outperform an optimisation-based solution. Three RL agents (VPG, AC, and PPO) were compared against a classic constrained optimisation algorithm in three trials: "easy", "moderate", and "hard" (70%, 50%, and 30% average project success rate, respectively). Each trial consisted of 500 independent, stochastic simulations. The significance of the differences was checked using a Welch ANOVA at a significance level of alpha = 0.01, followed by post hoc comparisons for false-discovery control. The experiment revealed that the PPO agent performed significantly better than the optimisation approach and the other RL methods in the moderate and hard simulations.

Enhancing Robotic Autonomy and Deep Reinforcement Learning Applications

Goel, Singla

2024

Robo-Advisors in Management

The integration of Deep Reinforcement Learning (DRL) into the realm of robotics and autonomous systems has emerged as a groundbreaking paradigm shift, empowering machines to tackle intricate tasks through interaction with their environments. This chapter offers a comprehensive examination of the current research landscape at the intersection of DRL and robotics within this dynamic field. This chapter navigates through the conceptualization of DRL and explores its diverse applications in controlling robotics and object manipulation. The chapter showcases the autonomy and adaptability enabled by DRL while addressing prevalent challenges such as sample efficiency, safety concerns, and scalability. In conclusion, this chapter serves as a valuable resource for future researchers and practitioners intrigued by the intersection of DRL and robotics. It synthesizes current knowledge, underscores significant progress made, and maps out exciting avenues for further exploration, ultimately propelling the advancement of robotic systems in the era of machine learning and artificial intelligence.

Solving an Order Batching and Sequencing Problem with Reinforcement Learning

Canaslan, Gülcü

2024

International Journal of Advances in Engineering and Pure Sciences

The purpose of this research is to determine whether a DRL approach is suitable for the order batching and sequencing problem (OBSP) and to compare it with traditional methods. To this end, models trained with the PPO algorithm were tested in a complex and realistic warehouse environment to measure whether they developed a strategy that reduces the number of late orders. A heuristic method was also applied, and the results were compared in the same environment and on the same data. The results showed that the DRL approach, which combines heuristics with the PPO algorithm, performs better than the heuristics in minimizing the percentage of tardy orders in all tested scenarios.
