2020
DOI: 10.1007/s11740-020-01000-8

A deep q-learning-based optimization of the inventory control in a linear process chain

Abstract: Due to growing globalized markets and the resulting globalization of production networks across different companies, inventory and order optimization is becoming increasingly important in the context of process chains. Thus, an adaptive and continuously self-optimizing inventory control on a global level is necessary to overcome the resulting challenges. Advances in sensor and communication technology allow companies to realize a global data exchange to achieve a holistic inventory control. Based on deep q-lea…

Cited by 14 publications (4 citation statements)
References 20 publications
“…The application of reinforcement learning methods in supply chain management mainly focuses on optimizing decision-making problems, such as inventory management, logistics scheduling and pricing strategies. Mainly include Markov decision process (MDP) (Oroojlooyjadid, 2022), Q-learning and policy gradient methods (Dittrich, 2021). Markov Decision Process (MDP) is a framework for modeling supply chain management problems in which the state of the system changes over time and the decision maker selects an action based on the current state and receives a reward or cost based on the action.…”
Section: Reinforcement Learning Methods
mentioning
confidence: 99%
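The MDP framing in the statement above (observe a state, choose an action, receive a reward or cost) can be illustrated with a minimal sketch. This is plain tabular Q-learning on a toy single-item inventory problem, not the deep Q-learning method of the cited paper. The capacity, order quantities, demand range, and all cost and revenue figures are hypothetical values chosen only for illustration.

```python
import random

MAX_STOCK = 5            # hypothetical warehouse capacity (state = units in stock)
ACTIONS = [0, 1, 2, 3]   # hypothetical order quantities (the action set)


def step(stock, order, demand):
    """One MDP transition: returns (next_stock, reward).

    All prices and costs below are made-up numbers for illustration.
    Units ordered beyond MAX_STOCK are paid for but discarded.
    """
    available = min(stock + order, MAX_STOCK)
    sold = min(available, demand)
    next_stock = available - sold
    reward = (
        4.0 * sold                 # revenue per unit sold
        - 1.0 * order              # ordering cost per unit
        - 0.5 * next_stock         # holding cost per unit left over
        - 2.0 * (demand - sold)    # penalty per unit of unmet demand
    )
    return next_stock, reward


def train(episodes=2000, horizon=20, alpha=0.1, gamma=0.95, epsilon=0.1, seed=0):
    """Tabular Q-learning with an epsilon-greedy policy."""
    rng = random.Random(seed)
    # q[state][action_index] holds the current action-value estimate
    q = [[0.0 for _ in ACTIONS] for _ in range(MAX_STOCK + 1)]
    for _ in range(episodes):
        stock = rng.randint(0, MAX_STOCK)
        for _ in range(horizon):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))          # explore
            else:
                a = max(range(len(ACTIONS)), key=lambda i: q[stock][i])  # exploit
            demand = rng.randint(0, 3)                   # assumed uniform demand
            nxt, r = step(stock, ACTIONS[a], demand)
            # Q-learning update toward the one-step bootstrapped target
            q[stock][a] += alpha * (r + gamma * max(q[nxt]) - q[stock][a])
            stock = nxt
    return q


if __name__ == "__main__":
    q_table = train()
    for s, row in enumerate(q_table):
        best = ACTIONS[max(range(len(ACTIONS)), key=lambda i: row[i])]
        print(f"stock={s}: greedy order quantity={best}")
```

A deep Q-learning variant would replace the table `q` with a neural network that maps the state to action values; the update target keeps the same form.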
“…One of the key challenges in this field is the implementation of Due to the generalization and applicability of reinforcement learning algorithms, they are increasingly being applied to inventory management problems. Examples include Deep Q-Network (DQN) [50], QMIX [51], QTRAN [52], IPPO and MAPPO [53], and CD-PPO [54]. These RL algorithms demonstrate promising capabilities for addressing inventory management challenges and may provide performance improvement and better adaptability.…”
Section: Introduction
mentioning
confidence: 99%
“…A deep Q-learning approach was developed by [Dittrich and Fohlmeister 2021], who used it as the basis of a method for self-optimizing inventory control. In this method, the decision process is based on an artificial neural network.…”
unclassified