2020
DOI: 10.1007/978-3-030-59747-4_38

Deep Reinforcement Learning and Optimization Approach for Multi-echelon Supply Chain with Uncertain Demands

Cited by 17 publications (15 citation statements)
References 6 publications
“…The goal is to meet uncertain customer demands in the last echelon nodes while minimizing all incurred costs (operation costs, such as production at suppliers, stock, transport, processing; and penalization costs: if demand is not met, or a stock capacity is exceeded). We have built upon our previous work (Alves and Mateus 2020) adding uncertain seasonal demands, stochastic lead times, and manufacturers' capacities. The formalization of the problem, an MDP formulation and an NLP model, have been extended to take account of changes in the problem.…”
Section: Discussion; type: mentioning
confidence: 99%
“…Formulating an MDP means defining the states, actions, rewards, and environment's dynamics (or transition function) to be used to solve the problem. The formulation presented here is an extension of our previous work (Alves and Mateus 2020) to include uncertain seasonal demands and lead times, and processing capacities.…”
Section: MDP Formulation; type: mentioning
confidence: 99%
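The four MDP ingredients named in the statement above (states, actions, rewards, transition function) can be illustrated with a minimal sketch. This is not the cited authors' formulation: it assumes a hypothetical single-node inventory problem, and the class name `ToyInventoryMDP`, all cost parameters, the uniform demand, and the order-up-to policy are illustrative assumptions only. The state is the current stock, the action is the order quantity, and the reward is the negative of the stock cost plus the penalties for unmet demand and for exceeding stock capacity.

```python
import random
from dataclasses import dataclass


@dataclass
class ToyInventoryMDP:
    """Hypothetical single-node inventory MDP (not the cited authors' model)."""
    capacity: int = 20           # maximum stock the node can hold
    max_demand: int = 10         # demand is drawn uniformly from 0..max_demand
    stock_cost: float = 1.0      # cost per unit held at the end of a period
    unmet_penalty: float = 5.0   # penalty per unit of unmet demand
    excess_penalty: float = 3.0  # penalty per unit above capacity

    def sample_demand(self, rng: random.Random) -> int:
        """Uncertain demand: the stochastic part of the environment."""
        return rng.randint(0, self.max_demand)

    def step(self, stock: int, order: int, demand: int):
        """Transition function: map (state, action, demand) to (next_state, reward)."""
        available = stock + order
        excess = max(0, available - self.capacity)
        available -= excess                      # units above capacity are discarded
        unmet = max(0, demand - available)
        next_stock = max(0, available - demand)
        cost = (self.stock_cost * next_stock
                + self.unmet_penalty * unmet
                + self.excess_penalty * excess)
        return next_stock, -cost                 # reward is the negative cost


# Roll out a simple (assumed) order-up-to-10 policy for ten periods.
mdp = ToyInventoryMDP()
rng = random.Random(0)
stock, total_reward = 5, 0.0
for _ in range(10):
    order = max(0, 10 - stock)
    stock, reward = mdp.step(stock, order, mdp.sample_demand(rng))
    total_reward += reward
```

A multi-echelon version would replace the scalar stock and order with vectors over nodes and add transport and lead-time terms; a learning agent would replace the fixed policy in the loop.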
“…However, it must be assumed that the complex structure of current SCs, especially global ones with many stages and nodes, the number of variables included in the modeled problem and its intrinsically stochastic condition imply that the modeling of real cases with the reinforcement learning methodology, but without the additional assistance of other methods, constitutes a considerable challenge. Only through the gradual incorporation of the DRL methodology [69], a combination of the reinforcement learning methodology with deep learning-another ML methodology that uses artificial neural networks to transform a set of inputs into a set of outputs, that solve tasks that involve handling complex and high-dimensional raw input data sets [91]-has it been possible to begin to consider the study of SCs with certain complexity, e.g.,: (i) the multistage SC problem of Alves and Mateus [67], validated with a four-stage SC scenario and two nodes per stage, local inventories, lead time, a single product, and demand uncertainty; (ii) the capacitated SC problem of Peng et al [68], validated with a three-stage SC scenario, one node in the first, two in the second and three in the last stage, capacitated production, independent, stochastic and seasonal demand, and a single product; (iii) the case of Meisheri et al [92] who, despite restricting the validation of their retailers' inventory replenishment to the last SC layers, i.e., warehouse and retailer, considers the existence of product variety, with instances of 100 and 220 products-to substantially increase combinatorial computation-and incorporates lead time, limited storage capacity, cross-product restrictions, and weight and volume transportation restrictions. Computational limitations in this regard are manifested as the size of the problem to be solved in terms of the size of the input dataset, and especially the size of the modeled problem's observation space.…”
Section: Discussion; type: mentioning
confidence: 99%
“…Of those dealing with planning at the tactical decision level, most focus on either inventory replenishment or, to a lesser extent, dynamic supplier selection problems. Alves and Mateus [67] consider a DRL approach based on an improved version of the proximal policy optimization algorithm (PPO), called PPO2, to solve the inventory problem of a four-step SC with two nodes per step and stochastic demands. The optimization approach for Peng et al [68] is similar, but the modeled problem considers a simpler SC composed of three stages-plant, plant warehouse and retailer-subject to independent, stochastic and seasonal demand.…”
Section: Content Analysis; type: mentioning
confidence: 99%