2022
DOI: 10.1609/aaai.v36i11.21523

Bayesian Model-Based Offline Reinforcement Learning for Product Allocation

Abstract: Product allocation in retail is the process of placing products throughout a store to connect consumers with relevant products. Discovering a good allocation strategy is challenging due to the scarcity of data and the high cost of experimentation in the physical world. Some work explores reinforcement learning (RL) as a solution, but these approaches are often limited because of the sim2real problem. Learning policies from logged trajectories of a system is a key step forward for RL in physical systems. Recent…

Cited by 3 publications (1 citation statement)
References 21 publications
“…Off-policy evaluation (OPE) allows one to estimate the goodness of a policy (often referred to as target/candidate policy) using data collected from another, possibly unrelated policy (referred to as behavior policy). Such evaluation is important because testing and implementing a policy in the real world can be costly in areas like trading (Liu et al 2020) and physical retail (Jenkins et al 2022, 2020), even vital in situations like healthcare (Liao et al 2020) and transportation (Du et al 2023; Vlachogiannis et al 2023; Li et al 2023).…”
Section: Introduction (mentioning)
confidence: 99%