2019
DOI: 10.1287/opre.2018.1832

MNL-Bandit: A Dynamic Learning Approach to Assortment Selection

Abstract: We consider a dynamic assortment selection problem where in every round the retailer offers a subset (assortment) of N substitutable products to a consumer, who selects one of these products according to a multinomial logit (MNL) choice model. The retailer observes this choice, and the objective is to dynamically learn the model parameters while optimizing cumulative revenues over a selling horizon of length T. We refer to this exploration–exploitation formulation as the MNL-Bandit problem. Existing methods fo…
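
For readers unfamiliar with the MNL choice model mentioned in the abstract, the following is a minimal sketch of one round of the setting, not the paper's algorithm. It assumes each product i has a positive preference weight v_i and that the no-purchase option has weight 1 (a normalization also referred to in the citation statements); the function names, product labels, weights, and revenues are all hypothetical and chosen only for illustration.

```python
import random


def mnl_choice_probabilities(assortment, weights):
    """Return {option: probability} for an offered assortment.

    The key None stands for the no-purchase outcome. `weights` maps each
    product to a positive preference weight v_i; the no-purchase weight is
    normalized to 1 (an assumption of this sketch).
    """
    denom = 1.0 + sum(weights[i] for i in assortment)
    probs = {i: weights[i] / denom for i in assortment}
    probs[None] = 1.0 / denom
    return probs


def expected_revenue(assortment, weights, revenues):
    """Expected revenue of one round when `assortment` is offered."""
    probs = mnl_choice_probabilities(assortment, weights)
    return sum(revenues[i] * probs[i] for i in assortment)


def simulate_choice(assortment, weights, rng):
    """Draw one consumer choice (a product, or None for no purchase)."""
    probs = mnl_choice_probabilities(assortment, weights)
    options = list(probs)
    return rng.choices(options, weights=[probs[o] for o in options], k=1)[0]


# Hypothetical example data.
weights = {"A": 1.0, "B": 0.5, "C": 0.5}
revenues = {"A": 10.0, "B": 20.0, "C": 5.0}
offer = ["A", "B"]

probs = mnl_choice_probabilities(offer, weights)
revenue = expected_revenue(offer, weights, revenues)
choice = simulate_choice(offer, weights, random.Random(0))
```

In the bandit setting the weights are unknown to the retailer, who only observes draws like `choice`; the sketch uses known weights purely to show how an assortment maps to choice probabilities and expected revenue.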

Cited by 117 publications (127 citation statements)
References 26 publications
“…The first boundedness assumption on revenue parameters is standard in the literature (see e.g., Theorem 1 in Agrawal et al. (2019)). It is also worthwhile noting that assumption (A2) is weaker than the common assumption that no purchase (with V0=1) is the most frequent outcome.…”
Section: Model Specifications and Assortment Space Reductions (mentioning)
confidence: 99%
“…Instead, we directly estimate “nested‐level utility” Vi(·)γi (see Equation ) and the discussions above Equation ). UCB policy: We propose an upper confidence bound (UCB) algorithm using an epoch‐based strategy from Agrawal et al. (2019), which leads to a worst‐case expected regret of $\tilde{O}(\sqrt{MNT})$. Although the UCB has been a well‐known technique for bandit problems, adopting this high‐level idea to solve a problem with specific structures certainly requires technical innovations (e.g., how to build a confidence bound on a carefully designed parameter, see Lemma 4).…”
Section: Introduction (mentioning)
confidence: 99%
“…Also, in the field of dynamic assortment with discrete choice models. Most of the underlying choice model is usually the multinomial logit model [1,2,5]. Recently, Chen et al [12] study the dynamic assortment problem under the nested logit model.…”
Section: Literature Review (mentioning)
confidence: 99%
“…is the distance sensitivity of consumer i. Then, the multinomial logit (MNL) model [19] was introduced to represent the random selection of consumers. The probabilities for consumer i to choose "buy in store" and "buy online, in-store pickup" 0 can be respectively expressed as:…”
Section: Scene-based Demand Analysis For Css (mentioning)
confidence: 99%