2020
DOI: 10.48550/arxiv.2011.09998
Preprint

Fully Gap-Dependent Bounds for Multinomial Logit Bandit

Jiaqi Yang

Abstract: We study the multinomial logit (MNL) bandit problem, where at each time step, the seller offers an assortment of size at most K from a pool of N items, and the buyer purchases an item from the assortment according to an MNL choice model. The objective is to learn the model parameters and maximize the expected revenue. We present (i) an algorithm that identifies the optimal assortment $S^*$ within $O(\sum_{i=1}^{N} \Delta_i^{-2})$ time steps with high probability, and (ii) an algorithm that incurs $O(\sum_{i \notin S^*} K \Delta_i^{-1} \log T)$ regr…
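The abstract does not spell out the choice model, so the following is only a minimal sketch of the setting under the commonly used MNL formulation: item i in the offered assortment S is purchased with probability v_i / (1 + sum of v_j over S), with the no-purchase option normalized to weight 1. The preference weights `v`, the per-item revenues `r`, and all function names are assumptions introduced for illustration. The brute-force search shows what "optimal assortment" means when the parameters are known; it is not the paper's algorithm, which has to learn the parameters from purchases.

```python
from itertools import combinations


def choice_probabilities(assortment, v):
    """Return MNL purchase probabilities for the offered items.

    v[i] is an assumed positive preference weight for item i; the
    no-purchase option is normalized to weight 1.
    Returns (dict item -> purchase probability, no-purchase probability).
    """
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    return probs, 1.0 / denom


def expected_revenue(assortment, v, r):
    """Expected revenue of one time step when `assortment` is offered."""
    probs, _ = choice_probabilities(assortment, v)
    return sum(r[i] * p for i, p in probs.items())


def best_assortment(v, r, K):
    """Brute-force search over all non-empty assortments of size at most K.

    Only feasible for small N; used here to define the benchmark S*.
    """
    items = range(len(v))
    candidates = (S for k in range(1, K + 1) for S in combinations(items, k))
    return max(candidates, key=lambda S: expected_revenue(S, v, r))


if __name__ == "__main__":
    v = [1.0, 1.0, 2.0]   # hypothetical preference weights
    r = [1.0, 0.8, 0.2]   # hypothetical per-item revenues
    S_star = best_assortment(v, r, K=2)
    print(S_star, expected_revenue(S_star, v, r))
```

In a bandit setting the weights `v` are unknown, and the gaps Δ_i in the bounds above measure how hard it is to tell each item's role apart from the optimal assortment's.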

Cited by 0 publications
References 12 publications