2020
DOI: 10.48550/arxiv.2008.13629
Preprint

Statistically Robust, Risk-Averse Best Arm Identification in Multi-Armed Bandits

Abstract: Traditional multi-armed bandit (MAB) formulations usually make certain assumptions about the underlying arms' distributions, such as bounds on the support or their tail behaviour. Moreover, such parametric information is usually 'baked' into the algorithms. In this paper, we show that specialized algorithms that exploit such parametric information are prone to inconsistent learning performance when the parameter is misspecified. Our key contributions are twofold: (i) We establish fundamental performance limits…

Citations: cited by 1 publication (1 citation statement)
References: 16 publications (34 reference statements)
“…Upper confidence bound algorithms in this context are studied by Maillard (2013), Cassel et al (2018), Khajonchotpanya et al (2021). Alternative arm selection approaches in the context of risk-averse bandits include the max-min approach discussed in Galichet et al (2013), the successive rejects relying on concentration bound guarantees of Kolla et al (2019a), robust estimation-based algorithms in Kagrecha et al (2020), or Thompson Sampling approaches in Chang et al (2020) and Baudry et al (2021).…”
Section: Introduction (mentioning)
confidence: 99%