2020
DOI: 10.48550/arxiv.2006.03378
Preprint

Adaptation to the Range in $K$-Armed Bandits

Abstract: We consider stochastic bandit problems with K arms, each associated with a bounded distribution supported on the range [m, M]. We do not assume that the range [m, M] is known and show that there is a cost for learning this range. Indeed, a new trade-off between distribution-dependent and distribution-free regret bounds arises, which, for instance, prevents simultaneously achieving the typical ln T and √T bounds. For instance, a √T distribution-free regret bound may only be achieved if the distribution-…

Citations: cited by 1 publication (1 citation statement)
References: 19 publications (58 reference statements)
“…Adaptive algorithms: There is a rich literature in deriving algorithms adaptive to the loss sequences, for either full information setting (Luo & Schapire, 2015; Orabona & Pal, 2016), stochastic bandits (Garivier & Cappé, 2011; Lattimore, 2015) or adversarial bandits (Wei & Luo, 2018; Bubeck et al, 2019). There are also many algorithms that is adaptive to the loss range, so-called 'scale-free' algorithms (De Rooij et al, 2014; Orabona & Pál, 2018; Hadiji & Stoltz, 2020). However, as mentioned above, to our knowledge, our work is the first to adapt to heavy-tail parameters.…”
Section: Algorithm (mentioning)
confidence: 97%