2020
DOI: 10.48550/arxiv.2004.13465
Preprint
Nearly Optimal Regret for Stochastic Linear Bandits with Heavy-Tailed Payoffs

Cited by 2 publications (9 citation statements)
References 11 publications
“…Other variations with heavy-tailed losses are also studied in the literature, e.g., linear bandits (Medina & Yang, 2016; Xue et al., 2020), contextual bandits (Shao et al., 2018) and Lipschitz bandits (Lu et al., 2019). However, none of the above algorithms removes the dependency on α.…”
Section: Algorithm
confidence: 99%
“…The previous robust mean estimators (Bubeck et al., 2013), such as the truncated empirical mean and median of means (Bubeck et al., 2013; Medina and Yang, 2016; Shao et al., 2018; Xue et al., 2020), which require estimating the mean, cannot effectively handle this super heavy-tailed noise. On the other hand, since the mean does not exist, we are also required to measure the performance of the agent with high-probability pseudo-regret (defined in Section 2.2).…”
Section: +
confidence: 99%
“…A line of recent work (Bubeck et al., 2013; Medina and Yang, 2016; Shao et al., 2018; Xue et al., 2020) on heavy-tailed MAB or linear bandits uses the truncated empirical mean and median of means as the robust estimators. However, without assuming finite moments of order 1 + ε for some ε ∈ (0, 1], it remains unclear whether one can attain equivalent regret/sample complexity for the more heavy-tailed setting, e.g., where the noise of the payoff follows a Student's t-distribution, or whether sublinear-regret algorithms of any form are even possible at all.…”
Section: Related Work
confidence: 99%
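The citation statements above repeatedly refer to the median-of-means estimator as the standard robust tool for heavy-tailed payoffs. As a minimal illustrative sketch (the function name and block-splitting choice are ours, not from the cited papers): split the samples into k blocks, average each block, and report the median of the block means, which concentrates around the true mean even when only low-order moments are finite.

```python
import numpy as np

def median_of_means(samples, k):
    """Median-of-means estimator (illustrative sketch).

    Splits the samples into k roughly equal blocks, averages each
    block, and returns the median of the block means. Unlike the
    plain empirical mean, this estimator is robust to heavy-tailed
    noise, since a few extreme samples corrupt only a few blocks.
    """
    samples = np.asarray(samples, dtype=float)
    blocks = np.array_split(samples, k)      # k contiguous blocks
    block_means = [b.mean() for b in blocks]  # one mean per block
    return float(np.median(block_means))      # median is robust

# Example: 10 samples, 5 blocks of 2 samples each.
# Block means are 0.5, 2.5, 4.5, 6.5, 8.5; their median is 4.5.
print(median_of_means(list(range(10)), k=5))  # → 4.5
```

The number of blocks k trades bias against robustness: more blocks tolerate more outliers but make each block mean noisier; the cited analyses typically tie k to the confidence level.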