2021
DOI: 10.48550/arxiv.2110.02690
Preprint

Tuning Confidence Bound for Stochastic Bandits with Bandit Distance

Xinyu Zhang,
Srinjoy Das,
Ken Kreutz-Delgado

Abstract: We propose a novel modification of the standard upper confidence bound (UCB) method for the stochastic multi-armed bandit (MAB) problem that tunes the confidence bound of a given bandit based on its distance to the others. Our UCB distance tuning (UCB-DT) formulation improves performance, as measured by expected regret, by preventing the MAB algorithm from focusing on non-optimal bandits, a well-known deficiency of standard UCB. "Distance tuning" of the standard UCB is done using a proposed distance…
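For context, the standard UCB baseline that the abstract refers to can be sketched as below. This is a minimal illustration of the textbook UCB1 rule on Bernoulli arms, not the authors' UCB-DT method: the distance measure is truncated in the abstract above and is not reproduced here. The function name `ucb1`, the Bernoulli reward model, and the exploration constant 2 are assumptions made for illustration.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run textbook UCB1 on Bernoulli arms (illustrative baseline only).

    arm_means: true success probabilities, unknown to the learner and
               used here only to simulate rewards and measure regret.
    Returns (pull counts per arm, cumulative expected regret).
    """
    rng = random.Random(seed)
    k = len(arm_means)
    counts = [0] * k      # number of pulls per arm
    sums = [0.0] * k      # total observed reward per arm
    best = max(arm_means)
    regret = 0.0

    for t in range(1, horizon + 1):
        if t <= k:
            # Pull every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Index = empirical mean + confidence radius sqrt(2 ln t / n_i).
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        # Expected regret grows by the gap of the arm that was pulled.
        regret += best - arm_means[arm]

    return counts, regret


if __name__ == "__main__":
    counts, regret = ucb1([0.1, 0.9], horizon=2000)
    print("pulls per arm:", counts)
    print("cumulative expected regret:", regret)
```

The confidence radius shrinks only as an arm is pulled, so a clearly inferior arm keeps being revisited at a logarithmic rate. According to the abstract, UCB-DT adjusts this confidence bound using a distance between bandits to reduce that wasted exploration.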

Citations: cited by 0 publications
References: 5 publications