2022
DOI: 10.1109/tac.2021.3077454
Multiplayer Bandits: A Trekking Approach

Abstract: We study stochastic multi-armed bandits with many players. The players do not know the number of players, cannot communicate with each other, and if multiple players select a common arm they collide and none of them receives any reward. We consider the static scenario, where the number of players remains fixed, and the dynamic scenario, where players may enter and leave at any time. We provide algorithms based on a novel 'trekking approach' that guarantees constant regret for the static case and sub-linear regret…

Cited by 11 publications (7 citation statements) · References 23 publications
“…Our Trekking for Static Network (TSN) algorithm for the static case and its variant Trekking for Dynamic Network (TDN) in [4] overcome the above drawbacks. In contrast to existing algorithms, which separate the estimation and orthogonalization tasks, we show that users can settle on the top N channels without knowing N. Specifically, TSN and TDN are based on a novel trekking approach in which a user operating on a channel always looks to move to a vacant channel of better quality.…”
Section: Homogeneous Network: Static and Dynamic
confidence: 99%
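The trekking rule quoted above — each user keeps moving to a vacant channel of better quality until no improvement is possible — can be illustrated with a toy sketch. This is not the TSN/TDN algorithm itself: channel qualities are assumed known here for clarity, whereas TSN and TDN must estimate them online from observations and collisions. The function and variable names are hypothetical, chosen only for this illustration.

```python
def trek(qualities, start_positions):
    """Toy illustration of the trekking rule.

    Each user repeatedly moves to the best *vacant* channel that is
    strictly better than its current one, until no user can improve.
    With known qualities, the users end up on the top-N channels,
    where N is the (unknown to the users) number of users.
    """
    positions = list(start_positions)      # positions[u] = channel of user u
    occupied = set(positions)
    improved = True
    while improved:
        improved = False
        for u, ch in enumerate(positions):
            # vacant channels strictly better than the user's current channel
            better = [c for c in range(len(qualities))
                      if c not in occupied and qualities[c] > qualities[ch]]
            if better:
                target = max(better, key=lambda c: qualities[c])
                occupied.remove(ch)
                occupied.add(target)
                positions[u] = target
                improved = True
    return positions
```

For example, with channel qualities `[0.2, 0.9, 0.5, 0.7]` and two users starting on channels 0 and 2, the users trek to channels 1 and 3 — the top two channels — without either user ever being told N = 2.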
“…The epoch-free TDN algorithm avoids such regret and hence significantly outperforms the others. Additional simulation results covering various scenarios for static and dynamic networks are given in [4].…”
Section: Homogeneous Network: Static and Dynamic
confidence: 99%
“…Related Work: As a well-known model to capture the exploration-exploitation dilemma in decision-making, multiarmed bandits have attracted extensive attention (see [13] for a survey). A variety of generalized bandit problems have been investigated, where the situations with non-stationary reward functions [14], restless arms [15], satisficing reward objectives [16], heavy-tailed reward distributions [17], risk-averse decision-makers [18], and multiple players [19] are considered. Distributed algorithms have also been proposed to tackle bandit problems (e.g., see [20]- [27]).…”
Section: Introduction
confidence: 99%
“…The Upper Confidence Bound (UCB) algorithm and its variants have proven their strength in tackling multi-armed bandit problems (e.g., see [5], [6]). Various generalizations of the classical bandit problem have been studied, in which nonstationary reward functions [7], [8], restless arms [9], satisficing reward objectives [10], risk-averse decision-makers [11], heavy-tailed reward distributions [12], and multiple players [13] are considered. Recently, increasing attention has been also paid to tackling bandit problems in a distributed fashion (e.g., see [14]- [17]).…”
Section: Introduction
confidence: 99%
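As a concrete reference point for the UCB family mentioned in the statements above, here is a minimal sketch of the classical UCB1 index policy on Bernoulli arms (single player, no collisions). It is not taken from the cited works; the function name, arm means, and horizon are illustrative assumptions.

```python
import math
import random

def ucb1(means, horizon, seed=0):
    """Run the UCB1 policy for `horizon` rounds on Bernoulli arms.

    Index of arm a at round t: empirical mean + sqrt(2 ln t / n_a),
    where n_a is the number of times arm a has been pulled.
    Returns the pull counts per arm and the total collected reward.
    """
    rng = random.Random(seed)
    n_arms = len(means)
    counts = [0] * n_arms    # pulls per arm
    sums = [0.0] * n_arms    # cumulative reward per arm
    total_reward = 0.0
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1      # play each arm once to initialize
        else:
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = 1.0 if rng.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward
    return counts, total_reward
```

On two arms with means 0.9 and 0.1, the exploration bonus shrinks as an arm is pulled, so the policy concentrates its pulls on the better arm while still sampling the worse one logarithmically often — the behavior the multiplayer extensions above build upon.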