2021
DOI: 10.1109/jsait.2021.3076027

On No-Sensing Adversarial Multi-Player Multi-Armed Bandits With Collision Communications

Cited by 9 publications (7 citation statements)
References 23 publications

“…Decentralized multi-player multi-armed bandit (Dec-MPMAB) problems [24] extend the traditional multi-armed bandit (MAB) framework to encompass situations where multiple players interact with a shared set of arms or actions. In the Dec-MPMAB framework, multiple players engage in decision making simultaneously, and they may either compete or cooperate in the allocation of limited resources.…”
Section: Problem Formulation
confidence: 99%
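
The Dec-MPMAB protocol quoted above can be made concrete with a short simulation sketch. The sketch below is illustrative only: the zero-reward-on-collision convention, Bernoulli arm rewards, and the uniformly random placeholder policy are assumptions, and names such as Player and run_round are invented for the example rather than taken from any cited paper.

import random

K, M, T = 5, 3, 1000  # arms, players, horizon (illustrative values)

class Player:
    """Placeholder policy: picks an arm uniformly at random each round."""
    def __init__(self, n_arms):
        self.n_arms = n_arms

    def choose_arm(self):
        return random.randrange(self.n_arms)

    def observe(self, arm, reward, collided):
        pass  # a real bandit policy would update its statistics here

def run_round(players, mean_rewards):
    # Every player commits to an arm simultaneously, without coordination.
    choices = [p.choose_arm() for p in players]
    for p, arm in zip(players, choices):
        collided = choices.count(arm) > 1
        # Common convention in MPMAB: a collision yields zero reward.
        reward = 0.0 if collided else float(random.random() < mean_rewards[arm])
        p.observe(arm, reward, collided)

players = [Player(K) for _ in range(M)]
means = [random.random() for _ in range(K)]
for _ in range(T):
    run_round(players, means)

Competition shows up as collisions on the same arm; cooperation would replace the placeholder policy with one that spreads players across the top-M arms.
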
“…The first class allows no information sharing among players, where players sense the presence of other players through experienced collisions (Anandkumar et al. 2011). The other class allows information sharing among players, e.g., directly sharing estimated mean rewards of arms (Liu and Zhao 2010b; Kalathil, Nayyar, and Jain 2014; Rosenski, Shamir, and Szlak 2016; Bistritz and Leshem 2018; Besson and Kaufmann 2018; Boursier and Perchet 2019; Mehrabian et al. 2020; Wang et al. 2020; Bubeck et al. 2020; Lugosi and Mehrabian 2021; Hanawal and Darak 2021; Pacchiano, Bartlett, and Jordan 2021; Shi et al. 2020). In particular, the regret guarantees for MPMAB were significantly improved in (Boursier and Perchet 2019) compared to the non-information-sharing case.…”
Section: Related Work
confidence: 99%
“…Specifically, we study a "networked information sharing" setting, where all players are arranged in a network G := {N, E}, and each player has limited capacity for sharing information, e.g., its estimates of the arms' mean rewards, with its neighbors in G, as inspired by the original idea of utilizing collisions to share sampled arm rewards in MPMAB settings (Boursier and Perchet 2019; Shi et al. 2020). To tackle the new dilemma in the presence of walking arms, we present a decentralized algorithm called MPMAB-WA-UCB, which is able to avoid collisions after sufficient exploration in a decentralized manner, i.e., each player decides which arm to pull independently, based only on locally available information: the past observed rewards and collisions, along with the information received from neighboring players.…”
Section: Introduction
confidence: 99%
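
The MPMAB-WA-UCB algorithm itself is not reproduced in this excerpt, but the general idea it gestures at, a player acting only on its own reward and collision history, can be sketched as follows. This is a generic musical-chairs-style UCB player, not the cited algorithm: the slot-hopping rule, the confidence-bound constant, and the class name DecentralizedUCBPlayer are all assumptions made for illustration, and the neighbor-message channel described in the quote is omitted.

import math
import random

class DecentralizedUCBPlayer:
    """Acts only on locally observed rewards and collision feedback."""
    def __init__(self, n_arms, rng=None):
        self.n_arms = n_arms
        self.rng = rng or random.Random()
        self.counts = [0] * n_arms     # successful (collision-free) pulls
        self.sums = [0.0] * n_arms     # accumulated rewards per arm
        self.t = 0
        self.slot = self.rng.randrange(n_arms)  # current "seat" in the ranking

    def _ucb_ranking(self):
        def index(a):
            if self.counts[a] == 0:
                return float("inf")    # force initial exploration
            mean = self.sums[a] / self.counts[a]
            return mean + math.sqrt(2.0 * math.log(self.t) / self.counts[a])
        return sorted(range(self.n_arms), key=index, reverse=True)

    def choose_arm(self):
        self.t += 1
        # Target the arm ranked at this player's slot; once every player
        # settles on a distinct slot, collisions stop after exploration.
        return self._ucb_ranking()[self.slot]

    def observe(self, arm, reward, collided):
        if collided:
            # Collision feedback is the only inter-player signal used here:
            # hop to a fresh random slot, musical-chairs style.
            self.slot = self.rng.randrange(self.n_arms)
        else:
            self.counts[arm] += 1
            self.sums[arm] += reward

Such a player plugs directly into the run_round loop sketched earlier, since it exposes the same choose_arm/observe interface.
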
“…Second, beyond the stochastic reward setting, other MAB variants are also worth exploring in the case of multiple players. For example, adversarial rewards have been studied in both the collision-sensing [26] and no-sensing [42], [43] settings, which is an interesting future direction for the collision-dependent reward model. Third, it is interesting but also challenging to remove Assumption 1 in general.…”
Section: G. Other Extensions
confidence: 99%