2020
DOI: 10.48550/arxiv.2002.09808
Preprint

My Fair Bandit: Distributed Learning of Max-Min Fairness with Multi-player Bandits

Abstract: Consider N cooperative but non-communicating players where each plays one out of M arms for T turns. Players have different utilities for each arm, representable as an N × M matrix. These utilities are unknown to the players. In each turn players receive noisy observations of their utility for their selected arm. However, if any other players selected the same arm that turn, they will all receive zero utility due to the conflict. No other communication or coordination between the players is possible. Our goal …
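
The collision model described in the abstract can be illustrated with a small simulation of a single turn. This is only a sketch under stated assumptions, not the authors' code: the function name `play_round`, the Gaussian noise model, the `noise_std` parameter, and the example utility matrix are all hypothetical, since the abstract says only that observations are noisy.

```python
import random


def play_round(utilities, choices, noise_std=0.1, rng=None):
    """Simulate one turn in which player n pulls arm choices[n].

    utilities is an N x M list of lists holding each player's expected
    utility for each arm. Returns the reward observed by each player.
    """
    rng = rng or random.Random()

    # Count how many players selected each arm this turn.
    counts = {}
    for arm in choices:
        counts[arm] = counts.get(arm, 0) + 1

    rewards = []
    for player, arm in enumerate(choices):
        if counts[arm] > 1:
            # Conflict: every player on a shared arm receives zero utility.
            rewards.append(0.0)
        else:
            # Noisy observation of the player's own utility.
            # Gaussian noise is an assumption made for this sketch.
            rewards.append(utilities[player][arm] + rng.gauss(0.0, noise_std))
    return rewards


# Hypothetical 2 x 3 utility matrix (N = 2 players, M = 3 arms).
example_utilities = [
    [0.9, 0.2, 0.5],
    [0.4, 0.8, 0.1],
]

collided = play_round(example_utilities, [0, 0])                 # both pick arm 0
separate = play_round(example_utilities, [0, 1], noise_std=0.0)  # distinct arms, no noise
```

When both players pick arm 0, each observes zero regardless of the underlying utilities, so a player cannot tell a conflict apart from a worthless arm except through the structure of the rewards; with distinct arms and the noise switched off, each player observes exactly its own matrix entry.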

Cited by 0 publications
References 16 publications