2021
DOI: 10.48550/arxiv.2105.14260
Preprint

Understanding Bandits with Graph Feedback

Abstract: The bandit problem with graph feedback, proposed in [Mannor and Shamir, NeurIPS 2011], is modeled by a directed graph G = (V, E), where V is the collection of bandit arms; once an arm is triggered, all its incident arms are observed. A fundamental question is how the structure of the graph affects the min-max regret. We propose the notions of the fractional weak domination number δ* and the k-packing independence number, which capture upper and lower bounds for the regret, respectively. We show that the two noti…
