2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS)
DOI: 10.1109/focs.2019.00022
Adversarial Bandits with Knapsacks

Abstract: We consider Bandits with Knapsacks (henceforth, BwK), a general model for multi-armed bandits under supply/budget constraints. In particular, a bandit algorithm needs to solve a well-known knapsack problem: find an optimal packing of items into a limited-size knapsack. The BwK problem is a common generalization of numerous motivating examples, which range from dynamic pricing to repeated auctions to dynamic ad allocation to network routing and scheduling. While the prior work on BwK focused on the stochastic v…
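The knapsack problem named in the abstract can be made concrete with a minimal sketch of the classic 0/1 dynamic program: choose a subset of items maximizing total value subject to a capacity bound. The item values and weights below are illustrative only, not taken from the paper.

```python
def knapsack(values, weights, capacity):
    """0/1 knapsack via dynamic programming; O(n * capacity) time."""
    # best[c] holds the maximum value achievable with total weight <= c
    best = [0] * (capacity + 1)
    for v, w in zip(values, weights):
        # iterate capacities downward so each item is packed at most once
        for c in range(capacity, w - 1, -1):
            best[c] = max(best[c], best[c - w] + v)
    return best[capacity]

print(knapsack([60, 100, 120], [10, 20, 30], 50))  # -> 220
```

In BwK the difficulty is that values and weights (rewards and resource consumptions) are not known in advance but must be learned from bandit feedback while the budget is being spent.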

Cited by 32 publications (45 citation statements) | References 72 publications
“…Agrawal and Devanur (2014) generalized the BwK model by allowing arbitrary concave rewards and convex constraints. Furthermore, similar constrained bandit problems are also studied in settings that include contextual bandits (Agrawal and Devanur, 2014; Wu et al., 2015; Agrawal and Devanur, 2016) and even adversarial bandits (Sun et al., 2017; Immorlica et al., 2019).…”
Section: Dealing With Constraints
confidence: 99%
“…BwK is studied in both adversarial and i.i.d. settings, but here we focus only on the latter (see Immorlica et al. (2019) for the adversarial case). Assuming concave reward functions, Agrawal and Devanur (2014) propose an Upper-Confidence-Bound type of algorithm that achieves sublinear rates of regret and constraint violation.…”
Section: Related Work
confidence: 99%
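The Upper-Confidence-Bound family mentioned in the statement above scores each arm by an optimistic index and plays the highest-scoring arm. A generic UCB1 index is sketched below; this is the standard textbook construction, not the specific constrained algorithm of Agrawal and Devanur (2014).

```python
import math

def ucb_index(mean_reward, pulls, t):
    """UCB1 index: empirical mean plus an exploration bonus.

    mean_reward: average observed reward of the arm so far
    pulls:       number of times the arm has been played
    t:           current round (total plays across all arms)
    """
    if pulls == 0:
        return float("inf")  # force each arm to be tried at least once
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)
```

BwK-style algorithms additionally maintain optimistic estimates of each arm's resource consumption and pick arms via an optimization (e.g. an LP) over both estimates, rather than ranking arms by reward alone.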
“…The authors in Badanidiyuru et al. (2018) introduce Bandits with Knapsacks, which combines online learning with integer programming for learning under constraints. This model has been extended to various other settings, such as linear contextual bandits (Agrawal and Devanur, 2016), combinatorial semi-bandits (Abinav and Slivkins, 2018), the adversarial setting (Immorlica et al., 2019), and cascading bandits (Zhou et al., 2018). The authors in Combes et al. (2015) establish lower bounds for budgeted bandits and develop algorithms with matching upper bounds.…”
Section: Related Work
confidence: 99%