2019
DOI: 10.48550/arxiv.1906.06361
Preprint

Online Allocation and Pricing: Constant Regret via Bellman Inequalities

Abstract: We develop a framework for designing tractable heuristics for Markov Decision Processes (MDPs), and use it to obtain constant-regret policies for a variety of online allocation problems, including online packing, budget-constrained probing, dynamic pricing, and online contextual bandits with knapsacks. Our approach is based on adaptively constructing a benchmark for the value function, which we then use to select our actions. At the center of our framework are the Bellman inequalities, which allow us to creat…

Cited by 2 publications
References 30 publications