2022
DOI: 10.48550/arxiv.2206.02072
Preprint

Deciding What to Model: Value-Equivalent Sampling for Reinforcement Learning

Abstract: The quintessential model-based reinforcement-learning agent iteratively refines its estimates or prior beliefs about the true underlying model of the environment. Recent empirical successes in model-based reinforcement learning with function approximation, however, eschew the true model in favor of a surrogate that, while ignoring various facets of the environment, still facilitates effective planning over behaviors. Recently formalized as the value equivalence principle, this algorithmic technique is perhaps …
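As a sketch of the formalization referenced in the abstract (following Grimm et al. [2020]; the notation below is an assumption for illustration, not quoted from this preprint): given a set of policies \Pi and a set of functions \mathcal{V}, a model \tilde{m} is value-equivalent to the true model m^* when

    \mathcal{T}^{\tilde{m}}_{\pi} v = \mathcal{T}^{m^*}_{\pi} v \quad \text{for all } \pi \in \Pi,\ v \in \mathcal{V},

where \mathcal{T}^{m}_{\pi} v(s) = \mathbb{E}_{a \sim \pi(\cdot \mid s)}\big[ r_m(s,a) + \gamma\, \mathbb{E}_{s' \sim P_m(\cdot \mid s,a)}[v(s')] \big] denotes the Bellman operator induced by model m and policy \pi. Under this criterion, a surrogate model only needs to agree with the environment on the Bellman backups relevant to planning, not on every transition and reward.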

Cited by 1 publication (1 citation statement)
References: 48 publications
“…Crucially, Arumugam and Van Roy [2022b] establish an information-theoretic Bayesian regret bound for a posterior-sampling algorithm that performs probability matching with respect to M…”
Section: Learning Targets for Capacity-Limited Decision Making (mentioning, confidence: 99%)
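For context on the probability-matching step mentioned in the citing passage, the following is a minimal, generic posterior-sampling (PSRL-style) sketch for a tabular MDP. It is not the algorithm analyzed by Arumugam and Van Roy [2022b]; the names and prior choices below (sample_model, value_iteration, a Dirichlet prior over transitions) are illustrative assumptions.

# Generic posterior-sampling sketch: sample a model from the posterior,
# plan in it, and act greedily -- i.e., act optimally with probability equal
# to the posterior probability that a model is the true one (probability matching).
import numpy as np

def sample_model(counts, reward_sums, reward_counts, rng):
    # counts: (S, A, S) transition counts; Dirichlet posterior with a uniform prior.
    # reward_sums / reward_counts: (S, A) running statistics for mean-reward estimates.
    n_s, n_a, _ = counts.shape
    P = np.zeros((n_s, n_a, n_s))
    R = np.zeros((n_s, n_a))
    for s in range(n_s):
        for a in range(n_a):
            P[s, a] = rng.dirichlet(counts[s, a] + 1.0)
            R[s, a] = reward_sums[s, a] / max(reward_counts[s, a], 1.0)
    return P, R

def value_iteration(P, R, gamma=0.95, iters=200):
    # Plan in the sampled model and return the greedy policy.
    n_s, n_a, _ = P.shape
    V = np.zeros(n_s)
    for _ in range(iters):
        Q = R + gamma * (P @ V)   # (S, A) Bellman backup under the sampled model
        V = Q.max(axis=1)
    return Q.argmax(axis=1)

Sampling a single model per episode and planning in it is what makes this probability matching: the agent's action distribution mirrors its posterior belief over which model (and hence which optimal policy) is correct.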