2017
DOI: 10.48550/arxiv.1707.06203
Preprint

Imagination-Augmented Agents for Deep Reinforcement Learning

Abstract: We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects. In contrast to most existing model-based reinforcement learning and planning methods, which prescribe how a model should be used to arrive at a policy, I2As learn to interpret predictions from a learned environment model to construct implicit plans in arbitrary ways, by using the predictions as additional context in deep policy networks. I2As show improved data efficiency, performance, and robustness to model misspecification compared to several baselines.
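The abstract describes an architecture in which imagined rollouts from a learned environment model are encoded and fed to the policy as extra context, alongside a model-free path. The PyTorch sketch below illustrates that idea under simplifying assumptions: the linear environment model, the small LSTM rollout encoder, and all dimensions are illustrative stand-ins, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class I2ASketch(nn.Module):
    """Illustrative Imagination-Augmented Agent: imagined rollouts from a
    learned environment model are encoded and concatenated with a
    model-free path before the policy head."""

    def __init__(self, obs_dim, action_dim, hidden=128, rollout_len=3):
        super().__init__()
        self.action_dim, self.rollout_len = action_dim, rollout_len
        # learned environment model: (obs, action) -> next obs
        self.env_model = nn.Linear(obs_dim + action_dim, obs_dim)
        # small rollout policy picks imagined actions after the first step
        self.rollout_policy = nn.Linear(obs_dim, action_dim)
        # rollout encoder summarizes each imagined trajectory
        self.encoder = nn.LSTM(obs_dim, hidden, batch_first=True)
        # model-free path on the raw observation
        self.model_free = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        # policy head sees model-free features plus one encoding per action
        self.policy = nn.Linear(hidden + action_dim * hidden, action_dim)

    def imagine(self, obs, first_action):
        """Unroll the model; the first imagined action is fixed, later
        actions come from the rollout policy (one rollout per action)."""
        states, s, a = [], obs, first_action
        for _ in range(self.rollout_len):
            s = self.env_model(torch.cat([s, a], dim=-1))
            states.append(s)
            a = torch.softmax(self.rollout_policy(s), dim=-1)
        return torch.stack(states, dim=1)      # (batch, rollout_len, obs_dim)

    def forward(self, obs):
        encodings = []
        for i in range(self.action_dim):       # one imagined rollout per first action
            onehot = torch.zeros(obs.shape[0], self.action_dim, device=obs.device)
            onehot[:, i] = 1.0
            _, (h, _) = self.encoder(self.imagine(obs, onehot))
            encodings.append(h[-1])            # final hidden state per rollout
        context = torch.cat([self.model_free(obs)] + encodings, dim=-1)
        return self.policy(context)            # action logits

agent = I2ASketch(obs_dim=8, action_dim=4)
logits = agent(torch.randn(2, 8))              # (2, 4) action logits
```

The key design point the abstract emphasizes survives even in this toy version: the policy is never told how to plan with the model; it merely receives the rollout encodings as additional input and learns to interpret them.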

Cited by 43 publications (51 citation statements)
References 28 publications
“…Our results provide a theoretical foundation for the optimality of deep imagination in model-based planning by showing that it becomes the dominant strategy in one-shot allocations of resources over a broad range of capacity and environmental parameters. Recent deep-learning work has studied through numerical simulations how agents can benefit from imagining future steps by using models of the environment [46,47,48,49], and thus our results might help to clarify and stress the importance of deep tree sampling through mental simulations of state transitions.…”
Section: Discussion
confidence: 83%
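This excerpt concerns allocating a fixed simulation budget either deeply (a few long rollouts) or broadly (many shallow samples). The toy sketch below makes that trade-off concrete; `model_step`, `policy`, and the budget accounting are hypothetical stand-ins for illustration, not the cited paper's formalism.

```python
import random

def simulate(model_step, s, policy, depth, gamma=0.95):
    """Discounted return of one imagined rollout of length `depth`."""
    total, discount = 0.0, 1.0
    for _ in range(depth):
        s, r = model_step(s, policy(s))
        total += discount * r
        discount *= gamma
    return total

def deep_imagination(model_step, s, policy, budget):
    """Spend the whole budget of model calls on a single deep rollout."""
    return simulate(model_step, s, policy, depth=budget)

def broad_imagination(model_step, s, policy, budget):
    """Spend the budget on `budget` independent one-step simulations."""
    return sum(simulate(model_step, s, policy, depth=1)
               for _ in range(budget)) / budget

# toy stochastic model and random policy, purely for illustration
model_step = lambda s, a: (s + a, random.gauss(s + a, 1.0))
policy = lambda s: random.choice([-1, 1])
print(deep_imagination(model_step, 0, policy, budget=10))
print(broad_imagination(model_step, 0, policy, budget=10))
```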
“…Many models have been proposed to solve Sokoban. Both model-based methods [9,10,21] and model-free methods can reach competitive performance [8]. Curriculum learning has been used to solve a difficult Sokoban instance [6].…”
Section: Sokoban
confidence: 99%
“…Trajectories from a learned model are also used as extra inputs for a value function (Weber et al., 2017), which reduces the negative influence of the model prediction error. In this paper, we focus on the simplest Dyna-style planning and leave the combination of RA and more advanced planning techniques for future work.…”
Section: Related Work
confidence: 99%
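The excerpt contrasts using imagined trajectories as extra value-function inputs (as in I2As) with Dyna-style planning, where transitions sampled from a learned model feed an ordinary model-free update. Below is a minimal tabular Dyna-Q sketch of the latter; the function names, the deterministic model, and the toy chain environment are illustrative assumptions, not the cited paper's algorithm.

```python
import random
from collections import defaultdict

def dyna_q(env_step, n_actions, episodes=100, alpha=0.1,
           gamma=0.95, eps=0.1, planning_steps=10):
    """Tabular Dyna-Q: every real transition also trains a model, and
    planning_steps simulated transitions update Q between real steps."""
    Q = defaultdict(float)      # Q[(state, action)]
    model = {}                  # model[(state, action)] = (reward, next_state, done)
    for _ in range(episodes):
        s, done = 0, False      # assumption: state 0 is the start state
        while not done:
            # epsilon-greedy action in the real environment
            if random.random() < eps:
                a = random.randrange(n_actions)
            else:
                a = max(range(n_actions), key=lambda x: Q[(s, x)])
            r, s2, done = env_step(s, a)
            # model-free Q-learning update from the real transition
            best = 0.0 if done else max(Q[(s2, x)] for x in range(n_actions))
            Q[(s, a)] += alpha * (r + gamma * best - Q[(s, a)])
            # learn the model (deterministic world assumed for simplicity)
            model[(s, a)] = (r, s2, done)
            # planning: replay simulated transitions from the learned model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2, pdone) = random.choice(list(model.items()))
                pbest = 0.0 if pdone else max(Q[(ps2, x)] for x in range(n_actions))
                Q[(ps, pa)] += alpha * (pr + gamma * pbest - Q[(ps, pa)])
            s = s2
    return Q

# toy 5-state chain: action 1 moves right (reward 1 on reaching state 4), action 0 stays
def chain(s, a):
    s2 = min(s + 1, 4) if a == 1 else s
    return (1.0, s2, True) if s2 == 4 else (0.0, s2, False)

Q = dyna_q(chain, n_actions=2)
```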