2017
DOI: 10.48550/arxiv.1706.02416
Preprint

Generalized Value Iteration Networks: Life Beyond Lattices

Abstract: In this paper, we introduce a generalized value iteration network (GVIN), which is an end-to-end neural network planning module. GVIN emulates the value iteration algorithm by using a novel graph convolution operator, which enables GVIN to learn and plan on irregular spatial graphs. We propose three novel differentiable kernels as graph convolution operators and show that the embedding-based kernel achieves the best performance. Furthermore, we present episodic Q-learning, an improvement upon traditional n-ste…
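
For orientation, the classical value iteration that the abstract says GVIN emulates can be sketched on a small irregular graph. This is a plain tabular version under simplifying assumptions (deterministic moves along edges, a reward collected on entering a node, a terminal node with no outgoing edges); it is not the paper's learned graph convolution kernels, and the names `graph_value_iteration`, `greedy_next`, `neighbors`, and `rewards` are hypothetical.

```python
def graph_value_iteration(neighbors, rewards, gamma=0.9, iterations=50):
    """Tabular value iteration where each action is a move along one edge.

    neighbors: dict mapping node -> list of nodes reachable in one step
    rewards:   dict mapping node -> reward collected when entering that node
    Both structures are illustrative assumptions, not the paper's interface.
    """
    values = {node: 0.0 for node in neighbors}
    for _ in range(iterations):
        new_values = {}
        for node, adjacent in neighbors.items():
            if not adjacent:
                # Terminal node: no moves left, so no future reward.
                new_values[node] = 0.0
            else:
                # Bellman backup: best immediate reward plus discounted value.
                new_values[node] = max(
                    rewards[nxt] + gamma * values[nxt] for nxt in adjacent
                )
        values = new_values
    return values


def greedy_next(neighbors, rewards, values, node, gamma=0.9):
    """Pick the neighbor with the highest one-step lookahead value."""
    return max(neighbors[node], key=lambda nxt: rewards[nxt] + gamma * values[nxt])


# A toy irregular graph: node degrees differ, unlike a lattice grid.
neighbors = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "goal"],
    "goal": [],
}
# Each ordinary step costs 1; entering the goal pays 10.
rewards = {"a": -1.0, "b": -1.0, "c": -1.0, "d": -1.0, "goal": 10.0}

values = graph_value_iteration(neighbors, rewards)
step_from_b = greedy_next(neighbors, rewards, values, "b")
step_from_d = greedy_next(neighbors, rewards, values, "d")
```

On a lattice, the maximization over neighbors has the same shape at every cell, which is what lets an ordinary convolution stand in for it; on a graph like the one above the neighborhood size varies per node, which is the setting the abstract's graph convolution operator targets.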

citations
Cited by 8 publications
(5 citation statements)
references
References 15 publications
“…It is promising because it can be integrated into a larger differentiable system to form a closed loop. Grimm et al (2020) propose to understand model-based planning algorithms from value equivalence perspective. Value iteration network (VIN) (Tamar et al, 2016) is a representative work that performs value iteration using convolution on lattice grids, and has been further extended (Niu et al, 2017; Lee et al, 2018; Chaplot et al, 2021; Deac et al, 2021). Other than using convolution network, the work on integrating learning and planning into differentiable networks includes (Oh et al, 2017; Karkus et al, 2017; Weber et al, 2018; Srinivas et al, 2018; Schrittwieser et al, 2019; Amos & Yarats, 2019; Wang & Ba, 2019; Hafner et al, 2020; Pong et al, 2018; Clavera et al, 2020).…”
Section: Differentiable Planning
Citation type: mentioning
confidence: 99%
“…For instance, Value Iteration Networks (VINs) are a kind of convolutional neural network that can learn to formulate an MDP from an observation of an environment, solve that MDP, and use the result to choose an action (Tamar, Wu, Thomas, Levine, & Abbeel, 2016). Generalised VINs extend this approach to MDPs with more general transition dynamics by employing graph convolutional neural networks instead of ordinary convolutional neural networks (Niu, Chen, Guo, Targonski, Smith, & Kovačević, 2017). In a similar vein, schema networks learn a STRIPS-like environment model using a specially-structured neural network, then choose actions by planning on that learnt model (Kansky, Silver, Mély, Eldawy, Lázaro-Gredilla, Lou, Dorfman, Sidor, Phoenix, & George, 2017).…”
Section: Other Related Planning Work
Citation type: mentioning
confidence: 99%
“…The integration of planning and neural networks has also been investigated in the context of deep reinforcement learning. For instance, Value Iteration Networks (Tamar et al 2016; Niu et al 2017) (VINs) learn to formulate and solve a probabilistic planning problem within a larger deep neural network. A VIN's internal model can allow it to learn more robust policies than would be possible with ordinary feedforward neural networks.…”
Section: Related Work
Citation type: mentioning
confidence: 99%