2021
DOI: 10.48550/arxiv.2107.08829
Preprint

Visual Adversarial Imitation Learning using Variational Models

Abstract: Reward function specification, which requires considerable human effort and iteration, remains a major impediment for learning behaviors through deep reinforcement learning. In contrast, providing visual demonstrations of desired behaviors often presents an easier and more natural way to teach agents. We consider a setting where an agent is provided a fixed dataset of visual demonstrations illustrating how to perform a task, and must learn to solve the task using the provided demonstrations and unsupervised en…

Cited by 1 publication (7 citation statements)
References 7 publications
“…Our work bears some resemblance to the two concurrent works Rafailov et al (2021); Anonymous (2022). Compared to Rafailov et al (2021), we solve ILfVI from a model-free perspective, which improves performance using only off-policy samples rather than a learned model, and even works for visual observations.…”
Section: Related Work (supporting)
confidence: 56%
“…Torabi et al (2018b) conduct experiments of GAIfO with visual observations, showing that GAIfO only achieves about half of the expert-level performance in non-trivial environments. Two concurrent works (Rafailov et al, 2021; Anonymous, 2022) give further insights into ILfVI. Rafailov et al (2021) solve ILfVI from a model-based perspective, whose algorithm V-MAIL first learns a world model and then updates the discriminator with on-policy samples from the learned model.…”
Section: Related Work (mentioning)
confidence: 99%
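The two-step recipe described in the statement above (first learn a world model, then update the discriminator using on-policy samples imagined from that model) can be illustrated with a toy sketch. Everything here is an assumption for illustration only: the linear dynamics, the logistic discriminator, and the helper names (`learn_world_model`, `imagine_rollout`, `update_discriminator`) are stand-ins, not V-MAIL's actual variational sequence model or image-based architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def learn_world_model(transitions):
    """Fit a toy linear dynamics model s' ~= s @ A by least squares."""
    S = np.array([t[0] for t in transitions])
    S_next = np.array([t[1] for t in transitions])
    A, *_ = np.linalg.lstsq(S, S_next, rcond=None)
    return A

def imagine_rollout(A, s0, horizon):
    """Generate 'on-policy' samples by rolling out the learned model,
    with no access to the real environment."""
    states = [s0]
    for _ in range(horizon):
        states.append(states[-1] @ A)
    return np.array(states)

def discriminator_score(w, s):
    """Logistic discriminator: probability that state s is expert data."""
    return 1.0 / (1.0 + np.exp(-s @ w))

def update_discriminator(w, expert, imagined, lr=0.1):
    """One logistic-regression gradient step: push expert scores up,
    imagined (model-generated) scores down."""
    grad = np.zeros_like(w)
    for s in expert:
        grad += (1.0 - discriminator_score(w, s)) * s
    for s in imagined:
        grad -= discriminator_score(w, s) * s
    return w + lr * grad / (len(expert) + len(imagined))

# Toy data: true dynamics s' = 0.9 s.
true_A = 0.9 * np.eye(2)
transitions = [(s, s @ true_A) for s in rng.normal(size=(32, 2))]

A_hat = learn_world_model(transitions)            # step 1: world model
rollout = imagine_rollout(A_hat, np.ones(2), 5)   # step 2: imagined samples
expert_states = rng.normal(loc=2.0, size=(8, 2))  # hypothetical expert data
w = update_discriminator(np.zeros(2), expert_states, rollout)  # step 3
```

The design choice being illustrated is why the statement contrasts the two approaches: the discriminator here is trained on rollouts from `A_hat` rather than on environment interactions, which is what distinguishes the model-based perspective from the model-free one in the first citation statement.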