Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence 2018
DOI: 10.24963/ijcai.2018/687
Behavioral Cloning from Observation

Abstract: Humans often learn how to perform tasks via imitation: they observe others perform a task, and then very quickly infer the appropriate actions to take based on their observations. While extending this paradigm to autonomous agents is a well-studied problem in general, there are two particular aspects that have largely been overlooked: (1) that the learning is done from observation only (i.e., without explicit action information), and (2) that the learning is typically done very quickly. In this work, we propose…
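The abstract is truncated here, but the method it leads into, behavioral cloning from observation (BCO), proceeds in two phases: the agent first learns an inverse dynamics model from its own interaction data, then uses that model to infer the expert's missing actions from state-only demonstrations and runs ordinary behavioral cloning on the result. The sketch below is a schematic reconstruction under those assumptions; the toy dynamics, model classes, and variable names are illustrative, not the authors' code.

# Schematic sketch of a BCO-style pipeline (illustrative, not the paper's code).
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

# Phase 1: collect (s, a, s') triples with an exploratory policy
# (replaced here by toy linear dynamics so the sketch is self-contained).
S = rng.normal(size=(5000, 4))                      # states s_t
A = rng.normal(size=(5000, 2))                      # agent's own actions a_t
S_next = S + 0.1 * np.pad(A, ((0, 0), (0, 2)))      # toy next states s_{t+1}

# Fit an inverse dynamics model a_t ~ g(s_t, s_{t+1}).
inv_dyn = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
inv_dyn.fit(np.hstack([S, S_next]), A)

# Phase 2: infer the unobserved expert actions from state-only demos,
# then behaviorally clone on (state, inferred action) pairs.
demo_S, demo_S_next = S[:200], S_next[:200]         # stand-in expert states
inferred_A = inv_dyn.predict(np.hstack([demo_S, demo_S_next]))

policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
policy.fit(demo_S, inferred_A)
print(policy.predict(demo_S[:1]))                   # action for the first demo state

In the actual algorithm the exploratory data comes from rolling out the current policy in the environment, and the two phases can be iterated so that improved policies yield better inverse-dynamics data.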

Cited by 356 publications (229 citation statements)
References 1 publication
“…(2) Using the learned state-transition model to improve behavioral cloning by taking advantage of (visually) predicted consequences of non-expert actions. Our second component only relies on expert images, similar to zero-shot learning works [15,16]. We conduct our experiments using a ground mobility robot in both real-world office and simulator environments containing everyday objects (e.g., desks, chairs, monitors) without any artificial markers or object detectors.…”
Section: Introduction
confidence: 99%
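The excerpt above names its mechanism only loosely, so here is one hedged reading in code: a forward model f(s, a) -> s', fit on the agent's own (s, a, s') data, predicts the consequence of candidate actions, and candidates are preferred when their predicted next state matches the expert's observed next state. The function and its signature are assumptions for illustration, not the cited paper's implementation.

# Hedged sketch: scoring candidate actions by predicted consequences.
# Assumes forward_model.predict accepts hstack([states, actions]) arrays,
# e.g. an MLPRegressor fit the same way as the inverse model sketched above.
import numpy as np

def score_actions(forward_model, s, candidate_actions, expert_s_next):
    """Sort candidate actions (an np.ndarray, one row per candidate) by how
    closely their predicted consequence matches the expert's next state."""
    tiled = np.tile(s, (len(candidate_actions), 1))
    predicted_next = forward_model.predict(np.hstack([tiled, candidate_actions]))
    errors = np.linalg.norm(predicted_next - expert_s_next, axis=1)
    order = np.argsort(errors)                      # smallest deviation first
    return candidate_actions[order], errors[order]

A behavioral-cloning loss could then down-weight candidates with large predicted deviation, which is one way to read "taking advantage of predicted consequences of non-expert actions".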
“…The authors of [20] and [21] have successfully implemented IRL methods for perception and control tasks; however, the need for the extra step of solving an RL problem adds to training delays. Instead, casting the problem as behavioral cloning (BC) [22] removes the reward-recovery step and directly optimizes a policy given reference demonstrations. Along similar lines, guided policy search (GPS) [23] techniques can be used to reduce training times by directing policy learning, in turn avoiding poor local optima.…”
Section: A. Related Work
confidence: 99%
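For contrast with IRL's two-step recipe (recover a reward, then solve an RL problem), the BC formulation this excerpt describes reduces to a single supervised objective over the demonstration set; in its standard maximum-likelihood form (not specific to [22]):

\min_\theta \; \mathbb{E}_{(s,a) \sim \mathcal{D}} \left[ -\log \pi_\theta(a \mid s) \right]

where \mathcal{D} is the set of expert state-action pairs and \pi_\theta the parameterized policy. No environment interaction or reward recovery appears in the objective, which is the speed advantage the excerpt points to.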
“…Imitation learning methods focus on the problem of learning to perform a task from demonstration data. These methods can be roughly divided into three categories: Behavior Cloning (BC; or supervised learning) [15], [16], Inverse Reinforcement Learning (IRL) [17], and Generative Adversarial Network (GAN) imitation learning [18].…”
Section: Related Work, A. Imitation Learning
confidence: 99%
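For completeness on the third category: the GAN-based family this excerpt cites as [18] is usually written as a saddle-point problem in which a discriminator D learns to tell policy-generated state-action pairs from expert pairs while the policy learns to fool it; one common formulation (GAIL) is

\min_\pi \max_D \; \mathbb{E}_{\pi}\left[\log D(s,a)\right] + \mathbb{E}_{\pi_E}\left[\log\left(1 - D(s,a)\right)\right] - \lambda H(\pi)

with \pi_E the expert policy and H(\pi) an entropy regularizer. That [18] denotes exactly GAIL is an assumption from the excerpt's wording.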