Generative Adversarial Imitation from Observation

Torabi, Faraz; Warnell, Garrett; Stone, Peter

doi:10.48550/arxiv.1807.06158

Cited by 38 publications

(97 citation statements)

References 20 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Most directly relevant to our work are adversarial deep reinforcement learning methods which learn policies that are robust to various classes of disturbances, such as adversarial observations [40,10], rewards [8,11], direct disturbances to the system [26], or combinations thereof [21]. Nevertheless, there remains a paucity of theoretical guarantees on the generalization error, and thus sample-efficiency, of such learned policies.…”

Section: Related Workmentioning

confidence: 99%

Adversarially Robust Stability Certificates can be Sample-Efficient

Zhang¹,

Tu²,

Boffi³

et al. 2021

Preprint

View full text Add to dashboard Cite

Motivated by bridging the simulation to reality gap in the context of safety-critical systems, we consider learning adversarially robust stability certificates for unknown nonlinear dynamical systems. In line with approaches from robust control, we consider additive and Lipschitz bounded adversaries that perturb the system dynamics. We show that under suitable assumptions of incremental stability on the underlying system, the statistical cost of learning an adversarial stability certificate is equivalent, up to constant factors, to that of learning a nominal stability certificate. Our results hinge on novel bounds for the Rademacher complexity of the resulting adversarial loss class, which may be of independent interest. To the best of our knowledge, this is the first characterization of sample-complexity bounds when performing adversarial learning over data generated by a dynamical system. We further provide a practical algorithm for approximating the adversarial training algorithm, and validate our findings on a damped pendulum example.

show abstract

Section: Related Workmentioning

confidence: 99%

Adversarially Robust Stability Certificates can be Sample-Efficient

Zhang¹,

Tu²,

Boffi³

et al. 2021

Preprint

View full text Add to dashboard Cite

show abstract

“…IRL algorithms propose to infer the reward function from the expert demonstration. Among IRL algorithms, one recent branch is the Adversarial Imitation Learning (AIL) [7], [26], which trains the agent to match epert's behavior via an adversarial process. Compared with Behavioral Cloning, AIL can succeed in various challenging control tasks [7].…”

Section: A Imitation Learningmentioning

confidence: 99%

Cross Domain Robot Imitation with Invariant Representation

Yin¹,

Sun²,

Ma³

et al. 2021

Preprint

View full text Add to dashboard Cite

Animals are able to imitate each others' behavior, despite their difference in biomechanics. In contrast, imitating the other similar robots is a much more challenging task in robotics. This problem is called cross domain imitation learning (CDIL). In this paper, we consider CDIL on a class of similar robots. We tackle this problem by introducing an imitation learning algorithm based on invariant representation. We propose to learn invariant state and action representations, which aligns the behavior of multiple robots so that CDIL becomes possible. Compared with previous invariant representation learning methods for similar purpose, our method does not require human-labeled pairwise data for training. Instead, we use cycle-consistency and domain confusion to align the representation and increase its robustness. We test the algorithm on multiple robots in simulator and show that unseen new robot instances can be trained with existing expert demonstrations successfully. Qualitative results also demonstrate that the proposed method is able to learn similar representations for different robots with similar behaviors, which is essential for successful CDIL.

show abstract

“…Instead of learning from demonstrations supplied in the first-person, third-person imitation learning [32] improves upon GAIL by recovering a domain-agnostic representation of the agent's observations. Generative adversarial imitation from observation [33] learns directly from state-only demonstrations without having access to the demonstrator's actions by recovering the state-transition cost function of the expert.…”

Section: A Imitation Learning Via Inverse Reinforcement Learningmentioning

confidence: 99%

GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback

Huang,

Juan,

Gomez

et al. 2021

Preprint

View full text Add to dashboard Cite

Deep reinforcement learning (DRL) has achieved great successes in many simulated tasks. The sample inefficiency problem makes applying traditional DRL methods to real-world robots a great challenge. Generative Adversarial Imitation Learning (GAIL) -a general model-free imitation learning method, allows robots to directly learn policies from expert trajectories in large environments. However, GAIL shares the limitation of other imitation learning methods that they can seldom surpass the performance of demonstrations. In this paper, to address the limit of GAIL, we propose GAN-Based Interactive Reinforcement Learning (GAIRL) from demonstration and human evaluative feedback by combining the advantages of GAIL and interactive reinforcement learning. We tested our proposed method in six physics-based control tasks, ranging from simple low-dimensional control tasks -Cart Pole and Mountain Car, to difficult highdimensional tasks -Inverted Double Pendulum, Lunar Lander, Hopper and HalfCheetah. Our results suggest that with both optimal and suboptimal demonstrations, a GAIRL agent can always learn a more stable policy with optimal or close to optimal performance, while the performance of the GAIL agent is upper bounded by the performance of demonstrations or even worse than it. In addition, our results indicate the reason that GAIRL is superior over GAIL is the complementary effect of demonstrations and human evaluative feedback.

show abstract

Generative Adversarial Imitation from Observation

Cited by 38 publications

References 20 publications

Adversarially Robust Stability Certificates can be Sample-Efficient

Adversarially Robust Stability Certificates can be Sample-Efficient

Cross Domain Robot Imitation with Invariant Representation

GAN-Based Interactive Reinforcement Learning from Demonstration and Human Evaluative Feedback

Contact Info

Product

Resources

About