Imitation learning aims to recover expert policies from limited demonstration data. Generative Adversarial Imitation Learning (GAIL) applies the generative adversarial learning framework to imitation learning and has shown great potential. GAIL and its variants, however, are highly sensitive to hyperparameters and difficult to train to convergence in practice. One key issue is that the discriminator, trained by supervised learning, learns much faster than the generator, trained by reinforcement learning, which causes the generator's gradient to vanish. Although GAIL is formulated as a zero-sum adversarial game, its ultimate goal is to learn the generator, so the discriminator should act more like a teacher than a real opponent. The discriminator's training should therefore take into account how well the generator can learn. In this paper, we show that enhancing the gradient of generator training is equivalent to increasing the variance of the fake reward provided by the discriminator output. We thus propose an improved version of GAIL, GAIL-VR, in which the discriminator also learns to avoid generator gradient vanishing by regularizing the variance of the fake rewards. Experiments on various tasks, including locomotion tasks and Atari games, indicate that GAIL-VR improves training stability and imitation scores.
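The abstract does not state the exact GAIL-VR objective, so the following is only a minimal sketch of the general idea, under several assumptions that are not taken from the paper: the fake reward is taken to be r = -log(1 - D(s, a)), the discriminator is trained with a standard binary cross-entropy loss, and the variance term is subtracted with a hypothetical coefficient `var_coef`. The function names are illustrative as well.

```python
import math
from statistics import pvariance


def softplus(x):
    # Numerically stable log(1 + exp(x)).
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def fake_rewards(fake_logits):
    # Assumed reward form: r = -log(1 - D) with D = sigmoid(logit),
    # which simplifies to softplus(logit).
    return [softplus(z) for z in fake_logits]


def discriminator_loss(expert_logits, fake_logits, var_coef=0.1):
    # Binary cross-entropy: expert samples labeled 1, generated samples labeled 0.
    bce_expert = sum(softplus(-z) for z in expert_logits) / len(expert_logits)
    bce_fake = sum(softplus(z) for z in fake_logits) / len(fake_logits)
    # Variance of the rewards the generator would receive on this batch.
    reward_var = pvariance(fake_rewards(fake_logits))
    # Subtracting the variance rewards the discriminator for keeping the
    # fake rewards spread out (hypothetical form of the regularizer).
    loss = bce_expert + bce_fake - var_coef * reward_var
    return loss, reward_var


# A saturated discriminator gives every generated sample the same reward,
# so the reward variance collapses; less extreme logits keep it positive.
_, var_saturated = discriminator_loss([3.0, 4.0], [-10.0, -10.0, -10.0])
_, var_spread = discriminator_loss([3.0, 4.0], [-2.0, 0.0, 2.0])
print(var_saturated, var_spread)
```

When every generated sample receives nearly the same reward, the policy-gradient signal carries little information about which actions are better, which is the vanishing-gradient situation the abstract describes. A penalty of this kind is one way to discourage that collapse.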