2022
DOI: 10.1007/978-3-031-14714-2_27
Generalization and Computation for Policy Classes of Generative Adversarial Imitation Learning

Cited by 2 publications (7 citation statements)
References 11 publications
“…Within IL, IRL [2] has emerged as a significant approach. IRL, which focuses on inferring a reward function that leads to an optimal policy aligned with expert behavior, has shown remarkable effectiveness and attracted much attention in recent years [11, 20, 21, 14, 15, 22]. The two primary challenges in IRL are the imitation of the policy and the recovery of reward functions [11, 23].…”
Section: Related Work
Mentioning confidence: 99%
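As background on the reward-recovery problem named in the excerpt above (a standard formulation, not part of the quoted paper): maximum-entropy IRL, in the form popularized by Ho and Ermon (2016), searches for a reward under which the expert outperforms every entropy-regularized policy,

\[ \hat{r} \in \arg\max_{r}\; \mathbb{E}_{\pi_E}[r(s,a)] \;-\; \max_{\pi}\big(\mathbb{E}_{\pi}[r(s,a)] + H(\pi)\big), \]

where \(\pi_E\) is the expert policy and \(H(\pi)\) its causal entropy. The “two primary challenges” then correspond to the inner policy optimization (policy imitation) and the outer search over \(r\) (reward recovery).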
“…IRL, by implementing distribution matching for imitation, has made substantial progress [4, 20, 21, 15, 22]. Drawing on the principles of generative adversarial networks (GANs) [24] used in IL, generative adversarial imitation learning (GAIL) [4] combines a discriminator-generated reward with a policy-gradient update to effectively mimic an expert RL policy.…”
Section: Policy Imitation
Mentioning confidence: 99%
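To make the mechanism in this excerpt concrete: GAIL (Ho and Ermon 2016) solves the minimax problem

\[ \min_{\pi}\max_{D}\; \mathbb{E}_{\pi}[\log D(s,a)] + \mathbb{E}_{\pi_E}[\log(1-D(s,a))] - \lambda H(\pi), \]

alternating discriminator updates with policy-gradient updates on the discriminator-derived reward \(r(s,a) = -\log D(s,a)\). The sketch below illustrates one such alternation in PyTorch; the toy dimensions, the stand-in rollout and expert data, and the plain REINFORCE update are illustrative assumptions, not the cited authors' implementation.

    # Minimal GAIL-style sketch: the discriminator supplies a surrogate
    # reward and the policy is updated with a policy gradient.
    import torch
    import torch.nn as nn

    STATE_DIM, ACT_DIM = 4, 2  # illustrative toy dimensions

    class MLP(nn.Module):
        def __init__(self, in_dim, out_dim):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(),
                                     nn.Linear(64, out_dim))
        def forward(self, x):
            return self.net(x)

    policy = MLP(STATE_DIM, ACT_DIM)    # logits over discrete actions
    disc = MLP(STATE_DIM + ACT_DIM, 1)  # logit of D(s, a)
    pi_opt = torch.optim.Adam(policy.parameters(), lr=3e-4)
    d_opt = torch.optim.Adam(disc.parameters(), lr=3e-4)
    bce = nn.BCEWithLogitsLoss()

    def one_hot(a):
        return nn.functional.one_hot(a, ACT_DIM).float()

    # Stand-in expert data; in practice these come from demonstrations.
    expert_s = torch.randn(256, STATE_DIM)
    expert_a = torch.randint(0, ACT_DIM, (256,))

    for step in range(100):
        # 1. Roll out the current policy (random states as a stand-in).
        s = torch.randn(256, STATE_DIM)
        dist = torch.distributions.Categorical(logits=policy(s))
        a = dist.sample()

        # 2. Discriminator update: label policy pairs 1, expert pairs 0
        #    (the sign convention of Ho and Ermon 2016).
        d_policy = disc(torch.cat([s, one_hot(a)], dim=1))
        d_expert = disc(torch.cat([expert_s, one_hot(expert_a)], dim=1))
        d_loss = bce(d_policy, torch.ones_like(d_policy)) + \
                 bce(d_expert, torch.zeros_like(d_expert))
        d_opt.zero_grad(); d_loss.backward(); d_opt.step()

        # 3. Policy update: REINFORCE on the discriminator-derived reward
        #    r(s, a) = -log D(s, a), large when (s, a) looks expert-like.
        with torch.no_grad():
            logit = disc(torch.cat([s, one_hot(a)], dim=1))
            r = -nn.functional.logsigmoid(logit).squeeze(1)
        pi_loss = -(dist.log_prob(a) * (r - r.mean())).mean()  # baseline-centred
        pi_opt.zero_grad(); pi_loss.backward(); pi_opt.step()

In practice the rollout states come from environment interaction, and the policy step is usually TRPO or PPO rather than plain REINFORCE.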
“…GAIL can be bifurcated into two genres: deterministic policy algorithms and stochastic policy algorithms, namely DE-GAIL (Kostrikov et al. 2019; Zuo et al. 2020) and ST-GAIL (Ho and Ermon 2016; Zhou et al. 2022). ST-GAIL, with its stochastic policy, guarantees global convergence in high-dimensional environments, outperforming traditional inverse reinforcement learning (IRL) methods (Ng and Russell 2000; Ziebart et al. 2008; Boularias, Kober, and Peters 2011).…”
Section: Introduction
Mentioning confidence: 99%
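As background on the bifurcation described in this excerpt (standard results, not claims of the cited paper): a stochastic policy \(\pi_\theta(a \mid s)\) is updated with the score-function (likelihood-ratio) gradient, whereas a deterministic policy \(a = \mu_\theta(s)\) uses the deterministic policy gradient of Silver et al. (2014):

\[ \nabla_\theta J = \mathbb{E}\big[\nabla_\theta \log \pi_\theta(a \mid s)\, Q(s,a)\big] \qquad \text{vs.} \qquad \nabla_\theta J = \mathbb{E}\big[\nabla_\theta \mu_\theta(s)\, \nabla_a Q(s,a)\big|_{a=\mu_\theta(s)}\big]. \]

The exploration induced by a stochastic policy underpins the global-convergence guarantees cited for ST-GAIL, while DE-GAIL trades that exploration for lower-variance, typically more sample-efficient updates.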