2021
DOI: 10.48550/arxiv.2106.03207
Preprint
Mitigating Covariate Shift in Imitation Learning via Offline Data Without Great Coverage

Abstract: This paper studies offline Imitation Learning (IL) where an agent learns to imitate an expert demonstrator without additional online environment interactions. Instead, the learner is presented with a static offline dataset of state-action-next state transition triples from a potentially less proficient behavior policy. We introduce Model-based IL from Offline data (MILO): an algorithmic framework that utilizes the static dataset to solve the offline IL problem efficiently both in theory and in practice. In the…

Cited by 5 publications (4 citation statements)
References 39 publications
“…It means that, as d increases, we can enrich the reward set R and thus capture the performance of the policy more meticulously using the optimality gap defined in (2.6). Analysis of GAIL with a linear reward set is more challenging than the case with a bounded reward set, which was studied in the previous literature (Shani et al., 2021; Chang et al., 2021).…”
Section: Linear Function Approximation
confidence: 99%
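The excerpt above does not spell out what the linear reward set R looks like; a common formulation in this line of work (stated here as an assumption, with feature map φ and parameter θ introduced for illustration, not taken from the cited papers) is:

```latex
% A standard d-dimensional linear reward class over features \phi(s,a).
% Enlarging d enriches \mathcal{R}, which in turn makes the optimality
% gap \max_{r \in \mathcal{R}} ( J_r(\pi^E) - J_r(\pi) ) a finer measure
% of how closely the learned policy \pi matches the expert \pi^E.
\mathcal{R} \;=\; \bigl\{\, r_\theta(s,a) = \langle \theta, \phi(s,a) \rangle
  \;:\; \theta \in \mathbb{R}^d,\ \|\theta\|_2 \le 1 \,\bigr\}
```

Under this reading, the "increasing d" remark means that a richer feature map yields a larger reward class, so the worst-case reward gap discriminates between policies more finely.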
“…Furthermore, previous theoretical analyses of GAIL either focus on the tabular case (Shani et al., 2021), where the state and action spaces are discrete, or rely on strong assumptions, including access to a well-explored dataset (Zhang et al., 2020), linear-quadratic regulators (Cai et al., 2019), or kernelized nonlinear regulators (Chang et al., 2021). Theoretical analysis of GAIL with linear function approximation, in either the online or the offline setting, remains an open problem, and is crucial for applying GAIL to continuous or high-dimensional state and action spaces.…”
Section: Introduction
confidence: 99%