2021
DOI: 10.48550/arxiv.2110.04652
Preprint
Representation Learning for Online and Offline RL in Low-rank MDPs

Abstract: This work studies the question of representation learning in RL: how can we learn a compact low-dimensional representation such that, on top of it, we can perform RL procedures such as exploration and exploitation in a sample-efficient manner? We focus on low-rank Markov Decision Processes (MDPs), where the transition dynamics correspond to a low-rank transition matrix. Unlike prior works that assume the representation is known (e.g., linear MDPs), here we need to learn the representation for…
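The low-rank structure mentioned in the abstract is not spelled out on this page; as a sketch, the standard definition in this literature factors the transition kernel through two unknown d-dimensional feature maps (the symbols $\phi$, $\mu$, and $d$ below follow the usual convention, not text quoted from the abstract):

```latex
% Low-rank MDP: the transition kernel admits a rank-d factorization
%   T(s' | s, a) = <phi(s,a), mu(s')>,
% with both feature maps unknown to the learner.
T(s' \mid s, a) \;=\; \langle \phi(s, a), \mu(s') \rangle
\;=\; \sum_{i=1}^{d} \phi_i(s, a)\, \mu_i(s'),
\qquad \phi : \mathcal{S} \times \mathcal{A} \to \mathbb{R}^d,
\quad \mu : \mathcal{S} \to \mathbb{R}^d .
```

A linear MDP is the special case where $\phi$ is known a priori; the representation-learning problem studied here is to recover $\phi$ from data while still exploring efficiently.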

Cited by 9 publications (32 citation statements)
References 30 publications
“…The second line of work concerns rich observation RL, where the observation space can be infinite and arbitrarily complex, in (for the most part) Markovian environments. These works provide structural conditions that permit sample-efficient RL with function approximation (Jiang et al., 2017; Sun et al., 2019; Jin et al., 2021; Du et al., 2021; Foster et al., 2021), as well as algorithms that are provably efficient in some special cases (Du et al., 2019; Misra et al., 2020; Agarwal et al., 2020; Uehara et al., 2021). However, as we will see, these structural conditions are not satisfied in our POMDP model, so these results do not directly apply.…”
Section: Related Workmentioning
confidence: 99%
“…Note that from here, our analysis significantly departs from the analyses of prior Block MDP works and of MOFFLE, which rely on a reachability assumption. This part of our proof leverages ideas from the analysis of a recent model-based representation learning algorithm, REP-UCB (Uehara et al., 2021).…”
Section: Proof Sketchmentioning
confidence: 98%
“…Low-rank MDPs. Low-rank MDPs are strictly more general than linear MDPs, which assume the representation is known a priori. Several related papers come from the recent literature on provable representation learning for low-rank MDPs (Agarwal et al., 2020b; Modi et al., 2021; Uehara et al., 2021; Ren et al., 2021). Low-rank MDPs generalize Block MDPs, so these algorithms are applicable in our setting.…”
Section: Rl With Function Approximationmentioning
confidence: 99%