2021
DOI: 10.48550/arxiv.2111.11485
Preprint

A Free Lunch from the Noise: Provable and Practical Exploration for Representation Learning

Abstract: Representation learning lies at the heart of the empirical success of deep learning in dealing with the curse of dimensionality. However, the power of representation learning has not yet been fully exploited in reinforcement learning (RL), due to (i) the trade-off between expressiveness and tractability, and (ii) the coupling between exploration and representation learning. In this paper, we first reveal that under some noise assumption in the stochastic control model, we can obtain the linear spectr…

Cited by 3 publications (7 citation statements)
References 32 publications
“…Remark 1. To our knowledge, only (Ren et al 2021) and this work provide examples of linear transitions in RL with continuous state-actions. The former considers Gaussian transitions with unknown mean (f⋆(s, a)) and known variance, i.e.…”
Section: Bilinear Exponential Family of MDPs
Confidence: 99%
“…Thus, different works (Shariff and Szepesvári 2020; Lattimore, Szepesvari, and Weisz 2020; Van Roy and Dong 2019) propose to leverage different low-dimensional representations of value functions or transitions to perform efficient planning. Here, we take note from (Ren et al 2021) that Gaussian transitions induce an explicit linear value function in an RKHS, and generalize this observation with the bilinear exponential…”
Section: Related Work: Functional Representations of MDPs with Regret…
Confidence: 99%
“…Low-rank MDPs. Low-rank MDPs are strictly more general than linear MDPs, which assume the representation is known a priori. Several related papers come from the recent literature on provable representation learning for low-rank MDPs (Agarwal et al, 2020b; Modi et al, 2021; Uehara et al, 2021; Ren et al, 2021). Low-rank MDPs generalize Block MDPs, so these algorithms are applicable in our setting.…”
Section: RL with Function Approximation
Confidence: 99%
“…Several related works studied low-rank MDPs with provable sample complexities. [Agarwal et al, 2020b], [Ren et al, 2021], and [Uehara et al, 2021] consider the model-based setting, where the algorithm learns the representation given a model class for the transition probability. [Modi et al, 2021] provided a representation learning algorithm in the model-free setting and proved its sample efficiency when the MDP satisfies a minimal reachability assumption.…”
Section: Related Work
Confidence: 99%