2022
DOI: 10.4208/jml.220114
Perturbational Complexity by Distribution Mismatch: A Systematic Analysis of Reinforcement Learning in Reproducing Kernel Hilbert Space

Abstract: Most existing theoretical analysis of reinforcement learning (RL) is limited to the tabular setting or to linear models, due to the difficulty of dealing with function approximation in high-dimensional spaces with an uncertain environment. This work offers a fresh perspective on this challenge by analyzing RL in a general reproducing kernel Hilbert space (RKHS). We consider a family of Markov decision processes M whose reward functions lie in the unit ball of an RKHS and whose transition probabilities lie in a g…

Cited by 3 publications (15 citation statements) | References 24 publications
“…This quantity can give both a lower bound and an upper bound on the sample complexity of these RL problems and hence measure their difficulty. Moreover, both fast eigenvalue decay and a finite concentration coefficient can lead to a small perturbational complexity by distribution mismatch [35, Propositions 2 and 3], and hence the results in [35] generalize both categories of previous results to the nonlinear setting.…”
Section: (UCB Estimation) If G_N additionally satisfies that…
confidence: 83%
“…To better capture the influence of distribution mismatch in RL problems, [35] introduces a quantity called perturbational complexity by distribution mismatch for a large class of RL problems in the nonlinear setting where a generative model is accessible. This quantity can give both a lower bound and an upper bound on the sample complexity of these RL problems and hence measure their difficulty.…”
Section: (UCB Estimation) If G_N additionally satisfies that…
confidence: 99%