2022
DOI: 10.48550/arxiv.2203.03110
|View full text |Cite
Preprint
|
Sign up to set email alerts
|

Cascaded Gaps: Towards Gap-Dependent Regret for Risk-Sensitive Reinforcement Learning

Abstract: In this paper, we study gap-dependent regret guarantees for risk-sensitive reinforcement learning based on the entropic risk measure. We propose a novel definition of sub-optimality gaps, which we call cascaded gaps, and we discuss their key components that adapt to the underlying structures of the problem. Based on the cascaded gaps, we derive non-asymptotic and logarithmic regret bounds for two model-free algorithms under episodic Markov decision processes. We show that, in appropriate settings, these bounds… Show more

Help me understand this report

Search citation statements

Order By: Relevance

Paper Sections

Select...

Citation Types

0
0
0

Publication Types

Select...

Relationship

0
0

Authors

Journals

citations
Cited by 0 publications
references
References 25 publications
0
0
0
Order By: Relevance

No citations

Set email alert for when this publication receives citations?