2023
DOI: 10.48550/arxiv.2302.04375
Preprint

A Near-Optimal Algorithm for Safe Reinforcement Learning Under Instantaneous Hard Constraints

Abstract: In many applications of Reinforcement Learning (RL), it is critically important that the algorithm performs safely, such that instantaneous hard constraints are satisfied at each step and unsafe states and actions are avoided. However, existing algorithms for "safe" RL are often designed under constraints that either require expected cumulative costs to be bounded or assume all states are safe. Thus, such algorithms could violate instantaneous hard constraints and traverse unsafe states (and actions) in practice. …
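As a rough sketch of the distinction the abstract draws, the two constraint notions can be written as follows; the notation is standard CMDP-style (cost function c, horizon H, threshold \bar{c}) assumed here for illustration, not taken from the paper itself.

% Expected cumulative-cost constraint (soft, CMDP-style):
% the policy only needs to keep the expected total cost within a budget.
\mathbb{E}_{\pi}\left[\sum_{t=1}^{H} c(s_t, a_t)\right] \le \bar{c}

% Instantaneous hard constraint (as studied in this paper):
% the cost must stay below the threshold at every single step,
% so unsafe state-action pairs must never be visited.
c(s_t, a_t) \le \bar{c} \quad \forall\, t \in \{1, \dots, H\}

A policy satisfying the first condition may still incur a large cost at an individual step, which is exactly the failure mode the instantaneous formulation rules out.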

Cited by 1 publication (1 citation statement)
References: 18 publications
“…Moreover, their algorithmic techniques rely heavily on our optimism-pessimism principle. Shi et al (2023) later extended the per-step constrained RL work of Amani et al (2021) to the case where some state/action combinations are unsafe.…”
Section: Related Work
confidence: 99%