2020 American Control Conference (ACC)
DOI: 10.23919/acc45564.2020.9147792

Risk-Averse Planning Under Uncertainty

Abstract: We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite-state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and optimality criterion, the proposed…


Cited by 22 publications (24 citation statements). References 50 publications.
“…The method was applied to enforce risk-averse safety of a bipedal robot. Future work will extend the CVaR barrier functions to other coherent risk measures, continuous-time systems, and applications involving cooperative human-robot teams and imperfect sensor measurements [34] and convex/polytopic approximations of barrier functions leading to convex programs.…”
Section: Discussion
confidence: 99%
“…Conditional value-at-risk (CVaR) is one such risk measure that has this desirable set of properties, and is part of a class of risk metrics known as coherent risk measures [6]. Coherent risk measures have been used in a variety of decision-making problems, especially Markov decision processes (MDPs) [10]. In recent years, Ahmadi et al synthesized risk-averse optimal policies for partially observable MDPs, constrained MDPs, and for shortest path problems in MDPs [4,3,5]. Coherent risk measures have been used in an MPC framework when the system model is uncertain [36] and when the uncertainty is a result of measurement noise or moving obstacles [12].…”
Section: Related Work
confidence: 99%
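To make the coherent risk measure discussed in the statement above concrete, here is a minimal sample-based CVaR estimator: the mean of the worst alpha-fraction of sampled losses. This is an illustrative sketch, not code from the cited papers; the function name and the quantile-based estimator are assumptions.

```python
import numpy as np

def empirical_cvar(losses, alpha=0.1):
    """Empirical CVaR_alpha(X) = E[X | X >= VaR_alpha(X)],
    i.e. the average of the worst alpha-fraction of losses."""
    x = np.sort(np.asarray(losses, dtype=float))
    var = np.quantile(x, 1.0 - alpha)   # value-at-risk threshold
    return x[x >= var].mean()           # average over the loss tail
```

One coherence axiom, translation invariance (CVaR(X + c) = CVaR(X) + c), can be checked numerically by shifting the sample by a constant.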
“…Risk-averse dynamic programming methods that protect against large negative deviations from expected values have been developed for MDPs using average value-at-risk (VaR) [62, 63]. Other coherent risk measures, including conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR), have recently been used in risk-averse planning and decision optimization algorithms for both MDPs and POMDPs [64, 65], as well as in model-free RL algorithms [66].…”
Section: Structure Of Explanations For Decision Recommendations Based On Reinforcement Learning (RL) With Initially Unknown or Uncertain
confidence: 99%
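The entropic value-at-risk mentioned above admits a one-dimensional dual form, EVaR_alpha(X) = inf_{z>0} (1/z) log(E[exp(zX)]/alpha), which can be estimated from samples by a grid search over z. A sketch under the convention that alpha is the tail probability; the function name and grid bounds are illustrative assumptions.

```python
import numpy as np

def empirical_evar(losses, alpha=0.1, z_grid=None):
    """Estimate EVaR_alpha(X) = inf_{z>0} (1/z) * log(E[exp(z*X)] / alpha)
    by grid search over the scalar dual variable z."""
    x = np.asarray(losses, dtype=float)
    if z_grid is None:
        z_grid = np.logspace(-3, 1, 200)   # candidate dual variables z > 0
    xmax = x.max()
    best = np.inf
    for z in z_grid:
        # numerically stable log E[exp(z*X)]: shift by max(x) before exp
        log_mgf = z * xmax + np.log(np.mean(np.exp(z * (x - xmax))))
        best = min(best, (log_mgf - np.log(alpha)) / z)
    return best
```

EVaR is the tightest Chernoff-type upper bound on CVaR at the same level, so for any sample the estimate should lie between the empirical CVaR and the maximum loss.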