2023
DOI: 10.1111/mafi.12388

Reinforcement learning with dynamic convex risk measures

Abstract: We develop an approach for solving time‐consistent risk‐sensitive stochastic optimization problems using model‐free reinforcement learning (RL). Specifically, we assume agents assess the risk of a sequence of random variables using dynamic convex risk measures. We employ a time‐consistent dynamic programming principle to determine the value of a particular policy, and develop policy gradient update rules that aid in obtaining optimal policies. We further develop an actor–critic style algorithm using neural net…
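To make the dynamic programming idea in the abstract concrete, here is a minimal sketch of evaluating a fixed policy under a nested (time-consistent) risk recursion. It is not the paper's algorithm, which is model-free and uses neural networks. This toy version assumes a known, tabular Markov chain under the policy, and assumes CVaR as the one-step risk measure (CVaR is coherent, hence convex). The names `cvar`, `evaluate_policy`, `P`, and `cost` are hypothetical and chosen for illustration only.

```python
def cvar(values, probs, alpha):
    """CVaR at level alpha of a discrete cost distribution (larger = worse).

    Uses the Rockafellar-Uryasev form min_z { z + E[(X - z)^+] / (1 - alpha) }.
    For a discrete distribution the objective is piecewise linear and convex
    in z, so the minimum is attained at one of the atoms.
    Requires 0 <= alpha < 1.
    """
    tail = 1.0 - alpha
    return min(
        z + sum(p * max(x - z, 0.0) for x, p in zip(values, probs)) / tail
        for z in values
    )


def evaluate_policy(P, cost, horizon, alpha):
    """Backward recursion V_t(s) = rho(cost(s') + V_{t+1}(s') | s).

    P[i][j]  : transition probability from state i to state j under the
               fixed policy (assumed known here, unlike the model-free setting).
    cost[j]  : cost incurred on arriving in state j (an illustrative choice).
    Returns the risk-adjusted value of each starting state.
    """
    n = len(P)
    value = [0.0] * n  # terminal condition V_T = 0
    for _ in range(horizon):
        value = [
            cvar([cost[j] + value[j] for j in range(n)], P[i], alpha)
            for i in range(n)
        ]
    return value


if __name__ == "__main__":
    P = [[0.5, 0.5], [0.5, 0.5]]
    cost = [0.0, 1.0]
    print(evaluate_policy(P, cost, horizon=2, alpha=0.0))  # risk-neutral case
    print(evaluate_policy(P, cost, horizon=2, alpha=0.5))  # risk-averse case
```

With `alpha = 0` the recursion reduces to the ordinary expected total cost. Raising `alpha` applies the one-step risk measure at every period, which is what makes the evaluation time-consistent, as opposed to applying a single risk measure to the total cost.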

citations
Cited by 10 publications
(9 citation statements)
references
References 68 publications
“…Financial applications such as statistical arbitrage and portfolio optimization are discussed with detailed numerical examples. Coache and Jaimungal (2021) develops a framework combining policy‐gradient‐based RL method and dynamic convex risk measures for solving time‐consistent risk‐sensitive stochastic optimization problems. However, there is no sample complexity or asymptotic convergence studied for the proposed algorithms in Jaimungal et al.…”
Section: Further Developments For Mathematical Finance and Reinforcem...
mentioning
confidence: 99%
“…However, there is no sample complexity or asymptotic convergence studied for the proposed algorithms in Jaimungal et al. (2021); Coache and Jaimungal (2021).…”
Section: Further Developments For Mathematical Finance and Reinforcem...
mentioning
confidence: 99%
“…A recent development in risk-aware RL is that in [9]. The authors use dynamic convex risk measures and devise a model-free approach to solve finite-horizon RL problems in a time-consistent manner.…”
mentioning
confidence: 99%
“…This extends the work from [63] that studies optimal stationary policies under dynamic coherent risk measures. The authors of [9] also demonstrate the performance and flexibility of their approach on several benchmark examples, which, by generating strategies that mitigate risk and not simply maximizing expectation, effectively accounts for uncertainty in the data-generating processes. In both works, one downside of the proposed actor-critic algorithms is the use of a nested simulation or simulation upon simulation approach.…”
mentioning
confidence: 99%