A linear function approximation based reinforcement learning algorithm is proposed for Markov decision processes with infinite-horizon risk-sensitive cost. Its convergence is proved using the 'o.d.e. method' for stochastic approximation. The scheme is also extended to continuous state space processes.

1. Introduction.

Recent decades have seen major activity in approximate dynamic programming for Markov decision processes based on real or simulated data, using reinforcement learning algorithms. (See, e.g., Bertsekas and Tsitsiklis (1996) [10] and Sutton and Barto (1998) [30] for book-length treatments, and Si et al. (2004) [28] for a flavour of more recent activity.) While most of this work has focused on additive cost criteria such as discounted or time-averaged cost, relatively little has been done for the multiplicative cost (or risk-sensitive cost, as it is better known). There is, however, a lot of interest in this cost criterion, as it has important applications in finance, e.g., Bagchi and Sureshkumar (2002) [5]. Reinforcement learning algorithms for risk-sensitive control were developed in earlier work. These were 'raw' in the sense that there was no explicit approximation of the value function to beat down the curse of dimensionality. In the case of additive costs, there is a considerable body of work on such approximation architectures, one of the most popular being linear function approximation. Here one seeks an approximation of the value function as a linear combination of a moderate number of basis functions specified a priori. The learning scheme then iteratively learns the weights (or coefficients) of this linear combination instead of learning the full value function, which is a much higher dimensional object. The first rigorous analysis of such a scheme is in Tsitsiklis and Van Roy (1997) [31], where its convergence was proved for the problem of policy evaluation. Since then there have been several variations on the basic theme; see, e.g., Bertsekas, Borkar and Nedic (2004) [8] and the references therein. The aim of this article is to propose a similar linear function approximation based learning scheme for policy evaluation in risk-sensitive control and to justify it rigorously.
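For concreteness, the infinite-horizon risk-sensitive cost alluded to above is usually taken to be the following exponential criterion; the notation (state process {X_n}, control sequence {Z_n}, running cost c) is introduced here for illustration and is not fixed by this section:

    \[
      J \;=\; \limsup_{N \to \infty} \frac{1}{N} \,\log
      E\!\left[\exp\!\Big(\sum_{n=0}^{N-1} c(X_n, Z_n)\Big)\right].
    \]

The criterion is called 'multiplicative' because the exponential of the sum is the product of per-stage factors, E[\prod_n e^{c(X_n, Z_n)}], in contrast to the expected sum appearing in discounted or time-averaged cost.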
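As an illustration of the linear function approximation idea described above, the following is a minimal sketch of temporal-difference policy evaluation with a linear architecture in the spirit of Tsitsiklis and Van Roy (1997) [31], written for the simpler discounted-cost case. It is not the risk-sensitive algorithm proposed in this article; the toy chain, the feature map phi, and the step-size choice are all illustrative assumptions.

    import numpy as np

    # Toy two-state Markov chain under a fixed policy:
    # transition matrix P and per-stage cost c.
    P = np.array([[0.9, 0.1],
                  [0.2, 0.8]])
    c = np.array([1.0, 2.0])
    gamma = 0.95                  # discount factor
    rng = np.random.default_rng(0)

    def phi(s):
        """Feature map: the value function is approximated as V(s) ~ phi(s) @ w."""
        return np.array([1.0, float(s)])   # two basis functions: constant and identity

    w = np.zeros(2)               # weights of the linear combination, learned iteratively
    s = 0
    for n in range(1, 200_000):
        s_next = rng.choice(2, p=P[s])
        # TD(0) error: one-step cost plus discounted next-state estimate,
        # minus the current estimate.
        delta = c[s] + gamma * phi(s_next) @ w - phi(s) @ w
        # Stochastic approximation step with decreasing step size 1/n.
        w += (1.0 / n) * delta * phi(s)
        s = s_next

    print("learned weights:", w)
    print("approximate values:", [phi(s) @ w for s in (0, 1)])

Note that only the weight vector w (here two-dimensional) is learned, rather than the value function itself; this is the dimensionality reduction the passage above refers to, since in a realistic problem the number of basis functions is far smaller than the number of states.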