1982
DOI: 10.1017/s0021900200023123

The variance of discounted Markov decision processes

Abstract: Formulae are presented for the variance and higher moments of the present value of single-stage rewards in a finite Markov decision process. Similar formulae are exhibited for a semi-Markov decision process. There is a short discussion of the obstacles to using the variance formula in algorithms to maximize the mean minus a multiple of the standard deviation.
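
To make the abstract's claim concrete, the following is a minimal sketch of how the mean and variance of the present value can be computed for a fixed stationary policy. It is not taken from the paper and does not use its notation. It assumes a simplified setting in which the reward depends only on the current state. The names `discounted_moments`, `P`, `r`, and `beta` are hypothetical. With B = r(X_0) + beta * B', the mean v satisfies v = r + beta * P v, and the second moment m2 satisfies m2 = r^2 + 2 * beta * r * (P v) + beta^2 * P m2; the variance is m2 - v^2. Both linear systems are solved here by plain fixed-point iteration, which converges because beta < 1.

```python
def discounted_moments(P, r, beta, tol=1e-12, max_iter=100_000):
    """Mean and variance of the discounted present value, per start state.

    Assumptions (illustrative, not the paper's formulation):
      P    : row-stochastic transition matrix under a fixed policy (list of lists)
      r    : reward received in each state (list of floats)
      beta : discount factor with 0 <= beta < 1
    """
    n = len(r)

    def apply_P(x):
        # Matrix-vector product P @ x in pure Python.
        return [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]

    def solve(c, factor):
        # Fixed point of x = c + factor * P x; a contraction since factor < 1.
        x = [0.0] * n
        for _ in range(max_iter):
            Px = apply_P(x)
            new = [c[i] + factor * Px[i] for i in range(n)]
            if max(abs(a - b) for a, b in zip(new, x)) < tol:
                return new
            x = new
        return x

    # First moment: v = r + beta * P v.
    v = solve(r, beta)

    # Second moment: m2 = r^2 + 2*beta*r*(P v) + beta^2 * P m2.
    Pv = apply_P(v)
    c2 = [r[i] ** 2 + 2 * beta * r[i] * Pv[i] for i in range(n)]
    m2 = solve(c2, beta ** 2)

    var = [m2[i] - v[i] ** 2 for i in range(n)]
    return v, var


# Usage: two states, next state chosen uniformly at random, reward 1 in state 1.
P = [[0.5, 0.5], [0.5, 0.5]]
r = [0.0, 1.0]
mean, var = discounted_moments(P, r, beta=0.5)
```

A criterion of the kind the abstract mentions, mean minus a multiple of the standard deviation, could then be evaluated per start state as `mean[s] - k * var[s] ** 0.5` for a chosen multiplier `k`. This only scores one given policy; it does not address the obstacles to optimizing over policies that the abstract refers to.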

Cited by 79 publications (99 citation statements)
References: 9 publications
“…(c) Note that if the semi-Markov kernel Q(·, ·|x, a) is taken some particular forms, our model can be reduced to the corresponding one of CTMDPs [10,11,12,20] or of DTMDPs [6,24,28,30]; see Section 5 for further details.…”
Section: The Control Model (mentioning)
confidence: 99%
“…The background of mean-variance problems arises from the tradeoff between the mean and variance, and the fact that a risk-aversion investor usually prefers to a return lower than the maximal one to keep a smaller variance risk. Due to this, mean-variance problems have been widely studied for various dynamic systems described by stochastic differential equations [5,7,22,31], Markov decision processes (MDPs) [2,3,8,10,13,21,27,28], and so on.…”
Section: Introduction (mentioning)
confidence: 99%