First passage risk probability optimality for continuous time Markov decision processes

Huo, Haifeng; Wen, Xian

doi:10.14736/kyb-2019-1-0114

Cited by 2 publications

(5 citation statements)

References 24 publications

“…The existence of optimal policies here is guaranteed by using the non-explosion of the controlled state process (see Assumption 1 in our paper), while the existence of optimal policies is guaranteed by using the non-explosion of the controlled state process and the properties of the target set B (see Assumption 3.2 and 3.6 in [15]). (iii) According to different policies, the probability space and the optimality equation in our paper are different from those developed in [15].…”

Section: Introduction (mentioning)

confidence: 99%

“…The results of this process can then be used to measure the risk of a stochastic system (economic and financial systems). Inspired by this situation, risk probability criteria have garnered significant attention and have been widely studied by [1,2,6,10,13,15,26,28,29,31] for Markov decision processes (for short MDPs).…”

Section: Introduction (mentioning)

confidence: 99%

“…Risk probability optimality problems for Markov decision processes are first divided into three groups that are based on the hold times of the system state: discrete-time Markov decision processes (DTMDPs) [2,26,28,29,30,31], semi-Markov decision processes (SMDPs) [10,11,12,13,25], and continuous-time Markov decision processes (CTMDPs) [14,15,16]. Then the second classification is grouped by the risk probability optimization problems with the reward case or the loss case.…”

Section: Introduction (mentioning)

confidence: 99%

“…Compared with the first passage risk probability CTMDPs developed in [15], the considered ones for finite-horizon CTMDPs with loss rates have many different characteristics due to their different performance criteria. (i) To define the policies, both loss levels λ and planning horizons t should be considered the extended states' components, while only reward levels λ have been considered in [15]. (ii) Our condition here is weaker than those proposed in [15].…”

Section: Introduction (mentioning)

confidence: 99%

“…(i) To define the policies, both loss levels λ and planning horizons t should be considered the extended states' components, while only reward levels λ have been considered in [15]. (ii) Our condition here is weaker than those proposed in [15]. The existence of optimal policies here is guaranteed by using the non-explosion of the controlled state process (see Assumption 1 in our paper), while the existence of optimal policies is guaranteed by using the non-explosion of the controlled state process and the properties of the target set B (see Assumption 3.2 and 3.6 in [15]).…”

Section: Introduction (mentioning)

confidence: 99%

Risk probability optimization problem for finite horizon continuous time Markov decision processes with loss rate

Huo, Wen

2021

Kybernetika

Self Cite

The optimal probability of the risk for finite horizon partially observable Markov decision processes

Wen, Huo, Cui

2023

MATH

Abstract: This paper investigates the optimality of the risk probability for finite horizon partially observable discrete-time Markov decision processes (POMDPs). The probability of the risk is optimized based on the criterion of total rewards not exceeding the preset goal value, which is different from the optimal problem of expected rewards. Based on the Bayes operator and the filter equations, the optimization problem of risk probability can be equivalently reformulated as filtered Markov decision processes. As an advantage of developing the value iteration technique, the optimality equation satisfied by the value function is established and the existence of the risk probability optimal policy is proven. Finally, an example is given to illustrate the effectiveness of using the value iteration algorithm to compute the value function and optimal policy.
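The risk probability criterion described in this abstract can be illustrated with a minimal sketch. It is a deliberate simplification and not the paper's method: the state is assumed fully observable (no Bayes operator or filter equations), rewards are integers, and the two-state model `P`, `R` and the function `risk` are hypothetical names invented for this example. The sketch computes, by backward induction over the extended state (state, remaining goal level, steps left), the minimal probability that the total reward does not exceed the goal.

```python
from functools import lru_cache

# Hypothetical toy model (an assumption for illustration, not from the paper).
# P[s][a] lists (next_state, probability); R[s][a] is an integer one-step reward.
P = {
    0: {0: [(0, 0.5), (1, 0.5)], 1: [(0, 1.0)]},
    1: {0: [(1, 1.0)], 1: [(0, 0.3), (1, 0.7)]},
}
R = {
    0: {0: 2, 1: 1},
    1: {0: 0, 1: 3},
}


@lru_cache(maxsize=None)
def risk(s, goal, n):
    """Minimal probability that the total reward over n remaining steps
    is <= goal, starting from state s."""
    if n == 0:
        # No steps left: the total reward is 0, so the risk event
        # occurs exactly when 0 <= goal.
        return 1.0 if goal >= 0 else 0.0
    best = 1.0
    for a in P[s]:
        # Collecting reward R[s][a] lowers the remaining goal level.
        remaining = goal - R[s][a]
        value = sum(p * risk(s2, remaining, n - 1) for s2, p in P[s][a])
        best = min(best, value)
    return best
```

On this toy model, `risk(0, 4, 2)` evaluates to 0.5: taking action 0 in state 0 leaves a remaining goal of 2, which the next step can beat only from state 1.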
