Congbin Wu scite author profile

Congbin Wu

2 Publications

69 Citation Statements Received

4 Citation Statements Given


Affiliations

Tsinghua University

Publications


Minimizing Risk Models in Markov Decision Processes with Policies Depending on Target Values

Lin

1999

Journal of Mathematical Analysis and Applications


This paper studies risk-minimization problems in Markov decision processes with a countable state space and reward set. The objective is to find a policy that minimizes the probability (risk) that the total discounted reward does not exceed a specified value (target). In this sort of model, the decision made by the decision maker depends not only on the system's state but also on his target value. By introducing the decision-maker's state, we formulate a framework for minimizing risk models. The policies discussed depend on target values, and the rewards may be arbitrary real numbers. For the finite horizon model, the main results obtained are: (i) the optimal value functions are distribution functions of the target, (ii) there exists an optimal deterministic Markov policy, and (iii) a policy is optimal if and only if at each realizable state it always takes an optimal action. In addition, we obtain a sufficient condition and a necessary condition for the existence of a finite horizon optimal policy independent of targets, and we give an algorithm for computing finite horizon optimal policies and optimal value functions. For the infinite horizon model, we establish the optimality equation and obtain the structural property of the optimal policy. We prove that the optimal value function is a distribution function of the target, and we present a new approximation formula that generalizes the nonnegative rewards case. An example illustrating the mistakes of the previous literature shows that the existence of an optimal policy has not actually been proved. In this paper, we give a necessary and sufficient condition for the existence of an infinite horizon optimal policy independent of targets, and we point out that whether an optimal policy exists remains an open problem in the general case.
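To make the target-dependent objective in the abstract concrete, here is a minimal sketch of a finite horizon backward recursion on a toy model. It is not the paper's algorithm: the one-state model, the discount factor, and the names `TRANSITIONS`, `BETA`, `min_risk`, and `best_action` are all assumptions made for illustration. The recursion rests only on the observation that a discounted total r + BETA * (future reward) is at most a target s exactly when the future reward is at most (s - r) / BETA.

```python
from functools import lru_cache

# Hypothetical toy model (not from the paper):
# TRANSITIONS[state][action] = list of (probability, reward, next_state).
TRANSITIONS = {
    0: {
        "safe": [(1.0, 1.0, 0)],
        "risky": [(0.5, 0.0, 0), (0.5, 3.0, 0)],
    },
}
BETA = 0.5  # assumed discount factor


def action_risk(outcomes, target, horizon):
    """Risk of one action: expected minimal risk after the target is updated."""
    return sum(
        p * min_risk(nxt, (target - r) / BETA, horizon - 1)
        for p, r, nxt in outcomes
    )


@lru_cache(maxsize=None)
def min_risk(state, target, horizon):
    """Minimal probability that the discounted reward over `horizon` steps is <= target."""
    if horizon == 0:
        # No steps left: the total reward is 0, which is <= target iff target >= 0.
        return 1.0 if target >= 0 else 0.0
    return min(
        action_risk(outcomes, target, horizon)
        for outcomes in TRANSITIONS[state].values()
    )


def best_action(state, target, horizon):
    """Action with the lowest risk for this (state, target) pair; horizon must be >= 1."""
    actions = TRANSITIONS[state]
    return min(actions, key=lambda a: action_risk(actions[a], target, horizon))


if __name__ == "__main__":
    for s in (0.5, 2.0):
        print(s, best_action(0, s, 1), min_risk(0, s, 1))
```

In this toy model the preferred action changes with the target even though the system state is the same: with one step left, a target of 0.5 favors "safe", while a target of 2.0 favors "risky". This is the sense in which the policy depends on the target value as well as the state.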


Optimal models with maximizing probability of first achieving target value in the preceding stages

Lin

Kang

2003

Sci. China Ser. A-Math.

