This paper deals with discrete-time Markov control processes in Borel spaces with unbounded rewards. Under suitable hypotheses, we show that a randomized stationary policy is optimal for a certain expected constrained problem (ECP) if and only if it is optimal for the corresponding pathwise constrained problem (pathwise CP). Moreover, we show that a certain parametric family of unconstrained optimality equations has convergence properties that yield an approximation scheme in which constrained optimal policies are obtained as limits of unconstrained deterministic optimal policies. In addition, we give sufficient conditions for the existence of deterministic policies that solve these constrained problems.
This article is concerned with the limiting average variance for discrete-time Markov control processes in Borel spaces, subject to pathwise constraints. Under suitable hypotheses, we show that within the class of deterministic stationary optimal policies for the pathwise constrained problem, there exists one with minimal variance.