Constrained Discounted Markov Decision Chains

Abstract: The fundamental limits of remote estimation of autoregressive Markov processes under communication constraints are presented. The remote estimation system consists of a sensor and an estimator. The sensor observes a discrete-time autoregressive Markov process driven by a symmetric and unimodal innovations process. At each time, the sensor either transmits the current state of the Markov process or does not transmit at all. The estimator estimates the Markov process based on the transmitted observations. In such a system, there is a trade-off between communication cost and estimation accuracy. Two fundamental limits of this trade-off are characterized for infinite horizon discounted cost and average cost setups. First, when each transmission is costly, we characterize the minimum achievable cost of communication plus estimation error. Second, when there is a constraint on the average number of transmissions, we characterize the minimum achievable estimation error. Transmission and estimation strategies that achieve these fundamental limits are also identified.

“…To describe the solution of Problem 2, we first define Bernoulli randomized strategy and Bernoulli randomized simple strategy [36].…”

Section: Results For Constrained Communication (mentioning)

confidence: 99%

Fundamental Limits of Remote Estimation of Autoregressive Markov Processes Under Communication Constraints

Chakravorty

Mahajan

2017

IEEE Trans. Automat. Contr.

“…Such problems frequently arise in computer networks and data communications, see Lazar [18], Spieksma and Hordijk [14], Nain and Ross [19] and Altman and Shwartz [1]. The theory for solving constrained MDPs was developed by Hordijk and Kallenberg [16], Kallenberg [17], Beutler and Ross [8], Altman and Shwartz [2,4], Altman [6], Spieksma [22], Sennott [20,21] and Borkar [9].…”

Section: Introduction (mentioning)

confidence: 99%

Asymptotic properties of constrained Markov Decision Processes

Altman

1993

ZOR - Methods and Models of Operations Research

We present in this paper several asymptotic properties of constrained Markov Decision Processes (MDPs) with a countable state space. We treat both the discounted and the expected average cost, with unbounded cost. We are interested in (1) the convergence of finite horizon MDPs to the infinite horizon MDP, (2) convergence of MDPs with a truncated state space to the problem with infinite state space, (3) convergence of MDPs as the discount factor goes to a limit. In all these cases we establish the convergence of optimal values and policies. Moreover, based on the optimal policy for the limiting problem, we construct policies which are almost optimal for the other (approximating) problems. Based on the convergence of MDPs with a truncated state space to the problem with infinite state space, we show that an optimal stationary policy exists such that the number of randomisations it uses is less than or equal to the number of constraints plus one. We finally apply the results to a dynamic scheduling problem.

“…The case of finite, denumerable, and compact state spaces has been widely dealed (Beutler and Ross 1985; Sennott 1993; Borkar 1994; Kurano et al 2000a; Piunovskiy 1993, 1997; Piunovskiy and Khametov 1991; Tanaka 1991; Hu and Yue 2008). The discounted performance criteria has also already been dealed (Feimberg and Shwartz 1996; González-Hernández and Hernández-Lerma 2005; Hernández-Lerma and González-Hernández 2000; and Sennott 1991). We can see other related aspects (Collins and McNamara 1998; Kurano et al 2000b; and Yushkevich 1997).…”

Section: Introduction (mentioning)

confidence: 94%

“…It is already known that for Markov decision constraint problems, there exist optimal randomized policies (Beutler and Ross 1985; Borkar 1994; Frid 1972; González-Hernández and Hernández-Lerma 2005; Haviv 1996; Sennott 1991, 1993). The case of finite, denumerable, and compact state spaces has been widely dealed (Beutler and Ross 1985; Sennott 1993; Borkar 1994; Kurano et al 2000a; Piunovskiy 1993, 1997; Piunovskiy and Khametov 1991; Tanaka 1991; Hu and Yue 2008).…”

Section: Introduction (mentioning)

confidence: 99%