In this article, we develop a novel role for the initial function v_0 in the value iteration algorithm. When the optimal policy of a countable-state Markovian queueing control problem has a threshold or switching curve structure, we conjecture that the choice of v_0 can be tuned to generate monotone sequences of n-stage threshold or switching curve optimal policies. We show this for three queueing control models: the M/M/1 queue with admission control, the M/M/1 queue with service control, and the two-competing-queues model with quadratic holding cost. As a consequence, we obtain increasingly tight upper and lower bounds. After a finite number of iterations, either the optimal threshold or the optimal switching curve values in a finite number of states are available. This procedure can be used to increase numerical efficiency.
KEYWORDS: deriving bounds, optimal policies, value iteration

This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
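To illustrate the kind of computation the article studies, the following Python sketch runs value iteration for an M/M/1 admission-control queue from two different initial functions and records the n-stage optimal admission threshold at every iteration. All model ingredients here (the uniformized rates lam and mu, holding cost h, admission reward R, discount factor beta, the truncation level X_MAX, and both choices of v_0) are assumptions chosen for demonstration only; the article's specific tuning of v_0 and its monotonicity results are not reproduced.

```python
import numpy as np

# Illustrative sketch: value iteration for an M/M/1 queue with admission control,
# tracking the n-stage optimal threshold at each iteration.  Parameters and the
# two initial functions below are assumptions for demonstration, not taken from
# the article.

lam, mu = 0.4, 0.6      # uniformized arrival / service rates (lam + mu = 1)
h, R = 1.0, 10.0        # linear holding cost rate, admission reward
beta = 0.95             # discount factor of the uniformized chain
X_MAX = 200             # state-space truncation level (for illustration only)
x = np.arange(X_MAX + 1)

def vi_thresholds(v0, n_iter=50):
    """Run value iteration from the initial function v0 and record, for each
    iteration n, the n-stage optimal threshold: the smallest state at which
    rejecting an arriving customer is (weakly) preferred to admitting it."""
    v = np.asarray(v0, dtype=float).copy()
    thresholds = []
    for _ in range(n_iter):
        accept = np.empty_like(v)
        accept[:-1] = v[1:] - R          # admit: collect reward R, move to x + 1
        accept[-1] = v[-1] - R + 1e9     # forbid admission at the truncation boundary
        reject = v                       # reject: state unchanged
        arrival = np.minimum(accept, reject)
        departure = v[np.maximum(x - 1, 0)]
        v = h * x + beta * (lam * arrival + mu * departure)
        better_to_reject = reject <= accept
        thresholds.append(int(np.argmax(better_to_reject)))
    return thresholds

# Two initial functions: v_0 = 0 and a steep linear v_0.  Under the kind of
# tuning the article conjectures, suitably chosen initial functions can make
# the n-stage thresholds approach the optimal threshold monotonically from
# opposite sides, yielding upper and lower bounds.
t_a = vi_thresholds(np.zeros(X_MAX + 1))
t_b = vi_thresholds(50.0 * x)
print("n-stage thresholds from v_0 = 0:    ", t_a[:10])
print("n-stage thresholds from v_0 = 50*x: ", t_b[:10])
```

In this sketch the n-stage threshold is read off directly from the minimization inside the Bellman operator, so monitoring the two runs shows when the threshold sequences from the two initial functions meet, at which point no further iterations are needed for that value.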