Discrete versions of an algorithm due to Varaiya

Popyack, Jeffrey L.; Brown, Raymond D.; White, Chelsea C.

doi:10.1109/tac.1979.1102063

Cited by 19 publications

(4 citation statements)

References 3 publications


“…These authors extended the original value-iteration bounds of MacQueen (1966) for the discounted cost case to the average cost case. The modified value-iteration algorithm with a dynamic relaxation factor comes from Popyack et al (1979). The first proof of the geometric convergence of the undiscounted value-iteration algorithm was given by White (1963) under a very strong recurrence condition.…”

Section: Bibliographic Notes (mentioning)

confidence: 99%

A First Course in Stochastic Models

Tijms

2003


Tijms, H. C. A first course in stochastic models / Henk C. Tijms. p. cm. Includes bibliographical references and index. ISBN 0-471-49880-7 (acid-free paper), ISBN 0-471-49881-5 (pbk. : acid-free paper). 1. Stochastic processes. I. Title. QA274.T46 2003. 519.2'3-dc21. 2002193371. British Library Cataloguing in Publication Data: A catalogue record for this book is available from the British Library. ISBN 0-471-49880-7 (Cloth), ISBN 0-471-49881-5 (Paper).


“…we see that if γ^k = 1 for all k, the new value iteration (7)-(8) becomes similar to the known value iteration (9)-(10): the updating formulas are the same in both methods, but the order of updating λ is just reversed relatively to the order of updating h. We note that there is also a variant of the standard method (9)-(10) that involves interpolations between h^k and h^{k+1} according to a stepsize parameter (see [Sch71], [Pla77], [Var78], [PBW79], [Put94], [Ber95]). However, the new method does not seem as closely related to this variant.…”

Section: Furthermore λ* Together With a Differential Cost Vector h = (mentioning)

confidence: 99%

A New Value Iteration method for the Average Cost Dynamic Programming Problem

Bertsekas

1998

SIAM J. Control Optim.
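
The citation statement above mentions a variant of standard value iteration that interpolates between successive differential-cost iterates according to a stepsize parameter. The following is a minimal sketch of that general idea only, written as a damped relative value iteration for a small average-cost problem. It is not taken from Popyack et al. or from Bertsekas; the function name, the `tau` stepsize, the `ref` reference state, and the two-state example are all assumptions made for illustration.

```python
def damped_relative_value_iteration(costs, trans, tau=0.5, ref=0,
                                    tol=1e-10, max_iter=10_000):
    """Illustrative sketch (hypothetical helper, not from the cited papers).

    costs[i][u]    : one-stage cost of action u in state i
    trans[i][u][j] : probability of moving from i to j under action u
    tau            : interpolation stepsize in (0, 1]; tau = 1 is undamped
    ref            : reference state whose differential cost is pinned to 0
    """
    n = len(costs)

    def backup(h):
        # Bellman backup: minimize one-stage cost plus expected differential cost.
        return [
            min(costs[i][u] + sum(p * h[j] for j, p in enumerate(trans[i][u]))
                for u in range(len(costs[i])))
            for i in range(n)
        ]

    h = [0.0] * n
    for _ in range(max_iter):
        th = backup(h)
        # Interpolate between the old iterate and its backup.
        mixed = [(1 - tau) * h[i] + tau * th[i] for i in range(n)]
        # Renormalize so the reference state stays at zero.
        h_new = [m - mixed[ref] for m in mixed]
        done = max(abs(a - b) for a, b in zip(h_new, h)) < tol
        h = h_new
        if done:
            break

    # At a fixed point, backup(h) - h is constant and equals the average cost.
    gain = backup(h)[ref] - h[ref]
    return gain, h


# Assumed two-state example: action 0 switches state, action 1 stays put.
costs = [[1.0, 2.2], [3.0, 2.5]]
trans = [
    [[0.0, 1.0], [1.0, 0.0]],  # from state 0
    [[1.0, 0.0], [0.0, 1.0]],  # from state 1
]
gain, h = damped_relative_value_iteration(costs, trans, tau=0.5)
```

In this assumed example the cheapest long-run behaviour is to alternate between the two states, giving an average cost of (1 + 3) / 2 = 2. Setting `tau=1` removes the interpolation; undamped relative value iteration may fail to converge on periodic chains, which is one common motivation for a stepsize below 1.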


“…This can be done by linear programming (LP) [3], [8], [12], [17], value iteration [4], [7], [9], [16], [18], [23], [25] or policy iteration [9], [10], [11]. Policy iteration was first proposed by Howard [9].…”

Section: … A(n)} ∈ K := K(1) × K(2) × ··· × K(n) (mentioning)

confidence: 99%

A new policy iteration scheme for Markov decision processes using Schweitzer's formula

Lasserre

1994

Journal of Applied Probability


Given a family of Markov chains with a single recurrent class, we present a potential application of Schweitzer's exact formula relating the steady-state probability and fundamental matrices of any two chains in the family. We propose a new policy iteration scheme for Markov decision processes where in contrast to policy iteration, the new criterion for selecting an action ensures the maximal one-step average cost improvement. Its computational complexity and storage requirement are analysed.
