2015 American Control Conference (ACC), 2015
DOI: 10.1109/acc.2015.7171887

Finite state approximations of Markov decision processes with general state and action spaces

Abstract: The purpose of this paper is to prove existence of an ε-equilibrium point in a dynamic Nash game with Borel state space and long-run time average cost criteria for the players. The idea of the proof is first to convert the initial game with ergodic costs to an "equivalent" game endowed with discounted costs for some appropriately chosen value of the discount factor, and then to approximate the discounted Nash game obtained in the first step with a countable state space game for which existence of a Nash equili…

Cited by 5 publications (1 citation statement)
References: 32 publications

“…This technique merges the states bearing similar transition probabilities and rewards into one meta-state to obtain a smaller MDP. Several works have been reported on the upper bound derivation for deviation of the total aggregated MDP reward from that of the original MDP in both discounted ( [23], [24]) and non-discounted [25] MDPs. However, to our best knowledge, there exists no reported work on state aggregation for optimal energy management in large-scale data centers that incorporates the stochastic characteristics of the underlying system.…”
Section: State Of the Art And Prior Work
Citation type: mentioning
confidence: 99%
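
The citation statement describes state aggregation: states with similar transition probabilities and rewards are merged into one meta-state to obtain a smaller MDP. A minimal sketch of that idea follows, assuming a finite MDP, a partition that is already given, and uniform weighting of the states inside each meta-state. The function name `aggregate_mdp`, the argument layout, and the uniform weighting are illustrative assumptions, not the method of the cited works or of the paper itself.

```python
from collections import defaultdict


def aggregate_mdp(P, R, groups):
    """Aggregate a finite MDP given a state -> meta-state assignment.

    P[a][s][t]: probability of moving from state s to state t under action a.
    R[s][a]:    expected one-step reward for taking action a in state s.
    groups[s]:  index of the meta-state that state s is merged into
                (assumed to cover 0..max(groups) with no empty meta-state).
    States inside a meta-state are weighted uniformly (an assumption).
    """
    n_meta = max(groups) + 1
    n_actions = len(P)

    # Collect the original states that belong to each meta-state.
    members = defaultdict(list)
    for s, g in enumerate(groups):
        members[g].append(s)

    P_agg = [[[0.0] * n_meta for _ in range(n_meta)] for _ in range(n_actions)]
    R_agg = [[0.0] * n_actions for _ in range(n_meta)]

    for g, states in members.items():
        w = 1.0 / len(states)  # uniform weight within the meta-state
        for s in states:
            for a in range(n_actions):
                # Meta-state reward is the average reward of its members.
                R_agg[g][a] += w * R[s][a]
                # Probability mass flowing to state t is credited to t's meta-state.
                for t, p in enumerate(P[a][s]):
                    P_agg[a][g][groups[t]] += w * p
    return P_agg, R_agg


# Hypothetical 3-state, 1-action MDP in which states 0 and 1 behave similarly.
P = [[[0.5, 0.3, 0.2],
      [0.4, 0.4, 0.2],
      [0.1, 0.1, 0.8]]]
R = [[1.0], [1.0], [0.0]]
P_agg, R_agg = aggregate_mdp(P, R, groups=[0, 0, 1])
print(P_agg[0])
print(R_agg)
```

Because the weights inside each meta-state sum to one, every row of the aggregated transition matrix still sums to one. How far the aggregated value function deviates from the original depends on how similar the merged states are, which is the quantity the bounds mentioned in the citation statement address.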