2012
DOI: 10.1137/100798557
Action Time Sharing Policies for Ergodic Control of Markov Chains

Abstract: Ergodic control for discrete-time controlled Markov chains with a locally compact state space and a compact action space is considered under suitable stability, irreducibility, and Feller continuity conditions. A flexible family of controls, called action time sharing (ATS) policies, associated with a given continuous stationary Markov control, is introduced. It is shown that the long-term average cost for such a control policy, for a broad range of one-stage cost functions, is the same as that for the associat…
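The abstract's central claim concerns long-run average costs under a stationary Markov control. The quantity in question can be illustrated with a minimal simulation sketch (not from the paper; the two-state MDP, costs, and policy below are invented for illustration): for a fixed stationary policy, the empirical time-average cost converges to the expected one-stage cost under the induced chain's stationary distribution.

```python
import numpy as np

# Illustrative sketch (assumed example, not the paper's model):
# a 2-state, 2-action MDP under a fixed stationary Markov control.

# P[a][s] is the next-state distribution when action a is taken in state s.
P = {
    0: np.array([[0.9, 0.1],
                 [0.2, 0.8]]),
    1: np.array([[0.5, 0.5],
                 [0.6, 0.4]]),
}
cost = np.array([[1.0, 2.0],   # cost[state, action]
                 [0.5, 3.0]])

policy = np.array([0, 1])  # stationary control: one action per state

# Transition matrix of the chain induced by the policy.
P_pi = np.array([P[policy[s]][s] for s in range(2)])

# Analytic long-run average cost: expected cost under the
# stationary distribution of P_pi (left eigenvector for eigenvalue 1).
eigvals, eigvecs = np.linalg.eig(P_pi.T)
stat = np.real(eigvecs[:, np.argmax(np.real(eigvals))])
stat = stat / stat.sum()
analytic = float(stat @ cost[np.arange(2), policy])

# Monte Carlo estimate of the same time-average cost.
rng = np.random.default_rng(0)
s, total, T = 0, 0.0, 200_000
for _ in range(T):
    a = policy[s]
    total += cost[s, a]
    s = rng.choice(2, p=P[a][s])
empirical = total / T  # converges to `analytic` by the ergodic theorem
```

An ATS policy, as described in the abstract, would interleave other actions into this trajectory while preserving the same long-run average; the sketch only shows the baseline stationary-control quantity it is compared against.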

Cited by 1 publication (1 citation statement)
References 13 publications
“…Methods in unknown MDP estimation and inverse reinforcement learning aim to learn an optimal policy while estimating an unknown quantity of the MDP, such as the transition law (Burnetas & Katehakis, 1997), secondary parameters (Budhiraja et al., 2012), and the reward function (Ng & Russell, 2000). The maximum entropy IRL framework has proved successful at learning reward functions from expert demonstrations (Ziebart et al., 2008; Boularias et al., 2011; Kalakrishnan et al., 2013).…”

Section: Related Work
Confidence: 99%