2021
DOI: 10.1007/978-3-030-76928-4_9

Robustness to Approximations and Model Learning in MDPs and POMDPs

Abstract: This thesis develops a topological approach to robustness to incorrect or incomplete models in MDPs and POMDPs, with applications to approximations, model learning, and reinforcement learning.

Cited by 4 publications (4 citation statements)
References 91 publications (183 reference statements)
“…Since the constructed kernel P is weakly continuous, the cost function is bounded continuous, and the state space is compact, the proof is then a corollary of [53, Theorem 4.27] (see also [41, Theorem 11]) by viewing the finite window truncation as a quantization (with a uniformly bounded radius for each bin) of the state space under the product topology.…”
Section: Discounted Cost Criterion: Refined Existence Results and Near…
confidence: 99%
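The quoted argument treats the finite window truncation as a quantization of a compact state space, where every quantization bin has a uniformly bounded radius. As a minimal illustration of that idea (not code from the cited works; the interval, bin count, and function names are hypothetical), a uniform quantizer on [0, 1] maps each state to its bin midpoint, so the per-state approximation error is bounded by half the bin width:

```python
import numpy as np

def quantize(x, n_bins, lo=0.0, hi=1.0):
    """Map states in the compact interval [lo, hi] to nearest bin
    midpoints; each bin has radius (hi - lo) / (2 * n_bins)."""
    x = np.asarray(x, dtype=float)
    width = (hi - lo) / n_bins
    idx = np.clip(((x - lo) / width).astype(int), 0, n_bins - 1)
    return lo + (idx + 0.5) * width  # bin representative (midpoint)

states = np.array([0.0, 0.49, 0.51, 1.0])
print(quantize(states, n_bins=10))  # error per state is at most 0.05
```

Refining the quantization (larger `n_bins`) shrinks the uniform error bound, which is the mechanism behind the near-optimality of the finite-model approximation invoked in the quote.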
“…Models and Methods. Classical discrete-time Markov decision processes (MDPs) are considered in [3,4,6,8,9,12,15]; continuous-time Markov, semi-Markov, and more general processes are considered in [2,5,10,11,19]. Chapters [4,14,18] are about various types of stochastic games, including the game against nature [4].…”
Section: Introduction
confidence: 99%
“…As for the methods, dynamic programming is useful on many occasions [3,4,6,8-10,15]. When some probabilities (e.g., those describing the dynamics of the process) are not precisely known, the Bayesian approach [9,12,14], Q-learning [3,4], optimal filtering [19], robust control [1,4,12], and H₂ control [5] can be useful. Let us also mention variational inequalities [11] and self-organizing algorithms [7].…”
Section: Introduction
confidence: 99%
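Q-learning, mentioned in the quote above as a method for when transition probabilities are not precisely known, can be sketched in tabular form. The two-state MDP below is a hypothetical example chosen for illustration (it does not come from the cited works); the agent only samples transitions and never uses the transition kernel directly:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2-state, 2-action MDP: in state 0, action 1 usually
# moves to state 1, and landing in state 1 pays reward 1. The agent
# observes sampled transitions only -- the kernel is "unknown" to it.
def step(s, a):
    if s == 0:
        s_next = 1 if (a == 1 and rng.random() < 0.9) else 0
    else:
        s_next = 1 if rng.random() < 0.8 else 0
    reward = 1.0 if s_next == 1 else 0.0
    return s_next, reward

Q = np.zeros((2, 2))                  # Q[state, action]
gamma, alpha, eps = 0.9, 0.1, 0.1     # discount, step size, exploration
s = 0
for _ in range(20000):
    # epsilon-greedy action selection
    a = int(rng.integers(2)) if rng.random() < eps else int(np.argmax(Q[s]))
    s_next, r = step(s, a)
    # Q-learning update: bootstrap from the greedy value of the next state
    Q[s, a] += alpha * (r + gamma * np.max(Q[s_next]) - Q[s, a])
    s = s_next

print(int(np.argmax(Q[0])))  # greedy action learned for state 0
```

With enough samples the greedy policy extracted from `Q` favors action 1 in state 0, the action that reliably reaches the rewarding state, even though the transition probabilities were never given to the learner.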