2020
DOI: 10.1007/s10957-020-01681-2

Reachability and Safety Objectives in Markov Decision Processes on Long but Finite Horizons

Abstract: We consider discrete-time Markov decision processes in which the decision maker is interested in long but finite horizons. First we consider the reachability objective: the decision maker’s goal is to reach a specific target state with the highest possible probability. A strategy is said to overtake another strategy if it gives a strictly higher probability of reaching the target state on all sufficiently large but finite horizons. We prove that there exists a pure stationary strategy that is not overtaken by any…
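
To make the finite-horizon reachability objective concrete, here is a minimal sketch that computes, by backward induction, the highest probability of reaching a target state within n steps. It is not taken from the paper: the dictionary encoding of the MDP, the function name `max_reach_probabilities`, and the three-state toy model are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding (an assumption, not the paper's notation):
# mdp[state][action] = list of (next_state, probability) pairs.
MDP = Dict[str, Dict[str, List[Tuple[str, float]]]]


def max_reach_probabilities(mdp: MDP, target: str, horizon: int) -> List[Dict[str, float]]:
    """Return values where values[n][s] is the maximal probability of
    reaching `target` from state s within n steps."""
    # With zero steps left, only the target itself counts as reached.
    values = [{s: 1.0 if s == target else 0.0 for s in mdp}]
    for _ in range(horizon):
        prev = values[-1]
        current = {}
        for state, actions in mdp.items():
            if state == target:
                current[state] = 1.0
            else:
                # Pick the action with the best expected value one step later.
                current[state] = max(
                    sum(p * prev[nxt] for nxt, p in successors)
                    for successors in actions.values()
                )
        values.append(current)
    return values


# Toy model: "risky" is better on a one-step horizon, "safe" on long ones.
toy: MDP = {
    "start": {
        "safe": [("start", 0.5), ("goal", 0.5)],
        "risky": [("goal", 0.75), ("sink", 0.25)],
    },
    "goal": {"stay": [("goal", 1.0)]},
    "sink": {"stay": [("sink", 1.0)]},
}

values = max_reach_probabilities(toy, "goal", 3)
for n, v in enumerate(values):
    print(n, v["start"])
```

In this toy model, always playing "safe" reaches the goal within n steps with probability 1 − 0.5^n, which exceeds the 0.75 achieved by always playing "risky" for every n ≥ 3. In the abstract's terminology, the "safe" strategy therefore overtakes the "risky" one.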

Citations: Cited by 2 publications (2 citation statements)
References: References 19 publications
“…Markov decision processes (MDPs) are a standard model for dynamic systems that exhibit both stochastic and controlled behavior [17]. Applications include control theory [6,1], operations research and finance [2,4,19], artificial intelligence and machine learning [22,20], and formal verification [10,3].…”
Section: Introduction (mentioning)
confidence: 99%
“…directly. (2) The mean payoff considers the lim inf of the sequence $\left(\frac{1}{n}\sum_{i=0}^{n-1} r_i\right)_{n\in\mathbb{N}}$. (3) The total payoff considers the lim inf of the sequence $\left(\sum_{i=0}^{n-1} r_i\right)_{n\in\mathbb{N}}$, i.e., the sum of all rewards seen so far. For each of the three cases above, the lim inf threshold objective is to maximize the probability that the lim inf of the respective type of sequence is ≥ 0.…”
Section: Introduction (mentioning)
confidence: 99%
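
As a rough illustration of the mean and total payoff sequences described in the statement above, the following sketch computes both for a finite prefix of rewards. The function names and the sample reward list are assumptions made for illustration. A finite prefix cannot determine a lim inf, so the code only shows how the sequences themselves are built.

```python
from itertools import accumulate
from typing import List, Sequence


def total_payoff_prefixes(rewards: Sequence[float]) -> List[float]:
    """Entry n-1 is the sum of the first n rewards r_0, ..., r_{n-1}."""
    return list(accumulate(rewards))


def mean_payoff_prefixes(rewards: Sequence[float]) -> List[float]:
    """Entry n-1 is the average of the first n rewards."""
    return [total / n for n, total in enumerate(accumulate(rewards), start=1)]


# Hypothetical reward prefix, chosen only to show the bookkeeping.
rewards = [1, -2, 3, -2]
totals = total_payoff_prefixes(rewards)
means = mean_payoff_prefixes(rewards)
print(totals)
print(means)
```

On an infinite run, the threshold objective quoted above asks whether the lim inf of such a sequence is at least 0. The prefixes computed here are the terms over which that lim inf is taken.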