Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence 2021
DOI: 10.24963/ijcai.2021/404

Stochastic Shortest Path with Adversarially Changing Costs

Abstract: Stochastic shortest path (SSP) is a well-known problem in planning and control, in which an agent has to reach a goal state in minimum total expected cost. In this paper we present the adversarial SSP model that also accounts for adversarial changes in the costs over time, while the underlying transition function remains unchanged. Formally, an agent interacts with an SSP environment for K episodes, the cost function changes arbitrarily between episodes, and the transitions are unknown to the agent. We develop…
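The interaction protocol described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function name `simulate_adversarial_ssp`, the toy two-action environment, the cost values, and the fixed policy. The sketch only shows the setting (fixed transitions, a different cost function in each of K episodes, each episode ending at the goal state); it does not implement any learning algorithm.

```python
import random


def simulate_adversarial_ssp(transitions, episode_costs, policy, start, goal,
                             rng, max_steps=10_000):
    """Run one episode per cost function and return the total cost of each.

    transitions:   {(state, action): {next_state: probability}}, fixed across episodes
    episode_costs: list of K dicts {(state, action): cost}, one per episode
    policy:        function state -> action (hypothetical fixed policy, no learning)
    max_steps:     safety cap (an assumption) so a non-terminating policy cannot loop forever
    """
    totals = []
    for cost in episode_costs:  # the cost function may change between episodes
        state, total = start, 0.0
        for _ in range(max_steps):
            if state == goal:  # the episode ends when the goal state is reached
                break
            action = policy(state)
            total += cost[(state, action)]
            next_states, probs = zip(*transitions[(state, action)].items())
            state = rng.choices(next_states, weights=probs, k=1)[0]
        totals.append(total)
    return totals


# Toy environment (hypothetical): one non-goal state with two actions.
TRANSITIONS = {
    ("s0", "safe"): {"goal": 1.0},
    ("s0", "risky"): {"goal": 0.5, "s0": 0.5},
}

# K = 3 episodes, each with its own cost function.
EPISODE_COSTS = [
    {("s0", "safe"): 1.0, ("s0", "risky"): 0.25},
    {("s0", "safe"): 0.5, ("s0", "risky"): 1.0},
    {("s0", "safe"): 0.75, ("s0", "risky"): 0.25},
]


def always_safe(state):
    """Hypothetical policy that always takes the deterministic action."""
    return "safe"


totals = simulate_adversarial_ssp(
    TRANSITIONS, EPISODE_COSTS, always_safe, "s0", "goal", random.Random(0)
)
print(totals)  # [1.0, 0.5, 0.75]
```

Because the `safe` action reaches the goal in one step, each episode's total equals that episode's cost for `("s0", "safe")`. A learning agent would additionally have to choose its policy without knowing `TRANSITIONS`, which this sketch leaves out.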

Cited by 7 publications (12 citation statements)
References 1 publication
“…In general, the costs may be chosen to be known (as in our case and Rosenberg et al [25]), unknown and requiring estimation as in Chen et al [7], or adversarial as in Neu et al [21], Rosenberg and Mansour [24], Chen et al [8]. The adversarial setting is quite distinct, while the known and unknown costs cases have similar approaches.…”
Section: Stochastic Shortest Paths With Unknown Transitions (mentioning)
confidence: 99%
“…Proposition 1 ([21], [22], [24]): The set of all feasible occupancy measures Q is a non-empty polytope given by…”
Section: Preliminaries: Occupancy Measure In Episodic MDP (mentioning)
confidence: 99%
“…The system restarts at the end of each episode, and a new episode begins with an initial state. Moreover, following the previous literature [20]-[24], we shall focus on the layered episodic MDP model as described in the following assumption.…”
Section: Problem Formulation (mentioning)
confidence: 99%
“…Related Work Regret minimization in SSP has received much attention recently for both stochastic environment (Tarbouriech et al, 2020; Cohen et al, 2020, 2021; Tarbouriech et al, 2021; Chen et al, 2021a,b; Jafarnia-Jahromi et al, 2021) and adversarial environment (Rosenberg and Mansour, 2021; Chen et al, 2021d;. All previous approaches are either value-based (e.g.…”
Section: Introduction (mentioning)
confidence: 99%