Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence 2018
DOI: 10.24963/ijcai.2018/650

Planning and Learning with Stochastic Action Sets

Abstract: In many practical uses of reinforcement learning (RL) the set of actions available at a given state is a random variable, with realizations governed by an exogenous stochastic process. Somewhat surprisingly, the foundations for such sequential decision processes have been unaddressed. In this work, we formalize and investigate MDPs with stochastic action sets (SAS-MDPs) to provide these foundations. We show that optimal policies and value functions in this model have a structure that admits a compact represent…
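To make the abstract's setup concrete, here is a minimal, hypothetical sketch of tabular Q-learning in which the set of actions available at each state is re-sampled from an exogenous process. The interfaces env.reset, env.step, and sample_available_actions are illustrative assumptions, not the paper's API.

```python
import random
from collections import defaultdict

def sas_q_learning(env, sample_available_actions, episodes=1000,
                   alpha=0.1, gamma=0.99, epsilon=0.1):
    """Tabular Q-learning where the set of available actions is a random
    variable drawn by an exogenous process (all interfaces hypothetical)."""
    Q = defaultdict(float)  # (state, action) -> value estimate

    for _ in range(episodes):
        state = env.reset()
        available = sample_available_actions(state)  # realized action set
        done = False
        while not done:
            # Epsilon-greedy, but restricted to the realized action set.
            if random.random() < epsilon:
                action = random.choice(list(available))
            else:
                action = max(available, key=lambda a: Q[(state, a)])

            next_state, reward, done = env.step(action)
            next_available = sample_available_actions(next_state)

            # Bootstrap with the max over the *next* realized action set;
            # its randomness is what distinguishes SAS-MDPs from plain MDPs.
            target = reward
            if not done:
                target += gamma * max(Q[(next_state, a)] for a in next_available)

            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state, available = next_state, next_available

    return Q
```

The only change relative to ordinary tabular Q-learning is that both action selection and the bootstrap max are restricted to the realized action set, which is re-drawn on every state visit.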

Cited by 11 publications (24 citation statements). References 15 publications.
“…While there is extensive literature on solving sequential decision problems modeled as MDPs (Sutton and Barto 2018), there are few methods designed to handle stochastic action sets. Recently, Boutilier et al (2018) laid the foundation for studying MDPs with stochastic action sets by defining the new SAS-MDP problem formulation, which we review in the background section. After defining SAS-MDPs, Boutilier et al (2018) presented and analyzed the model-based value iteration and policy iteration algorithms and the model-free Q-learning algorithm for SAS-MDPs.…”
Section: Related Work
confidence: 99%
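For readers skimming this report, a hedged sketch of the backup underlying those model-based algorithms (notation assumed here, not quoted from the paper): with base action set $B$ and an exogenous distribution $\mu(\cdot \mid s')$ over the realized available set $A' \subseteq B$ at the next state, the update takes an expectation of a max over the sampled set, roughly

$$Q(s,a) \;=\; r(s,a) \;+\; \gamma \sum_{s'} P(s' \mid s, a)\, \mathbb{E}_{A' \sim \mu(\cdot \mid s')}\Big[\max_{a' \in A'} Q(s', a')\Big].$$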
“…MDPs and SAS-MDPs (Boutilier et al 2018) are mathematical formulations of sequential decision problems. Before defining SAS-MDPs, we define MDPs.…”
Section: Introduction
confidence: 99%
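As a reminder of the standard MDP case the excerpt builds on, the Bellman optimality equation over a fixed action set $A$ is

$$V^{*}(s) \;=\; \max_{a \in A}\Big[\, r(s,a) + \gamma \sum_{s'} P(s' \mid s, a)\, V^{*}(s') \Big],$$

and the SAS-MDP formulation replaces the fixed $A$ with a randomly realized subset at each decision point.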