2018
DOI: 10.1609/icaps.v28i1.13906

An On-Line Planner for POMDPs with Large Discrete Action Space: A Quantile-Based Approach

Abstract: Making principled decisions in the presence of uncertainty is often facilitated by Partially Observable Markov Decision Processes (POMDPs). Despite tremendous advances in POMDP solvers, finding good policies with large action spaces remains difficult. To alleviate this difficulty, this paper presents an on-line approximate solver, called Quantile-Based Action Selector (QBASE). It uses quantile-statistics to adaptively evaluate a small subset of the action space without sacrificing the quality of the generated …
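
As a rough illustration of what evaluating only a small, adaptively chosen subset of a large discrete action space can look like, the sketch below runs a generic cross-entropy-style search that keeps the top quantile of sampled actions at each iteration. It is not QBASE and is not taken from the paper: the function name `quantile_action_search`, its parameters, and the toy reward are all assumptions made for illustration, and no belief tree or POMDP simulator is involved.

```python
import random


def quantile_action_search(reward_fn, n_actions, n_iters=30, n_samples=200,
                           elite_frac=0.1, smoothing=0.7, seed=0):
    """Generic cross-entropy-style search over actions 0..n_actions-1.

    Hypothetical helper for illustration only (not the QBASE algorithm).
    reward_fn(action) returns a score; higher is better.
    """
    rng = random.Random(seed)
    actions = list(range(n_actions))
    # Start from a uniform sampling distribution over all actions.
    probs = [1.0 / n_actions] * n_actions
    n_elite = max(1, int(elite_frac * n_samples))

    for _ in range(n_iters):
        # Evaluate only a sampled subset, not the whole action space.
        sampled = rng.choices(actions, weights=probs, k=n_samples)
        scored = sorted(((reward_fn(a), a) for a in sampled), reverse=True)

        # Keep the samples in the top quantile (the "elite" set).
        elite = [a for _, a in scored[:n_elite]]

        # Move the sampling distribution toward the elite frequencies.
        counts = [0] * n_actions
        for a in elite:
            counts[a] += 1
        probs = [(1 - smoothing) * p + smoothing * c / n_elite
                 for p, c in zip(probs, counts)]

    # Report the action the distribution has concentrated on.
    return max(actions, key=lambda a: probs[a])


if __name__ == "__main__":
    # Toy reward (an assumption): actions closer to index 37 score higher.
    print(quantile_action_search(lambda a: -abs(a - 37), n_actions=1000))
```

With a large action space the search may settle on an action near the optimum rather than the optimum itself, since the distribution can concentrate before the best action is ever sampled; the smoothing factor and elite fraction trade off that risk against the number of evaluations.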

Citations: cited by 7 publications (3 citation statements)
References: 13 publications
“…The Cross-Entropy method has been used in several algorithms for solving POMDPs and MDPs (the fully observable variant of POMDPs). Several of them consider discrete action spaces (Mannor, Rubinstein, and Gat 2003; Oliehoek, Kooij, and Vlassis 2008; Wang, Kurniawati, and Kroese 2018), while we consider POMDPs with continuous action spaces. Omidshafiei et al (2016) consider continuous actions spaces, but the optimization is carried out over a finite policy space.…”
Section: Related Work (mentioning)
confidence: 99%
“…More recently, MCTS-based methods have been extended to handle problems with large or continuous action and observation spaces. QBASE (Wang et al, 2018) can handle problems with up to 10,000 discrete actions by extending the cross-entropy method (Rubinstein and Kroese, 2013) to MCTS. GPS-ABT (Seiler et al, 2015) uses General Pattern Search, a derivative-free optimization method to search for local optima in continuous action spaces.…”
Section: Related POMDP Solvers (mentioning)
confidence: 99%
“…Methods have been proposed to alleviate this issue (Seiler, Kurniawati, and Singh 2015), (Sunberg and Kochenderfer 2018). However, existing solvers can only perform well for problems with 3-4 continuous action spaces (Seiler, Kurniawati, and Singh 2015), while a method that can perform well for problems with 100,000 discrete actions was only recently proposed (Wang, Kurniawati, and Kroese 2018). To alleviate this issue, we resort to use a small discrete number of fixed increments and decrements of the joint angles as the action space.…”
Section: Introduction (mentioning)
confidence: 99%