2013
DOI: 10.1007/978-3-642-41575-3_15

Controller Compilation and Compression for Resource Constrained Applications

Abstract: Recent advances in planning techniques for partially observable Markov decision processes have focused on online search techniques and offline point-based value iteration. While these techniques allow practitioners to obtain policies for fairly large problems, they assume that a non-negligible amount of computation can be done between each decision point. In contrast, the recent proliferation of mobile and embedded devices has led to a surge of applications that could benefit from state of the art p…
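The contrast the abstract draws with online methods is easiest to see from how a compiled finite-state controller is executed at run time: each decision step is a pair of table lookups, with no search or belief tracking in between. The sketch below is illustrative only, not the paper's implementation; the `Fsc` class, its fields, and the toy controller are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Fsc:
    """A finite-state controller: a per-node action plus observation-indexed transitions."""
    actions: List[int]                       # actions[node] -> action to execute in that node
    transitions: Dict[Tuple[int, int], int]  # (node, observation) -> next node

    def step(self, node: int, observation: int) -> Tuple[int, int]:
        """One decision step: constant-time lookups, no belief update or online search."""
        next_node = self.transitions[(node, observation)]
        return next_node, self.actions[next_node]

# Illustrative 2-node controller for a toy problem with 2 actions and 2 observations.
fsc = Fsc(actions=[0, 1],
          transitions={(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1})
node = 0
action = fsc.actions[node]          # act, observe, then transition
for obs in [1, 1, 0]:               # observations arriving from the environment
    node, action = fsc.step(node, obs)
```

Per-step work is independent of the number of states, which is what makes such controllers attractive on mobile and embedded hardware.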

Cited by 5 publications (4 citation statements). References 11 publications.
“…It outperforms the state of the art in terms of both solution quality and time. The policies found are finite-state controllers, which are also advantageous for deployment in resource-constrained applications such as embedded systems and smartphones (Grześ, Poupart, and Hoey 2013b). In future work, it would be interesting to extend this work to constrained decentralized POMDPs (Wu, Jennings, and Chen 2012) and to explore reinforcement learning techniques for CPOMDPs.…”
Section: Discussion
confidence: 94%
“…One crucial difference from previous approaches to policy succinctness, both in POMDPs [22] and in other settings [25], is that prior work concurrently optimizes both the performance of a policy and its size, which requires dedicated algorithms, while we separate these tasks: first we search for a well-performing, though possibly “ugly”, policy, and then learn its succinct representation (a similar approach was used in [23], where policies computed by point-based methods were “compiled” into FSCs). Thus, we present a framework for obtaining succinct representations in which various state-of-the-art algorithms for POMDP solving and DT learning can be used.…”
Section: Related Work
confidence: 99%
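The solve-first-then-compress separation this statement describes mirrors a common way of turning a point-based solution into a controller: each alpha-vector from the final backup already records the action it was built with and, for every observation, which next-step vector was selected, so a controller can simply be read off that bookkeeping. A minimal sketch under that assumption follows; the `AlphaVector` fields are hypothetical annotations, not an API from the cited works.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class AlphaVector:
    # Assumed bookkeeping kept during the point-based backup.
    action: int                  # action the vector was generated with
    successor: Dict[int, int]    # observation -> index of the next-step alpha-vector

def compile_to_fsc(vectors: List[AlphaVector]) -> Tuple[List[int], Dict[Tuple[int, int], int]]:
    """Read a finite-state controller off annotated alpha-vectors:
    node i corresponds to vector i, the node's action is the vector's action,
    and the (node, observation) transition is the recorded successor vector."""
    actions = [v.action for v in vectors]
    transitions = {(i, o): nxt
                   for i, v in enumerate(vectors)
                   for o, nxt in v.successor.items()}
    return actions, transitions
```

The compression step discussed in the paper would then prune or merge nodes of this raw controller; the sketch only covers the compilation direction.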
“…There are other approaches to compute policies for infinite-horizon Dec-POMDPs that are not based on a controller representation of the joint policy (MacDermed & Isbell, 2013). However, a key advantage of policies based on finite-state controllers is their ease of execution in resource-constrained environments (Grzes, Poupart, & Hoey, 2013; Grześ, Poupart, Yang, & Hoey, 2015), without the expensive belief update operations required by other approaches. Furthermore, policies represented as finite-state controllers can carry more semantic information, since each controller node summarizes some relevant aspects of the observation history.…”
Section: Related Work
confidence: 99%
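For comparison, the belief update that controller-based execution avoids is sketched below; with |S| states it costs on the order of |S|² multiply-adds (plus normalization) at every decision point, which is what makes belief-tracking execution expensive on embedded hardware. The array names and shapes are illustrative conventions, not taken from the cited works.

```python
import numpy as np

def belief_update(belief: np.ndarray,   # shape (S,): current belief b(s)
                  T: np.ndarray,        # shape (A, S, S): T[a, s, s'] = P(s' | s, a)
                  O: np.ndarray,        # shape (A, S, Z): O[a, s', z] = P(z | s', a)
                  action: int,
                  observation: int) -> np.ndarray:
    """Bayesian filter step: b'(s') ∝ P(z | s', a) * sum_s P(s' | s, a) * b(s)."""
    predicted = belief @ T[action]                    # S*S multiply-adds over current states
    unnormalized = predicted * O[action, :, observation]
    return unnormalized / unnormalized.sum()
```

A compiled controller replaces this per-step computation with the constant-time node transition shown earlier, at the cost of fixing the policy's memory to a finite set of nodes.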