2013
DOI: 10.1007/978-3-642-41575-3_15

Controller Compilation and Compression for Resource Constrained Applications

Abstract: Recent advances in planning techniques for partially observable Markov decision processes have focused on online search techniques and offline point-based value iteration. While these techniques allow practitioners to obtain policies for fairly large problems, they assume that a non-negligible amount of computation can be done between each decision point. In contrast, the recent proliferation of mobile and embedded devices has led to a surge of applications that could benefit from state of the art p…
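The contrast the abstract draws with online methods is easiest to see from how a compiled finite-state controller is executed at run time: each decision step is a pair of table lookups, with no search or belief tracking in between. The sketch below is illustrative only, not the paper's implementation; the `Fsc` class, its fields, and the toy controller are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class Fsc:
    """A finite-state controller: a per-node action plus observation-indexed transitions."""
    actions: List[int]                       # actions[node] -> action to execute in that node
    transitions: Dict[Tuple[int, int], int]  # (node, observation) -> next node

    def step(self, node: int, observation: int) -> Tuple[int, int]:
        """One decision step: constant-time lookups, no belief update or online search."""
        next_node = self.transitions[(node, observation)]
        return next_node, self.actions[next_node]

# Illustrative 2-node controller for a toy problem with 2 actions and 2 observations.
fsc = Fsc(actions=[0, 1],
          transitions={(0, 0): 0, (0, 1): 1, (1, 0): 0, (1, 1): 1})
node = 0
action = fsc.actions[node]          # act, observe, then transition
for obs in [1, 1, 0]:               # observations arriving from the environment
    node, action = fsc.step(node, obs)
```

Per-step work is independent of the number of states, which is what makes such controllers attractive on mobile and embedded hardware.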

Cited by 5 publications (4 citation statements). References 11 publications.
“…It outperforms the state of the art in terms of both solution quality and time. The policies found are finite-state controllers, which are also advantageous for deployment in resource-constrained applications such as embedded systems and smartphones (Grześ, Poupart, and Hoey 2013b). In future work, it would be interesting to extend this work to constrained decentralized POMDPs (Wu, Jennings, and Chen 2012) and to explore reinforcement learning techniques for CPOMDPs.…”
Section: Discussion
confidence: 94%
“…One crucial difference from previous approaches to policy succinctness, both in POMDPs [22] and in other settings [25], is that prior work concurrently optimizes both the performance of a policy and its size, which requires dedicated algorithms, while we separate these tasks: first we search for a well-performing, though possibly “ugly”, policy, and then learn its succinct representation (a similar approach was used in [23], where policies computed by point-based methods were “compiled” into FSCs). Thus, we present a framework for obtaining succinct representations in which various state-of-the-art algorithms for POMDP solving and DT learning can be used.…”
Section: Related Work
confidence: 99%
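The solve-first-then-compress separation this statement describes mirrors a common way of turning a point-based solution into a controller: each alpha-vector from the final backup already records the action it was built with and, for every observation, which next-step vector was selected, so a controller can simply be read off that bookkeeping. A minimal sketch under that assumption follows; the `AlphaVector` fields are hypothetical annotations, not an API from the cited works.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

@dataclass
class AlphaVector:
    # Assumed bookkeeping kept during the point-based backup.
    action: int                  # action the vector was generated with
    successor: Dict[int, int]    # observation -> index of the next-step alpha-vector

def compile_to_fsc(vectors: List[AlphaVector]) -> Tuple[List[int], Dict[Tuple[int, int], int]]:
    """Read a finite-state controller off annotated alpha-vectors:
    node i corresponds to vector i, the node's action is the vector's action,
    and the (node, observation) transition is the recorded successor vector."""
    actions = [v.action for v in vectors]
    transitions = {(i, o): nxt
                   for i, v in enumerate(vectors)
                   for o, nxt in v.successor.items()}
    return actions, transitions
```

The compression step discussed in the paper would then prune or merge nodes of this raw controller; the sketch only covers the compilation direction.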
“…There are other approaches to compute policies for infinite-horizon Dec-POMDPs that are not based on a controller representation of the joint policy (MacDermed & Isbell, 2013). However, a key advantage of policies based on finite-state controllers is their ease of execution in resource-constrained environments (Grzes, Poupart, & Hoey, 2013; Grześ, Poupart, Yang, & Hoey, 2015), without the expensive belief update operations required by other approaches. Furthermore, policies represented as finite-state controllers can carry more semantic information, since each controller node summarizes some relevant aspects of the observation history.…”
Section: Related Work
confidence: 99%
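For comparison, the belief update that controller-based execution avoids is sketched below; with |S| states it costs on the order of |S|² multiply-adds (plus normalization) at every decision point, which is what makes belief-tracking execution expensive on embedded hardware. The array names and shapes are illustrative conventions, not taken from the cited works.

```python
import numpy as np

def belief_update(belief: np.ndarray,   # shape (S,): current belief b(s)
                  T: np.ndarray,        # shape (A, S, S): T[a, s, s'] = P(s' | s, a)
                  O: np.ndarray,        # shape (A, S, Z): O[a, s', z] = P(z | s', a)
                  action: int,
                  observation: int) -> np.ndarray:
    """Bayesian filter step: b'(s') ∝ P(z | s', a) * sum_s P(s' | s, a) * b(s)."""
    predicted = belief @ T[action]                    # S*S multiply-adds over current states
    unnormalized = predicted * O[action, :, observation]
    return unnormalized / unnormalized.sum()
```

A compiled controller replaces this per-step computation with the constant-time node transition shown earlier, at the cost of fixing the policy's memory to a finite set of nodes.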