Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

2019 IEEE 58th Conference on Decision and Control (CDC)

2019

Self Cite

Information-theoretic bounded rationality describes utility-optimizing decision-makers whose limited information-processing capabilities are formalized by information constraints. One of the consequences of bounded rationality is that resource-limited decision-makers can join together to solve decision-making problems that are beyond the capabilities of each individual. Here, we study an information-theoretic principle that drives division of labor and specialization when decision-makers with information constraints are joined together. We devise an on-line learning rule of this principle that learns a partitioning of the problem space such that it can be solved by specialized linear policies. We demonstrate the approach for decision-making problems whose complexity exceeds the capabilities of individual decision-makers, but can be solved by combining the decision-makers optimally. The strength of the model is that it is abstract and principled, yet has direct applications in classification, regression, reinforcement learning and adaptive control.

Section: Discussion (mentioning)

confidence: 99%

An Information-theoretic On-line Learning Principle for Specialization in Hierarchical Decision-Making Systems

Hihn

Gottwald

2019 IEEE 58th Conference on Decision and Control (CDC)

2019

Self Cite

“…Our analysis may have interesting implications, as it provides a normative framework for this kind of combined optimization of adaptive priors and decision-making processes. Prior to our work there have been several attempts to apply the framework of information-theoretic bounded rationality to machine learning tasks [7,11,12,18]. The novelty of our approach is that we design adaptive priors for both the single-step case and the multi-agent case and we demonstrate how to transform information-theoretic constraints into computational constraints in the form of MCMC steps.…”

Section: Discussion (mentioning)

confidence: 99%

Bounded Rational Decision-Making with Adaptive Neural Network Priors

Hihn

Gottwald

Artificial Neural Networks in Pattern Recognition

2018

Self Cite

Bounded rationality investigates utility-optimizing decision-makers with limited information-processing power. In particular, information-theoretic bounded rationality models formalize resource constraints abstractly in terms of relative Shannon information, namely the Kullback-Leibler Divergence between the agents' prior and posterior policy. Between prior and posterior lies an anytime deliberation process that can be instantiated by sample-based evaluations of the utility function through Markov Chain Monte Carlo (MCMC) optimization. The most simple model assumes a fixed prior and can relate abstract information-theoretic processing costs to the number of sample evaluations. However, more advanced models would also address the question of learning, that is how the prior is adapted over time such that generated prior proposals become more efficient. In this work we investigate generative neural networks as priors that are optimized concurrently with anytime sample-based decision-making processes such as MCMC. We evaluate this approach on toy examples.
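The abstract above can be illustrated with a minimal sketch, assuming a small discrete action set, a fixed prior, and the standard free-energy solution in which the posterior is proportional to the prior times exp(beta * utility). The function names, the toy utilities, and the inverse-temperature parameter `beta` are assumptions made for illustration and are not taken from the paper. The sampler uses the prior as an independent Metropolis-Hastings proposal, in which case the acceptance ratio reduces to exp(beta * (U(a') - U(a))).

```python
import math
import random


def bounded_rational_posterior(prior, utilities, beta):
    """Return p(a) proportional to prior(a) * exp(beta * U(a))."""
    # Subtract the maximum utility before exponentiating for numerical stability.
    u_max = max(utilities)
    weights = [p * math.exp(beta * (u - u_max)) for p, u in zip(prior, utilities)]
    z = sum(weights)
    return [w / z for w in weights]


def kl_divergence(p, q):
    """Kullback-Leibler divergence D_KL(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def mh_with_prior_proposals(prior, utilities, beta, steps, rng):
    """Metropolis-Hastings chain that proposes actions from the prior.

    With the prior as an independent proposal, the prior terms cancel and
    the acceptance probability is min(1, exp(beta * (U(new) - U(old)))).
    Returns the empirical action frequencies over `steps` iterations.
    """
    actions = range(len(prior))
    current = rng.choices(actions, weights=prior)[0]
    counts = [0] * len(prior)
    for _ in range(steps):
        proposal = rng.choices(actions, weights=prior)[0]
        log_accept = beta * (utilities[proposal] - utilities[current])
        if log_accept >= 0 or rng.random() < math.exp(log_accept):
            current = proposal
        counts[current] += 1
    return [c / steps for c in counts]


if __name__ == "__main__":
    prior = [0.25, 0.25, 0.25, 0.25]   # hypothetical uniform prior
    utilities = [1.0, 2.0, 3.0, 4.0]   # hypothetical toy utilities
    for beta in (0.0, 1.0, 10.0):
        post = bounded_rational_posterior(prior, utilities, beta)
        cost = kl_divergence(post, prior)
        print(f"beta={beta}: posterior={post}, KL cost={cost:.4f} nats")
    freqs = mh_with_prior_proposals(prior, utilities, 1.0, 20000, random.Random(0))
    print("MCMC frequencies at beta=1:", freqs)
```

In this sketch, a larger `beta` concentrates the posterior on high-utility actions at the price of a larger KL divergence from the prior, which is the resource trade-off the abstract describes; running the chain for more steps brings the empirical frequencies closer to the closed-form posterior.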

“…In [33] and similarly in [20], the authors derive the relative entropy as a control cost from an information-theoretic point of view, under axioms of monotonicity and invariance under relabelling and decomposition. In other fields such as robotics, the relative entropy has also been used as a control cost [18, 21, 25, 58, 73, 74] to regularize the behaviour of the controller by penalizing controls that are far from the uncontrolled dynamics of the system or to deal with model uncertainty [75]. Naturally, questions regarding the generality of entropic costs as information-processing costs and their potential relation to algorithmic space-time resource constraints carry over to the non-equilibrium scenario and remain a topic for future investigations.…”

Section: Discussion (mentioning)

confidence: 99%

Non-Equilibrium Relations for Bounded Rational Decision-Making in Changing Environments

Grau-Moya

Krüger

2017

Entropy

Self Cite

Abstract: Living organisms from single cells to humans need to adapt continuously to respond to changes in their environment. The process of behavioural adaptation can be thought of as improving decision-making performance according to some utility function. Here, we consider an abstract model of organisms as decision-makers with limited information-processing resources that trade off between maximization of utility and computational costs measured by a relative entropy, in a similar fashion to thermodynamic systems undergoing isothermal transformations. Such systems minimize the free energy to reach equilibrium states that balance internal energy and entropic cost. When there is a fast change in the environment, these systems evolve in a non-equilibrium fashion because they are unable to follow the path of equilibrium distributions. Here, we apply concepts from non-equilibrium thermodynamics to characterize decision-makers that adapt to changing environments under the assumption that the temporal evolution of the utility function is externally driven and does not depend on the decision-maker's action. This allows one to quantify performance loss due to imperfect adaptation in a general manner and, additionally, to find relations for decision-making similar to Crooks' fluctuation theorem and Jarzynski's equality. We provide simulations of several exemplary decision and inference problems in the discrete and continuous domains to illustrate the new relations.
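For orientation, the relations named in the abstract can be written out as follows. This is a sketch in assumed notation (prior p_0, utility U, inverse temperature β, work W, free-energy difference ΔF), not the paper's own. The free-energy trade-off is the usual bounded-rationality form, and the two fluctuation relations are given in their standard thermodynamic form; the decision-making analogues in the paper, where utilities take the place of energies, may use different sign conventions.

```latex
% Free-energy trade-off between expected utility and relative-entropy cost
F[p] = \mathbb{E}_{p}\!\left[ U(x) \right] - \frac{1}{\beta}\, D_{\mathrm{KL}}\!\left( p \,\|\, p_0 \right),
\qquad
p^{*}(x) = \frac{p_0(x)\, e^{\beta U(x)}}{Z_\beta},
\quad
Z_\beta = \sum_{x} p_0(x)\, e^{\beta U(x)}

% Jarzynski's equality (standard thermodynamic form)
\left\langle e^{-\beta W} \right\rangle = e^{-\beta\, \Delta F}

% Crooks' fluctuation theorem (forward and reverse work distributions)
\frac{P_{F}(W)}{P_{R}(-W)} = e^{\beta \left( W - \Delta F \right)}
```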