When modeling goal-directed behavior in the presence of various sources of uncertainty, planning can be described as an inference process. A solution to the problem of planning as inference was previously proposed in the active inference framework in the form of an approximate inference scheme based on variational free energy. However, this approximate scheme relied on the mean-field approximation, which assumes statistical independence of hidden variables, is known to produce overconfident beliefs, and may converge to local minima of the free energy. To better capture the spatiotemporal properties of an environment, we reformulated the approximate inference process using the so-called Bethe approximation. Importantly, the Bethe approximation allows for the representation of pairwise statistical dependencies. Under these assumptions, minimizing the variational free energy corresponds to the belief propagation algorithm, commonly used in machine learning. To illustrate the differences between the mean-field approximation and the Bethe approximation, we simulated agent behavior in a simple goal-reaching task with different types of uncertainty. Overall, the Bethe agent achieves higher success rates in reaching goal states. We relate the better performance of the Bethe agent to more accurate predictions about the consequences of its own actions. Consequently, active inference based on the Bethe approximation extends the application range of active inference to more complex behavioral tasks.
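To make the role of pairwise dependencies concrete, the following is a minimal sketch of sum-product belief propagation on a three-step chain of hidden states, a structure on which the Bethe approximation is exact. It is not the agent described above: the function names `chain_bp` and `brute_force_joint`, the transition matrix, and the likelihood values are all illustrative assumptions. The sketch checks the pairwise beliefs against brute-force enumeration and contrasts them with a fully factorized belief, the form to which a mean-field scheme is restricted.

```python
import itertools
import numpy as np

def chain_bp(prior, trans, liks):
    """Sum-product belief propagation on a chain of T hidden states.

    prior: (n,) distribution over the first state.
    trans: (n, n) with trans[i, j] = p(s_{t+1}=j | s_t=i).
    liks:  (T, n) with liks[t, i] = p(o_t | s_t=i) for the observed o_t.
    Returns unary beliefs (T, n) and pairwise beliefs (T-1, n, n).
    """
    T, n = liks.shape
    fwd = np.zeros((T, n))   # messages passed forward in time
    bwd = np.ones((T, n))    # messages passed backward in time
    fwd[0] = prior * liks[0]
    fwd[0] /= fwd[0].sum()
    for t in range(1, T):
        fwd[t] = liks[t] * (fwd[t - 1] @ trans)
        fwd[t] /= fwd[t].sum()
    for t in range(T - 2, -1, -1):
        bwd[t] = trans @ (liks[t + 1] * bwd[t + 1])
        bwd[t] /= bwd[t].sum()
    unary = fwd * bwd
    unary /= unary.sum(axis=1, keepdims=True)
    pair = np.zeros((T - 1, n, n))
    for t in range(T - 1):
        joint = fwd[t][:, None] * trans * (liks[t + 1] * bwd[t + 1])[None, :]
        pair[t] = joint / joint.sum()
    return unary, pair

def brute_force_joint(prior, trans, liks):
    """Exact posterior over all state sequences by enumeration (small T only)."""
    T, n = liks.shape
    joint = np.zeros((n,) * T)
    for states in itertools.product(range(n), repeat=T):
        p = prior[states[0]] * liks[0, states[0]]
        for t in range(1, T):
            p *= trans[states[t - 1], states[t]] * liks[t, states[t]]
        joint[states] = p
    return joint / joint.sum()

# Toy problem (assumed values): two states, "sticky" transitions, three time steps.
prior = np.array([0.5, 0.5])
trans = np.array([[0.9, 0.1],
                  [0.1, 0.9]])
liks = np.array([[0.7, 0.3],
                 [0.5, 0.5],
                 [0.4, 0.6]])

unary, pair = chain_bp(prior, trans, liks)
exact = brute_force_joint(prior, trans, liks)

# Pairwise belief over (s_0, s_1) versus the product of its unary beliefs.
print("BP pairwise belief:\n", pair[0])
print("exact pairwise marginal:\n", exact.sum(axis=2))
print("factorized belief:\n", np.outer(unary[0], unary[1]))
```

On this chain the pairwise beliefs agree with the exact marginals, while the product of unary beliefs spreads probability onto state transitions that the sticky dynamics make unlikely. The factorized belief here is only a stand-in built from the belief propagation marginals; an actual mean-field fixed point would generally differ from it.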
IMPORTANCE Alcohol consumption (AC) leads to death and disability worldwide. Ongoing discussions on potential negative effects of the COVID-19 pandemic on AC need to be informed by real-world evidence. OBJECTIVE To examine whether lockdown measures are associated with AC and consumption-related temporal and psychological within-person mechanisms.
In everyday life, our behavior varies on a continuum from automatic and habitual to deliberate and goal-directed. Recent evidence suggests that habit formation and relearning of habits operate in a context-dependent manner: Habit formation is promoted when actions are performed in a specific context, while breaking off habits is facilitated after a context change. It is an open question how one can computationally model the brain's balancing between context-specific habits and goal-directed actions. Here, we propose a hierarchical Bayesian approach for control of a partially observable Markov decision process that enables conjoint learning of habit and reward structure in a context-specific manner. In this model, habit learning corresponds to a value-free updating of priors over policies and interacts with the value-based learning of the reward structure. Importantly, the model is built solely on probabilistic inference, which effectively provides a simple explanation of how the brain may balance contributions of habitual and goal-directed control. We illustrated the resulting behavior using agent-based simulated experiments, where we replicated several findings of devaluation and extinction experiments. In addition, we show how a single parameter, the so-called habitual tendency, can explain individual differences in habit learning and the balancing between habitual and goal-directed control. Finally, we discuss the relevance of the proposed model for understanding specific phenomena in substance use disorder and the potential computational role of activity in dorsolateral and dorsomedial striatum and infralimbic cortex, as reported in animal experiments.
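One way to picture a value-free prior over policies interacting with value-based control is the toy sketch below. It is an illustration under stated assumptions, not the hierarchical model described above: the prior is taken to be a vector of pseudo-counts that grows with every performed policy regardless of outcome, the goal-directed term is a softmax over hypothetical policy values, and the habitual tendency is assumed to set the initial pseudo-counts to its inverse. Context dependence and reward learning are omitted, and training is simplified to always performing policy 0.

```python
import numpy as np

def init_counts(n_policies, habitual_tendency):
    # Assumption: a high habitual tendency means small initial pseudo-counts,
    # so each performed policy shifts the prior strongly.
    return np.full(n_policies, 1.0 / habitual_tendency)

def update_habit(counts, chosen_policy):
    # Value-free update: only the performed policy matters, not its outcome.
    new_counts = counts.copy()
    new_counts[chosen_policy] += 1.0
    return new_counts

def policy_posterior(counts, values):
    # Habit prior (normalized pseudo-counts) times a value-based likelihood.
    prior = counts / counts.sum()
    likelihood = np.exp(values - np.max(values))
    posterior = prior * likelihood
    return posterior / posterior.sum()

def devaluation_test(habitual_tendency, n_training=50):
    """Probability of still choosing policy 0 after it has been devalued."""
    counts = init_counts(2, habitual_tendency)
    for _ in range(n_training):
        counts = update_habit(counts, 0)  # simplified training: always policy 0
    devalued_values = np.array([0.0, 2.0])  # policy 1 is now the better option
    return policy_posterior(counts, devalued_values)[0]

strong_habit = devaluation_test(habitual_tendency=10.0)
weak_habit = devaluation_test(habitual_tendency=0.01)
print(f"p(devalued policy), high habitual tendency: {strong_habit:.3f}")
print(f"p(devalued policy), low habitual tendency:  {weak_habit:.3f}")
```

With these assumed numbers, the agent with a high habitual tendency keeps favoring the devalued policy after training, whereas the agent with a low habitual tendency switches to the newly valuable one, which mirrors the qualitative pattern of a devaluation experiment.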