Hierarchical Bayesian models of reinforcement learning: Introduction and comparison to alternative methods

Geen, Camilla van; Gerraty, Raphael T.

doi:10.1016/j.jmp.2021.102602

Cited by 7 publications

(5 citation statements)

References 46 publications

“…As a final technical comment, reinforcement learning model parameters were estimated using maximum likelihood techniques on individual subjects followed by model comparison. Future work could expand on this by using hierarchical Bayesian parameter estimation to reduce the variance around parameter estimates (Piray et al, 2019; van Geen and Gerraty, 2021; Lee and Newell, 2011). However, choosing prior distributions within the hierarchical Bayesian approach is not trivial and may not work for all of the models tested in this study.…”

Section: Discussion (mentioning)

confidence: 99%

Computational modeling of threat learning reveals links with anxiety and neuroanatomy in humans

Abend, Burk, Ruiz, et al. 2022. eLife

Influential theories implicate variations in the mechanisms supporting threat learning in the severity of anxiety symptoms. We use computational models of associative learning in conjunction with structural imaging to explicate links among the mechanisms underlying threat learning, their neuroanatomical substrates, and anxiety severity in humans. We recorded skin-conductance data during a threat-learning task from individuals with and without anxiety disorders (N=251; 8-50 years; 116 females). Reinforcement-learning model variants quantified processes hypothesized to relate to anxiety: threat conditioning, threat generalization, safety learning, and threat extinction. We identified the best-fitting models for these processes and tested associations among latent learning parameters, whole-brain anatomy, and anxiety severity. Results indicate that greater anxiety severity related specifically to slower safety learning and slower extinction of response to safe stimuli. Nucleus accumbens gray-matter volume moderated learning-anxiety associations. Using a modeling approach, we identify computational mechanisms linking threat learning and anxiety severity and their neuroanatomical substrates.

“…The present solution of a more comprehensive yet parsimonious model avoids compromising the independence of separate data sets, making it preferable to alternative small-data solutions finding recourse in regularization via fully group-level estimation (i.e., concatenating data sets or averaging parameters) or the intermediate approaches of empirical priors and hierarchical Bayesian modeling across participants [13,29,79,302–305]. From an idealized Bayesian-statistical perspective, compromising independence between individuals in this way mitigates putative measurement error from limited data.…”

Section: The Primacy Of Bias and Hysteresis As Well As Individual Dif... (mentioning)

confidence: 99%

Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts

Colas, O’Doherty, Grafton. 2024. PLoS Comput Biol

Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants—even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. 
In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.

“…To analyze learning computationally, model fitting was carried out independently for each participant using Bayesian parameter optimization methods, implemented in STAN [38]. We fitted a classic Q-learning model to choice data, based on the same principles as Hertz et al (2021), as follows:…”

Section: (E) Analyses (mentioning)

confidence: 99%

Experience and advice consequences shape information sharing strategies

Anllo, Salamander, Palminteri, et al. 2024. Preprint

Individuals often rely on the advice of more experienced peers to minimize uncertainty and increase success likelihood. In most domains where knowledge is acquired through experience, advisers are themselves continuously learning. Here we examine the way advising behavior changes throughout the learning process, and the way that costs and benefits of giving advice shape this behavior. We ran a series of experiments implementing a decision task within a reinforcement learning framework, where participants could decide to share their choices as advice to others. Participants were overall likely to share their choices as advice, even on the first trial before learning. Tendency to share advice and advice quality increased as advisers learned about the value of choices, and moved from exploratory to exploitative behavior. The introduction of consequences to advising resulted in a shift of the overall tendency to give advice, lowering it when advising implicated monetary loss, and increasing it when advising held reputational value. Individual differences in social anxiety levels were associated with lower tendency to share exploratory decisions. Our results show that advisers tend to share choices that are backed by their own experience, but that this relationship can be altered by advice-consequences and individual traits.
