A reinforcement learning model with choice traces for a progressive ratio schedule

Ihara, Keiko; Shikano, Yu; Kato, Sae; Yagishita, Sho; Tanaka, Kenji F.; Takata, Norio

doi:10.3389/fnbeh.2023.1302842

Cited by 2 publications

(3 citation statements)

References 68 publications

“…Perseveration and action repetition in this context have been related to the functions of dopamine [20,144,176–181] (but see [101,182]) as well as perhaps serotonin [177,183] (but see [101]). The theory here can take into account the roles of dopaminergic systems for not only computations such as the reward-prediction error [184–186] but also motivation, vigor, effort, and skillful execution of movement [187–192].…”

Section: Bidirectional Hysteretic Bias (mentioning)

confidence: 99%

“…Like H_t(a), its counterpart H_t(s_t, a) can also be modeled with the accumulating hysteresis trace [21]. Along with the alternative of a replacing trace (see Methods), another more constrained implementation of hysteretic accumulation could be based on an action-prediction error (or choice-prediction error) with analogy to the reward-prediction error [40,42–47,96,143,144,178,181]. The action-prediction error has been framed as "value-free", but this label and that of H_t(s_t, a) as "habit strength" (cf.…”

Section: PLoS Computational Biology (mentioning)

confidence: 99%
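
As a rough illustration of the three update rules named in the statement above (an accumulating trace, a replacing trace, and an action-prediction-error update), the following Python sketch shows one common way such choice traces are written. The function names, the `decay` and `rate` parameters, and the exact update forms are assumptions made for illustration; neither the quoted statement nor this page gives the equations the authors actually used.

```python
from typing import List


def accumulating_trace(trace: List[float], chosen: int, decay: float) -> List[float]:
    """Decay every action's trace, then add 1 to the chosen action (assumed form)."""
    return [decay * h + (1.0 if a == chosen else 0.0) for a, h in enumerate(trace)]


def replacing_trace(trace: List[float], chosen: int, decay: float) -> List[float]:
    """Reset the chosen action's trace to 1 and decay the others (assumed form)."""
    return [1.0 if a == chosen else decay * h for a, h in enumerate(trace)]


def action_prediction_error_trace(trace: List[float], chosen: int, rate: float) -> List[float]:
    """Move each trace toward 1 (chosen) or 0 (unchosen) by a fraction `rate`.

    The bracketed difference plays the role of an action-prediction error,
    by analogy with a reward-prediction error (assumed form).
    """
    return [h + rate * ((1.0 if a == chosen else 0.0) - h) for a, h in enumerate(trace)]


def run(update, n_actions: int, choices: List[int], param: float) -> List[float]:
    """Apply one update rule over a sequence of chosen action indices."""
    trace = [0.0] * n_actions
    for chosen in choices:
        trace = update(trace, chosen, param)
    return trace
```

Under these assumed forms, repeating the same action lets the accumulating trace grow past 1, whereas the replacing trace and the action-prediction-error trace stay bounded by 1, which is one sense in which the latter is a more constrained implementation.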

“…[79,80,94,95,97,98,101,203,230–242]). Thus far, some computational modeling [12,18,19,21,43,44,46,47,96,181,201,243,244] as well as simpler regression analyses with an autoregressive choice kernel or action kernel [17,20,58–61,245,246] have yielded differing time courses for hysteretic effects, but such findings tend to not be reported in detail.…”

Section: Dynamics of Hysteresis (mentioning)

confidence: 99%

Active reinforcement learning versus action bias and hysteresis: control with a mixture of experts and nonexperts

Colas, O’Doherty, Grafton. 2024. PLoS Comput Biol

Active reinforcement learning enables dynamic prediction and control, where one should not only maximize rewards but also minimize costs such as of inference, decisions, actions, and time. For an embodied agent such as a human, decisions are also shaped by physical aspects of actions. Beyond the effects of reward outcomes on learning processes, to what extent can modeling of behavior in a reinforcement-learning task be complicated by other sources of variance in sequential action choices? What of the effects of action bias (for actions per se) and action hysteresis determined by the history of actions chosen previously? The present study addressed these questions with incremental assembly of models for the sequential choice data from a task with hierarchical structure for additional complexity in learning. With systematic comparison and falsification of computational models, human choices were tested for signatures of parallel modules representing not only an enhanced form of generalized reinforcement learning but also action bias and hysteresis. We found evidence for substantial differences in bias and hysteresis across participants—even comparable in magnitude to the individual differences in learning. Individuals who did not learn well revealed the greatest biases, but those who did learn accurately were also significantly biased. The direction of hysteresis varied among individuals as repetition or, more commonly, alternation biases persisting from multiple previous actions. Considering that these actions were button presses with trivial motor demands, the idiosyncratic forces biasing sequences of action choices were robust enough to suggest ubiquity across individuals and across tasks requiring various actions. 
In light of how bias and hysteresis function as a heuristic for efficient control that adapts to uncertainty or low motivation by minimizing the cost of effort, these phenomena broaden the consilient theory of a mixture of experts to encompass a mixture of expert and nonexpert controllers of behavior.

Sex Differences in Home-Cage Ethanol Drinking and Operant Self-Administration in C57BL/6J Mice with Equivalent Regulation by Glutamate AMPAR Activity

Faccidomo, Eastman, Santanam et al. 2024. Preprint

Introduction: Considering sex as a biological variable (SABV) in preclinical research can enhance understanding of the neurobiology of alcohol use disorder (AUD). However, the behavioral and neural mechanisms underlying sex-specific differences remain unclear. This study aims to elucidate SABV in ethanol (EtOH) consumption by evaluating its reinforcing effects and regulation by glutamate AMPA receptor activity in male and female mice. Methods: C57BL/6J mice (male and female) were assessed for EtOH intake under continuous and limited access conditions in the home cage. Acute sensitivity to EtOH sedation and blood clearance were evaluated as potential modifying factors. Motivation to consume EtOH was measured using operant self-administration procedures. Sex-specific differences in neural regulation of EtOH reinforcement were examined by testing the effects of a glutamate AMPA receptor antagonist on operant EtOH self-administration. Results: Female C57BL/6J mice exhibited a time-dependent escalation in EtOH intake under both continuous and limited access conditions. They were less sensitive to EtOH sedation and had lower blood levels post-EtOH administration (4 g/kg) despite similar clearance rates. Females also showed increased operant EtOH self-administration and progressive ratio performance over a 30-day baseline period compared to males. The AMPAR antagonist GYKI 52466 (0 - 10 mg/kg, IP) dose-dependently reduced EtOH-reinforced lever pressing in both sexes, with no differences in potency or efficacy. Discussion: These findings confirm that female C57BL/6J mice consume more EtOH than males in home-cage conditions and exhibit reduced acute sedation, potentially contributing to higher EtOH intake. Females demonstrated increased operant EtOH self-administration and motivation, indicating higher reinforcing efficacy. The lack of sex differences in the relative effects of GYKI 52466 suggests that AMPAR activity is equally required for EtOH reinforcement in both sexes.
