2021
DOI: 10.31234/osf.io/dpqj6
Preprint

Choice-confirmation bias and gradual perseveration in human reinforcement learning

Abstract: Do we preferentially learn from outcomes that confirm our choices? This is one of the most basic, and yet consequence-bearing, questions concerning reinforcement learning. In recent years, we investigated this question in a series of studies implementing increasingly complex behavioral protocols. The learning rates fitted in experiments featuring partial or complete feedback, as well as free and forced choices, were systematically found to be consistent with a choice-confirmation bias. This result is robust ac…

Cited by 4 publications (5 citation statements) | References 48 publications
“…In theory, inflexibility can also be captured by asymmetric learning for positive and negative reward prediction errors. That is, excessive learning from better-than-expected experiences and/or deficient learning from worse-than-expected experiences may result in the repetitive choice of the same option [42,43]. Consistent with the theoretical consideration, we demonstrated that, in reward-seeking decision-making, the learning rate for negative reward prediction errors was lower in patients with OCD than in healthy controls.…”
Section: Discussion (supporting)
confidence: 84%
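The asymmetric-learning account quoted above can be sketched in a few lines: a single value estimate is updated with separate learning rates for positive and negative reward prediction errors. The parameter values below are illustrative assumptions, not fitted estimates from any of the cited studies.

```python
def update_q(q, reward, alpha_pos=0.4, alpha_neg=0.1):
    """One asymmetric Q-learning step: the learning rate depends on the
    sign of the reward prediction error (illustrative parameter values)."""
    delta = reward - q                      # reward prediction error
    alpha = alpha_pos if delta >= 0 else alpha_neg
    return q + alpha * delta

# With alpha_pos > alpha_neg, better-than-expected outcomes move the value
# estimate more than equally sized worse-than-expected outcomes, so the
# chosen option's value inflates and the same choice tends to be repeated.
q = 0.5
q = update_q(q, 1.0)   # positive RPE: large update (0.5 -> 0.7)
q = update_q(q, 0.0)   # negative RPE: small update (0.7 -> 0.63)
```

Lowering `alpha_neg` relative to `alpha_pos`, as reported for the OCD patients in the quoted study, makes the value estimate resistant to downward revision after losses.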
“…Yet, previous studies in computational psychiatry have often failed to detect increased perseveration in OCD and PG patients [37-41]. An alternative account posits that inflexibility reflects asymmetric RL, i.e., overlearning from better-than-expected outcomes and underlearning from worse-than-expected outcomes [42,43]. In formal terms, the learning rate-the extent to which new information (e.g., reward, loss, and neutral outcome) modulates future behaviour-is different for value updating from positive and negative reward prediction errors.…”
Section: Introduction (mentioning)
confidence: 99%
“…That is, there was a larger influence of inverse temperature, 𝛽, than the perseveration size, 𝜙, on decision-making. This finding is in accordance with that of Palminteri (2021), but as he also pointed out, the type of task used may influence which parameter has a larger impact on choices.…”
Section: Relationship Between Value Calculation and Choice Perseveration (supporting)
confidence: 88%
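The comparison quoted above, between the inverse temperature 𝛽 and the perseveration size 𝜙, can be illustrated with a standard softmax choice rule plus a stay bonus for the previously chosen action. This is a generic sketch; the parameter values are assumptions, not those fitted in the cited work.

```python
import math

def choice_probs(q_values, last_choice, beta=5.0, phi=0.5):
    """Softmax over action values scaled by inverse temperature beta,
    with a fixed perseveration bonus phi added to the logit of the
    previously chosen action (illustrative values)."""
    logits = [beta * q + (phi if a == last_choice else 0.0)
              for a, q in enumerate(q_values)]
    m = max(logits)                       # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

# With equal values, the stay bonus alone tilts choice toward repetition;
# a larger beta instead sharpens choice around value differences.
p = choice_probs([0.5, 0.5], last_choice=0)
```

In this parameterization, 𝛽 multiplies the value difference while 𝜙 is a constant offset, which is one way to see why, on many tasks, value-based determinism can dominate the perseveration term.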
“…That is, as learning promotes consistent repetition of responses within a state, so too can autocorrelational effects of hysteresis produce response repetition or alternation that coincidentally aligns with rotating states. (For example, perseveration offers a more parsimonious explanation for action repetition that could otherwise be attributed to an optimistic confirmation bias (Frank et al., 2004; Sharot, 2011; Sharot et al., 2011; Thorndike, 1932, 1933); in RL terms, the latter could translate to an asymmetry in learning rates favoring positive over negative outcomes (Cazé & van der Meer, 2013; Daw et al., 2002; Frank et al., 2007, 2009; Niv et al., 2012), but at the cost of susceptibility to overfitting (relative to hysteresis) (Chambon et al., 2020; Gershman, 2016; Katahira, 2015, 2018; Palminteri, 2021; Sugawara & Katahira, 2021).) The baseline hysteresis model includes a dynamic perseveration (or alternation) bias β_t(a) (cf.…”
Section: Computational Modeling: Generalized Reinforcement Learning (mentioning)
confidence: 99%
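A dynamic perseveration bias of the kind denoted β_t(a) above is often modeled as an exponentially decaying choice trace. The update rule below is one common illustrative form and not necessarily the cited model's exact specification; the decay parameter is an assumption.

```python
def update_trace(trace, choice, decay=0.2):
    """Exponentially decaying choice trace: each action's trace decays
    toward zero, and the chosen action's trace is nudged toward one.
    A simple stand-in for a dynamic perseveration bias beta_t(a)
    (illustrative update rule and decay value)."""
    return [(1.0 - decay) * b + (decay if a == choice else 0.0)
            for a, b in enumerate(trace)]

trace = [0.0, 0.0]
trace = update_trace(trace, 0)   # -> [0.2, 0.0]
trace = update_trace(trace, 0)   # repeated choice: trace grows toward 1
```

Adding such a trace to the choice logits (with a positive weight for perseveration, negative for alternation) yields the autocorrelated repetition the quoted passage describes, without any asymmetry in the learning rates themselves.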