2021
DOI: 10.7554/elife.63055
A new model of decision processing in instrumental learning tasks

Abstract: Learning and decision making are interactive processes, yet cognitive modelling of error-driven learning and decision making have largely evolved separately. Recently, evidence accumulation models (EAMs) of decision making and reinforcement learning (RL) models of error-driven learning have been combined into joint RL-EAMs that can in principle address these interactions. However, we show that the most commonly used combination, based on the diffusion decision model (DDM) for binary choice, consistently fails …
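The RL-EAM combination summarized in the abstract can be sketched concretely: a delta-rule (RL) learner updates option values from trial feedback, and the value difference drives the drift rate of a diffusion process (EAM). The following is a toy illustration under assumed parameter values, not the paper's fitted model; the linkage weight `w`, the learning rate, and the reward probabilities are all hypothetical.

```python
import random

def simulate_rl_ddm_trial(q, alpha=0.1, w=3.0, threshold=1.0,
                          noise=1.0, dt=0.001, p_reward=(0.8, 0.2)):
    """One trial of a toy RL-DDM: drift = w * (Q[0] - Q[1]).

    All parameter values are illustrative assumptions.
    """
    drift = w * (q[0] - q[1])
    x, t = 0.0, 0.0
    # Diffusion to one of two bounds (+threshold -> option 0, -threshold -> option 1).
    while abs(x) < threshold:
        x += drift * dt + noise * (dt ** 0.5) * random.gauss(0.0, 1.0)
        t += dt
    choice = 0 if x >= threshold else 1
    reward = 1.0 if random.random() < p_reward[choice] else 0.0
    q[choice] += alpha * (reward - q[choice])  # delta-rule update of the chosen option
    return choice, t, q

random.seed(1)
q = [0.5, 0.5]
for _ in range(300):
    choice, rt, q = simulate_rl_ddm_trial(q)
print(q)  # Q-values should separate toward the assumed reward probabilities
```

Because the drift rate is a function of the learned values, learning changes both choice proportions and response-time distributions over trials, which is the interaction the joint RL-EAM framework is designed to capture.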

Cited by 49 publications (55 citation statements); references 109 publications.
“…However, because our model does not contain a learning mechanism, it does not explain how individuals acquire the cognitive control settings (and timing thereof) that they use to meet task demands. Future work may thus profit from integrating a PBWM-like learning mechanism into our evidence accumulation framework to obtain finer control over the temporal dynamics of reference-back performance (e.g., by having threshold and/or reactive control settings vary from trial to trial as a function of learning; see [65, 66] for examples of such an approach in the domain of instrumental learning). In addition, we speculate that some of the minor misfits (e.g., to empirical switching and comparison costs) of our model were likely due to certain sequential or 'carry-over' effects that are unaccounted for in the current framework, such as proactive interference, priming, task-set inertia/reconfiguration, and Gratton effects arising from previously encountered stimuli and responses [11, 42, 43, 62, 67-72].…”
Section: Discussion
confidence: 99%
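The excerpt's suggestion that control settings could "vary from trial to trial as a function of learning" can be sketched with a simple linkage: the response threshold shrinks as the learned option values separate. The linear form and every parameter value below are illustrative assumptions, not a proposal from the cited work.

```python
# Illustrative linkage: the threshold starts cautious (b0) and shrinks as the
# learned values of the two options separate, floored at b_min. The linear
# form b0 - k * |dQ| is an assumption chosen for simplicity.
def trial_threshold(q_a, q_b, b0=1.5, k=1.0, b_min=0.5):
    separation = abs(q_a - q_b)  # near 0 early in learning, grows with experience
    return max(b_min, b0 - k * separation)

print(trial_threshold(0.5, 0.5))  # early learning: high (cautious) threshold
print(trial_threshold(0.9, 0.1))  # late learning: lower threshold, faster commitment
```

Under a linkage like this, response caution is itself a learned quantity, so the model predicts speeding over the course of training beyond what value learning alone produces.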
“…However, we don't make choices in a vacuum, and our current choices depend on previous choices we have made (Erev & Roth, 2014; Keung, Hagen, & Wilson, 2019; Talluri et al., 2020; Urai, Braun, & Donner, 2017; Urai, de Gee, Tsetsos, & Donner, 2019). One natural way in which choices influence each other is through learning about the options, where evaluations of the outcome of one choice refine the expected value (incorporating range and probability) assigned to that option in future choices (Fontanesi, Gluth, et al., 2019; Fontanesi, Palminteri, et al., 2019; Miletić et al., 2021). Here we focus on a different, complementary way, central to cognitive control research, where evaluations of the process of ongoing and past choices inform the process of future choices (Botvinick et al., 1999; Bugg, Jacoby, & Chanani, 2011; Verguts, Vassena, & Silvetti, 2015).…”
Section: Exerting Control Beyond Our Current Choice
confidence: 99%
“…This framework can be contrasted with the standard assumption of many diffusion-based models, which track only the balance of evidence: information favoring one option relative to another. Theories producing graded estimates based only on the balance of evidence between two options can fail to produce magnitude effects (Teodorescu et al., 2016; Miletić et al., 2021), where increasing the magnitude of both stimuli/choice options (e.g., making both more coherent or easier to see) while maintaining the balance between them speeds up response times (but see Ratcliff et al., 2018, for an approach to this issue that makes variability in the balance of evidence proportional to its mean). Having two dimensions to the evidence accumulation process allows the model to capture the magnitude effect as well as typical difference effects, where adjusting the balance of evidence by manipulating the ratio of support for two options mainly affects the responses that are given rather than response times (Vickers, 2001).…”
Section: Model Overview
confidence: 99%
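The magnitude effect described in this excerpt can be reproduced in a toy two-accumulator race model: raising both drift rates by the same amount preserves the balance of evidence yet speeds responses, because the winning accumulator reaches threshold sooner. The parameters below are assumptions chosen for illustration, not values from any cited model.

```python
import random

def race_trial(v1, v2, threshold=1.0, noise=0.3, dt=0.002):
    """Two noisy accumulators race to a shared threshold; returns (choice, RT)."""
    x1 = x2 = 0.0
    t = 0.0
    sd = noise * dt ** 0.5
    while x1 < threshold and x2 < threshold:
        x1 = max(0.0, x1 + v1 * dt + random.gauss(0.0, sd))  # accumulators floored
        x2 = max(0.0, x2 + v2 * dt + random.gauss(0.0, sd))  # at zero, as in race models
        t += dt
    return (0 if x1 >= threshold else 1), t

def mean_rt(v1, v2, n=500):
    random.seed(0)  # fixed seed so the comparison is reproducible
    return sum(race_trial(v1, v2)[1] for _ in range(n)) / n

low = mean_rt(1.0, 0.5)   # low magnitude, drift difference = 0.5
high = mean_rt(2.0, 1.5)  # higher magnitude, same drift difference
print(low, high)          # the high-magnitude condition responds faster
```

A balance-of-evidence DDM with fixed noise sees only the drift difference, which is identical in the two conditions, so it predicts no response-time change; tracking each option's evidence on its own dimension is what lets the race model produce the speed-up.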