2016
DOI: 10.3758/s13423-016-1199-y
The drift diffusion model as the choice rule in reinforcement learning

Abstract: Current reinforcement-learning models often assume simplified decision processes that do not fully reflect the dynamic complexities of choice processes. Conversely, sequential-sampling models of decision making account for both choice accuracy and response time, but assume that decisions are based on static decision values. To combine these two computational models of decision making and learning, we implemented reinforcement-learning models in which the drift diffusion model describes the choice process, ther…
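The abstract describes coupling a reinforcement-learning value update to a drift diffusion choice rule. The sketch below is a rough illustration of that combination, not the paper's exact parameterization: it assumes the trial-wise drift rate is a scaled difference of learned Q-values, and all parameter names and values are illustrative.

```python
import numpy as np

def simulate_rl_ddm(outcomes, alpha=0.1, scale=2.0, boundary=1.5,
                    non_decision=0.3, dt=0.001, noise=1.0, seed=0):
    """Toy RL-DDM hybrid for a two-armed bandit (illustrative only).

    outcomes: array of shape (n_trials, 2) with the reward each option
        would deliver on that trial.
    On each trial the drift rate is scale * (Q[0] - Q[1]); evidence is
    accumulated to a symmetric boundary, and the chosen option's Q-value
    is then updated with a delta rule.
    Returns per-trial choices (0/1) and response times.
    """
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    choices, rts = [], []
    for trial in np.asarray(outcomes, dtype=float):
        drift = scale * (q[0] - q[1])            # positive drift favours option 0
        evidence, t = 0.0, 0.0
        while abs(evidence) < boundary:          # Euler-Maruyama accumulation
            evidence += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
            t += dt
        choice = 0 if evidence > 0 else 1        # upper boundary -> option 0
        q[choice] += alpha * (trial[choice] - q[choice])   # delta-rule update
        choices.append(choice)
        rts.append(t + non_decision)
    return np.array(choices), np.array(rts)

# Example: option 0 pays off with probability .8, option 1 with .2.
outcomes = np.random.default_rng(1).binomial(1, [0.8, 0.2], size=(200, 2))
choices, rts = simulate_rl_ddm(outcomes)
print("P(choose better option):", np.mean(choices == 0))
```

As the Q-value difference grows with learning, the drift rate grows, so simulated choices become faster and more accurate over trials, which is the qualitative pattern such hybrid models are intended to capture.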

Cited by 238 publications (432 citation statements)
References 104 publications
“…In contrast, in perceptual decision-making, sequential sampling models such as the drift diffusion model (DDM) that not only account for the observed choices but also for the full reaction time distributions have a long tradition [8][9][10]. Recent work in reinforcement learning [11][12][13][14], intertemporal 15,16 and simple value-based choice [17][18][19][20] has shown that sequential sampling models can be successfully applied in these domains.…”
Section: Introduction
confidence: 99%
“…Recent studies on reinforcement learning have similarly used accuracy coding when fitting the DDM 13,14 to describe how choices and response time distributions relate to learned action values. In these studies, the upper boundary was defined as a selection of the stimulus with the objectively better reinforcement rate, and the lower boundary as a selection of the objectively inferior stimulus.…”
Section: Introduction
confidence: 99%
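A small sketch of the accuracy-coding convention described in this excerpt, with hypothetical inputs: stimulus-coded choices are recoded so that the upper boundary corresponds to choosing the option with the objectively better reinforcement rate.

```python
import numpy as np

def accuracy_code(choices, reinforcement_rates):
    """Recode stimulus-coded choices for an accuracy-coded DDM fit.

    choices: chosen option index (0 or 1) on each trial.
    reinforcement_rates: shape (n_trials, 2), objective reinforcement rate
        of each option on that trial.
    Returns 1 where the objectively better option was chosen (upper
    boundary) and 0 otherwise (lower boundary).
    """
    choices = np.asarray(choices)
    better = np.argmax(np.asarray(reinforcement_rates), axis=1)
    return (choices == better).astype(int)

# Example: a stimulus pair with reinforcement rates .8 vs .2 on every trial.
rates = np.tile([0.8, 0.2], (4, 1))
print(accuracy_code([0, 1, 0, 0], rates))   # -> [1 0 1 1]
```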
“…To examine group differences in the parameters of interest (β, φ, and ρ), we examined the posterior distributions of the group-level parameter means. Specifically, we report mean posterior group differences, standardized effect sizes for group differences and Bayes Factors testing for directional effects (Marsman & Wagenmakers, 2017; Pedersen, Frank, & Biele, 2017). Directional Bayes Factors (dBF) were computed as dBF = i / (1 − i), where i is the integral of the posterior distribution of the group difference from 0 to +∞, which we estimated via non-parametric density estimation.…”
Section: Computational Modeling
confidence: 99%
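The directional Bayes factor defined in this excerpt can be approximated from MCMC samples of the group difference. The sketch below uses a Gaussian kernel density estimate (one possible choice of non-parametric density estimation) and integrates it from 0 to +∞; the sample values are synthetic and purely illustrative.

```python
import numpy as np
from scipy.stats import gaussian_kde

def directional_bayes_factor(difference_samples):
    """dBF = i / (1 - i), with i = posterior mass of the group difference
    above zero, estimated by integrating a Gaussian KDE over (0, +inf)."""
    kde = gaussian_kde(np.asarray(difference_samples))
    i = kde.integrate_box_1d(0.0, np.inf)
    return i / (1.0 - i)

# Synthetic posterior samples of a group difference (illustrative only):
samples = np.random.default_rng(2).normal(loc=0.3, scale=0.2, size=4000)
print(directional_bayes_factor(samples))
```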
“…However, most computational models of RL behavior only account for choices but neglect RTs. Therefore, how contextual modulations impact the relations between RTs and accuracy during learning has not been thoroughly explored so far (Summerfield and Tsetsos, 2012, but see Frank et al., 2015; Pedersen et al., 2017).…”
Section: Introduction
confidence: 99%