2016
DOI: 10.1073/pnas.1609094113

Adaptive integration of habits into depth-limited planning defines a habitual-goal–directed spectrum

Abstract: Behavioral and neural evidence reveal a prospective goal-directed decision process that relies on mental simulation of the environment, and a retrospective habitual process that caches returns previously garnered from available choices. Artificial systems combine the two by simulating the environment up to some depth and then exploiting habitual values as proxies for consequences that may arise in the further future. Using a three-step task, we provide evidence that human subjects use such a normative plan-unt…
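
The combination described in the abstract (simulate forward to some depth, then substitute cached habitual values for everything beyond that depth) can be illustrated with a minimal Python sketch. This is not the authors' model or task: the deterministic toy environment, the cached values, and the names `TRANSITIONS`, `HABIT_Q`, `plan_until_habit`, and `choose` are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple

State = str
Action = str

# Hypothetical deterministic toy environment:
# (state, action) -> (immediate reward, next state). "end" is terminal.
TRANSITIONS: Dict[Tuple[State, Action], Tuple[float, State]] = {
    ("s0", "left"): (0.0, "s1"),
    ("s0", "right"): (1.0, "s2"),
    ("s1", "left"): (5.0, "end"),
    ("s1", "right"): (0.0, "end"),
    ("s2", "left"): (0.0, "end"),
    ("s2", "right"): (1.0, "end"),
}

# Hypothetical cached ("habitual") action values. The entries for s0 are
# deliberately stale so that pure habit and planning disagree there.
HABIT_Q: Dict[Tuple[State, Action], float] = {
    ("s0", "left"): 1.0,
    ("s0", "right"): 2.0,
    ("s1", "left"): 5.0,
    ("s1", "right"): 0.0,
    ("s2", "left"): 0.0,
    ("s2", "right"): 1.0,
}


def actions(state: State) -> List[Action]:
    """Actions available in a state (empty for the terminal state)."""
    return sorted(a for (s, a) in TRANSITIONS if s == state)


def plan_until_habit(state: State, action: Action, depth: int) -> float:
    """Estimate the value of `action` in `state` by simulating `depth`
    steps of the environment and using cached values beyond that."""
    if depth == 0:
        # Planning budget exhausted: fall back on the habitual value.
        return HABIT_Q[(state, action)]
    reward, next_state = TRANSITIONS[(state, action)]
    next_actions = actions(next_state)
    if not next_actions:
        # Terminal state reached: nothing further to evaluate.
        return reward
    return reward + max(
        plan_until_habit(next_state, a, depth - 1) for a in next_actions
    )


def choose(state: State, depth: int) -> Action:
    """Pick the action with the highest depth-limited value estimate."""
    return max(actions(state), key=lambda a: plan_until_habit(state, a, depth))
```

In this toy setup, `choose("s0", 0)` relies on the stale cached values alone, whereas `choose("s0", 1)` simulates one step and then consults the cached values at the successor state. Varying `depth` moves the chooser along the kind of habitual-to-goal-directed spectrum the title refers to.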

Cited by 189 publications (209 citation statements)
References 38 publications
“…This implies that both mappings were represented during each trial, demonstrating the existence of multiple components of learning. The relative expression of different components of learning has previously been shown to be influenced by limiting cognitive resources [26,27], including available preparation time [17,20,22,28,29]. However, previous research has manipulated preparation time in a relatively simple 'high-or-low' manner [17,20], or based on spontaneous variations in 'voluntarily' selected reaction times [29].…”
Section: Limiting Reaction Times Unmasks Habitual Behavior
Citation type: mentioning
confidence: 99%
“…A habitually selected response might be only transiently prepared, and later replaced by a more deliberately determined response. Indeed, limiting preparation time has proven to be an effective means of prohibiting deliberate, goal-directed processes from influencing behavior [17,20–22]. We therefore predicted that imposing limited preparation time would unmask such latent habitually selected responses.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…There is now a large body of work on the combination of MF and MS influences, and a collection of forms of MS reasoning, even just in the various versions of the task we studied (8, 10, 14, 16, 27–29). In addition to revealing fundamental features of behavioural strategies, changes in MS influences have been associated with various psychiatric (30–33), neurological (34) and genetic, pharmacological or stimulation-induced manipulations (35–38).…”
Section: Discussion
Citation type: mentioning
confidence: 99%
“…This originally inspired ideas that their output should be combined (8). Recently, rather complex patterns of interaction have been investigated, including MB training of MF (9, 10), MF control over MB calculations (11–13), the incorporation of MF values into MB calculations (14) and, of particular relevance for the present study, the creation of sophisticated, model-dependent, representations of the task that enable MF methods to work more efficiently (15), and potentially less susceptible to distraction (16). We deem these various interactions model-sensitive (MS), saving model-based for the original notion of prospective planning.…”
Section: Introduction
Citation type: mentioning
confidence: 99%
“…When environmental constraints are more lenient and allow additional processing time, a more refined option can be computed and lead the decision maker toward the maximal reward. Hence, a trade-off exists between a costly evidence integration process and obtaining a maximal reward (18). Third, in natural environments the choice reflecting the maximal reward and the choice reflecting the maximal average or addition of rewards will often be the same.…”
Section: Discussion
Citation type: mentioning
confidence: 99%