Theories of reward learning in neuroscience have focused on two families of algorithms, thought to capture deliberative vs. habitual choice. "Model-based" algorithms compute the value of candidate actions from scratch, whereas "model-free" algorithms make choice more efficient but less flexible by storing pre-computed action values. We examine an intermediate algorithmic family, the successor representation (SR), which balances flexibility and efficiency by storing partially computed action values: predictions about future events. These pre-computation strategies differ in how they update their choices following changes in a task. SR's reliance on stored predictions about future states predicts a unique signature of insensitivity to changes in the task's sequence of events, but flexible adjustment following changes to rewards. We provide evidence for such differential sensitivity in two behavioral studies with humans. These results suggest that the SR is a computational substrate for semi-flexible choice in humans, introducing a subtler, more cognitive notion of habit.
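The contrast between reward and transition changes can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the task or model used in the studies: the six-state chain, the reward values, the discount `gamma`, and the helper names `successor_matrix` and `chain` are all hypothetical. The sketch uses the standard closed form of the SR under a fixed policy, M = (I - gamma T)^-1, and computes values as V = M R.

```python
import numpy as np

def successor_matrix(T, gamma):
    """Closed-form SR for a fixed policy: M = sum_t gamma^t T^t = (I - gamma T)^-1."""
    n = T.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * T)

def chain(edges, n):
    """Build a deterministic transition matrix from (state, next_state) pairs.
    States without an outgoing edge are terminal (all-zero row)."""
    T = np.zeros((n, n))
    for s, s_next in edges:
        T[s, s_next] = 1.0
    return T

gamma = 0.5  # hypothetical discount factor

# Hypothetical task: two start states (0, 1), each leading to a terminal state.
T_old = chain([(0, 2), (2, 4), (1, 3), (3, 5)], n=6)
R_old = np.array([0, 0, 0, 0, 10, 1], dtype=float)

# The agent learns and stores the SR once.
M_cached = successor_matrix(T_old, gamma)
v_before = M_cached @ R_old

# Reward revaluation: terminal rewards swap. The stored predictions are still
# valid, so combining them with the new rewards updates values immediately.
R_new = np.array([0, 0, 0, 0, 1, 10], dtype=float)
v_reward_reval = M_cached @ R_new

# Transition revaluation: middle states now lead to the opposite terminals.
# The stored SR is stale, so SR-based values do not change until M is relearned.
T_new = chain([(0, 2), (2, 5), (1, 3), (3, 4)], n=6)
v_sr_stale = M_cached @ R_old
v_model_based = successor_matrix(T_new, gamma) @ R_old  # recomputed from the new model

print("start-state values before any change:", v_before[:2])
print("SR after reward revaluation:         ", v_reward_reval[:2])
print("SR after transition revaluation:     ", v_sr_stale[:2])
print("model-based after transition change: ", v_model_based[:2])
```

In this toy setting the cached SR reverses its start-state preference after the reward swap but keeps its old preference after the transition swap, whereas recomputing from the updated transition model reverses it in both cases.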
How we process ongoing experiences is shaped by our personal history, current needs, and future goals. Consequently, ventromedial prefrontal cortex (vmPFC) activity involved in processing these subjective appraisals appears to be highly idiosyncratic across individuals. To elucidate the role of the vmPFC in processing our ongoing experiences, we developed a computational framework and analysis pipeline to characterize the spatiotemporal dynamics of individual vmPFC responses as participants viewed a 45-minute television drama. Through a combination of functional magnetic resonance imaging, facial expression tracking, and self-reported emotional experiences across four studies, our data suggest that the vmPFC slowly transitions through a series of discretized states that broadly map onto affective experiences. Although these transitions typically occur at idiosyncratic times across people, participants exhibited a marked increase in state alignment during high affectively valenced events in the show. Our work suggests that the vmPFC ascribes affective meaning to our ongoing experiences.
How we process ongoing experiences is shaped by our personal history, current needs, and future goals. Consequently, brain regions involved in generating these subjective appraisals, such as the vmPFC, often appear to be heterogeneous across individuals even in response to the same external information. To elucidate the role of the vmPFC in processing our ongoing experiences, we developed a computational framework and analysis pipeline to characterize the spatiotemporal dynamics of individual vmPFC responses as participants viewed a 45-minute television drama. Through a combination of functional magnetic resonance imaging, facial expression tracking, and self-reported emotional experiences across four studies, our data suggest that the vmPFC slowly transitions through a series of discretized states that broadly map onto affective experiences. Although these transitions typically occur at idiosyncratic times across people, participants exhibited a marked increase in state alignment during high affectively valenced events in the show. Our work suggests that the vmPFC ascribes affective meaning to our ongoing experiences.
Time is an extremely valuable resource but little is known about the efficiency of time allocation in decision-making. Empirical evidence suggests that in many ecologically relevant situations, decision difficulty and the relative reward from making a correct choice, compared to an incorrect one, are inversely linked, implying that it is optimal to use relatively less time for difficult choice problems. This applies, in particular, to value-based choices, in which the relative reward from choosing the higher valued item shrinks as the values of the other options get closer to the best option and are thus more difficult to discriminate. Here, we experimentally show that people behave sub-optimally in such contexts. They do not respond to incentives that favour the allocation of time to choice problems in which the relative reward for choosing the best option is high; instead they spend too much time on problems in which the reward difference between the options is low. We demonstrate this by showing that it is possible to improve subjects' time allocation with a simple intervention that cuts them off when their decisions take too long. Thus, we provide a novel form of evidence that organisms systematically spend their valuable time in an inefficient way, and simultaneously offer a potential solution to the problem.