Reward-based training of recurrent neural networks for cognitive and value-based tasks

Song, Hui; Yang, Guangyu Robert; Wang, Xiaojing

doi:10.1101/070375

Cited by 25 publications

(45 citation statements)

References 81 publications

“…Moreover, trial-by-trial variability in the activity of each group of neurons correlates with variability in choices (Padoa-Schioppa, 2013; Conen and Padoa-Schioppa, 2015). Computational models show that the cell groups identified in OFC are sufficient to generate binary choices (Rustichini and Padoa-Schioppa, 2015; Friedrich and Lengyel, 2016; Song et al, 2017; Zhang et al, 2018), and the population dynamics is consistent with decision making (Rich and Wallis, 2016). These results suggest that economic decisions might be formed within the OFC (Padoa-Schioppa and , but causality has not been established.…”

Section: Introduction (mentioning)

confidence: 99%

Values Encoded in Orbitofrontal Cortex Are Causally Related to Economic Choices

Ballesta

Shi

Conen

et al. 2020

Preprint

It has long been hypothesized that economic choices rely on the assignment and comparison of subjective values. Indeed, when agents make decisions, neurons in orbitofrontal cortex encode the values of offered and chosen goods. Moreover, neuronal activity in this area suggests the formation of a decision. However, it is unclear whether these neural processes are causally related to choices. More generally, the evidence linking economic choices to value signals in the brain remains correlational. We address this fundamental issue using electrical stimulation in rhesus monkeys. We show that suitable currents bias choices by increasing the value of individual offers. Furthermore, high-current stimulation disrupts both the computation and the comparison of subjective values. These results demonstrate that values encoded in orbitofrontal cortex are causal to economic choices.

“…Furthermore, neuronal dynamics in OFC during economic decisions reflect an internal deliberation [16]. Complementing these experimental findings, theoretical work showed that neural networks whose units match the cell groups identified in OFC can generate binary decisions (Fig. 1ab) [17][18][19][20][21]. Collectively, these results appear to lay the foundations for a satisfactory understanding of the mechanisms underlying economic decisions.…”

Section: Introduction (mentioning)

confidence: 69%

Mechanisms of Economic Decisions under Sequential Offers

Ballesta

Padoa-Schioppa

2019

Preprint

Manuscript information: 160 words in abstract, 4690 words in main text, 7 figures, 1 table, supplementary information in separate file.

Binary choices between goods are thought to take place in orbitofrontal cortex (OFC). However, current notions emerged mostly from studies where two offers were presented simultaneously. Other work suggested that choices under sequential offers rely on fundamentally different mechanisms. Here we recorded from the OFC of macaques choosing between two juices offered sequentially. Analyzing neuronal responses across time windows, we discovered different groups of neurons that closely resemble those identified under simultaneous offers, suggesting that decisions in the two modalities are formed in the same neural circuit. Building on this result, we examined five hypotheses on the decision mechanisms. OFC neurons encoded goods and values in a juice-based representation (labeled lines). Contrary to previous assessments, decisions did not involve mutual inhibition between pools of offer value cells. Instead, decisions involved mechanisms of circuit inhibition, whereby each offer value indirectly inhibited neurons encoding the opposite choice outcome. These results reconcile seemingly disparate findings and provide a unitary account for economic decisions.

“…Clopath, 2017]. Another approach may be to use a reinforcement learning paradigm, rather than the gradient of an error signal, to train the network system [Song et al, 2017]. However, the latent dynamics underlying task learning uncovered here evolve over a longer timescale than the learning dynamics underlying network training, so findings are likely not impacted by different training protocols.…”

Section: Discussion (mentioning)

confidence: 99%

Learning to select actions shapes recurrent dynamics in the corticostriatal system

Márton

Schultz

Averbeck

2019

Preprint

Learning to select appropriate actions based on their values is fundamental to adaptive behavior. This form of learning is supported by fronto-striatal systems. The dorsal-lateral prefrontal cortex (dlPFC) and the dorsal striatum (dSTR), which are strongly interconnected, are key nodes in this circuitry. Substantial experimental evidence, including neurophysiological recordings, has shown that neurons in these structures represent key aspects of learning. The computational mechanisms that shape the neurophysiological responses, however, are not clear. To examine this, we developed a recurrent neural network (RNN) model of the dlPFC-dSTR circuit and trained it on an oculomotor sequence learning task. We compared the activity generated by the model to activity recorded from monkey dlPFC and dSTR in the same task. This network consisted of a striatal component which encoded action values, and a prefrontal component which selected appropriate actions. After training, this system was able to autonomously represent and update action values and select actions, thus being able to closely approximate the representational structure in corticostriatal recordings. We found that learning to select the correct actions drove action-sequence representations further apart in activity space, both in the model and in the neural data. The model revealed that learning proceeds by increasing the distance between sequence-specific representations. This makes it more likely that the model will select the appropriate action sequence as learning develops. Our model thus supports the hypothesis that learning in networks drives the neural representations of actions further apart, increasing the probability that the network generates correct actions as learning proceeds. Altogether, this study advances our understanding of how neural circuit dynamics are involved in neural computation, showing how dynamics in the corticostriatal system support task learning.
