We studied the choice behavior of two monkeys in a discrete-trial task with reinforcement contingencies similar to those Herrnstein (1961) used when he described the matching law. In each session, the monkeys experienced blocks of discrete trials at different relative-reinforcer frequencies or magnitudes, with unsignaled transitions between the blocks. Steady-state data following adjustment to each transition were well characterized by the generalized matching law; response ratios undermatched reinforcer frequency ratios but matched reinforcer magnitude ratios. We modeled response-by-response behavior with linear models that used past reinforcers as well as past choices to predict the monkeys' choices on each trial. We found that more recently obtained reinforcers more strongly influenced choice behavior. Perhaps surprisingly, we also found that the monkeys' actions were influenced by the pattern of their own past choices. Incorporating both past reinforcers and past choices was necessary to accurately capture steady-state behavior as well as the fluctuations during block transitions and the response-by-response patterns of behavior. Our results suggest that simple reinforcement learning models must account for the effects of past choices to accurately characterize behavior in this task, and that models with these properties provide a conceptual tool for studying how both past reinforcers and past choices are integrated by the neural systems that generate behavior.
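To make the modeling approach concrete, the sketch below computes the log-odds of a choice from weighted sums of past reinforcers and past choices, assuming exponentially decaying weights over the last ten trials. The decay constants, kernel amplitudes, and toy histories are illustrative assumptions, not the values fitted in the study.

```python
import numpy as np

# Illustrative linear choice model (assumed form, not the fitted model):
# log-odds of choosing option 1 = weighted sum of recent reinforcer
# differences plus a weighted sum of recent choice differences.
def choice_logodds(reinf_hist, choice_hist, w_reinf, w_choice):
    """Histories are ordered most recent first; reinf_hist[k] is the
    reinforcer difference (option 1 minus option 2) k+1 trials back,
    and choice_hist[k] is the corresponding choice (+1 or -1)."""
    return np.dot(w_reinf, reinf_hist) + np.dot(w_choice, choice_hist)

n_back = 10
# Decaying reinforcer weights capture the finding that recent reinforcers
# influence choice more strongly than older ones; a nonzero choice kernel
# captures the influence of the animal's own past choices.
w_reinf = 1.0 * np.exp(-np.arange(n_back) / 3.0)    # assumed decay constant
w_choice = -0.3 * np.exp(-np.arange(n_back) / 2.0)  # assumed choice kernel

rng = np.random.default_rng(0)
reinf_hist = rng.integers(-1, 2, n_back)   # toy -1/0/+1 reinforcer history
choice_hist = rng.choice([-1, 1], n_back)  # toy history of past choices

logodds = choice_logodds(reinf_hist, choice_hist, w_reinf, w_choice)
print(f"P(choose option 1) = {1.0 / (1.0 + np.exp(-logodds)):.2f}")
```

A negative choice kernel, as assumed here, biases the model toward alternation; a positive one would produce perseveration.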
Choosing the most valuable course of action requires knowing the outcomes associated with the available alternatives. The striatum may be important for representing the values of actions. We examined this in monkeys performing an oculomotor choice task. The activity of phasically active neurons (PANs) in the striatum covaried with two classes of information: action-values and chosen-values. Action-value PANs were correlated with value estimates for one of the available actions, and these signals were frequently observed before movement execution. Chosen-value PANs were correlated with the value of the action that had been chosen, and these signals were primarily observed later in the task, immediately before or persistently after movement execution. These populations may serve distinct functions mediated by the striatum: some PANs may participate in choice by encoding the values of the available actions, while other PANs may participate in evaluative updating by encoding the reward value of chosen actions.
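As a toy illustration of how such populations might be distinguished (the simulation and regressor names below are assumptions for exposition, not the paper's analysis pipeline), one can regress a neuron's trial-by-trial firing rates on action-value and chosen-value regressors and inspect which coefficients are reliably nonzero:

```python
import numpy as np

# Simulate an "action-value" neuron whose rate tracks the value of one
# action (here, the leftward saccade), then recover that coding pattern
# by multiple linear regression. All names and parameters are invented.
rng = np.random.default_rng(1)
n_trials = 500
q_left = rng.uniform(0, 1, n_trials)    # value estimate, leftward action
q_right = rng.uniform(0, 1, n_trials)   # value estimate, rightward action
choice = q_left + 0.2 * rng.standard_normal(n_trials) > q_right
q_chosen = np.where(choice, q_left, q_right)

rate = 5.0 + 8.0 * q_left + rng.standard_normal(n_trials)  # spikes/s

X = np.column_stack([np.ones(n_trials), q_left, q_right, q_chosen])
beta, *_ = np.linalg.lstsq(X, rate, rcond=None)
print("betas (intercept, q_left, q_right, q_chosen):", np.round(beta, 2))
# An action-value neuron loads on q_left (or q_right) alone; a
# chosen-value neuron would instead load on q_chosen.
```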
Making appropriate choices often requires the ability to learn the value of available options from experience. Parkinson’s disease is characterized by a loss of dopamine neurons in the substantia nigra, neurons hypothesized to play a role in reinforcement learning. Although previous studies have shown that Parkinson’s patients are impaired in tasks involving learning from feedback, they have not directly tested the widely held hypothesis that dopamine neuron activity specifically encodes the reward prediction error signal used in reinforcement learning models. To test a key prediction of this hypothesis, we fit choice behavior from a dynamic foraging task with reinforcement learning models and show that treatment with dopaminergic drugs alters choice behavior in a manner consistent with the theory. More specifically, we found that dopaminergic drugs selectively modulate learning from positive outcomes. We observed no effect of dopaminergic drugs on learning from negative outcomes. We also found a novel dopamine-dependent effect on decision making that is not accounted for by reinforcement learning models: perseveration in choice, independent of reward history, increases with Parkinson’s disease and decreases with dopamine therapy.
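A minimal sketch of a model with the two ingredients highlighted here, under assumed functional forms and arbitrary parameter values (not the authors' fitted model): separate learning rates for positive and negative prediction errors, plus a reward-independent perseveration bonus for repeating the previous choice.

```python
import numpy as np

def simulate(alpha_pos, alpha_neg, beta, persev, p_reward,
             n_trials=1000, seed=0):
    """Q-learning on a two-armed bandit with asymmetric learning rates
    and a reward-independent perseveration term."""
    rng = np.random.default_rng(seed)
    q = np.zeros(2)
    prev = -1
    choices = np.empty(n_trials, dtype=int)
    for t in range(n_trials):
        logits = beta * q
        if prev >= 0:
            logits[prev] += persev            # choice "stickiness"
        p = np.exp(logits - logits.max())
        p /= p.sum()
        c = rng.choice(2, p=p)
        r = float(rng.random() < p_reward[c])
        delta = r - q[c]                      # reward prediction error
        q[c] += (alpha_pos if delta > 0 else alpha_neg) * delta
        choices[t] = prev = c
    return choices

# Following the abstract's findings, medication is modeled as a larger
# positive learning rate (alpha_neg unchanged) and less perseveration.
on_med = simulate(alpha_pos=0.5, alpha_neg=0.2, beta=4.0, persev=0.2,
                  p_reward=[0.7, 0.3])
off_med = simulate(alpha_pos=0.2, alpha_neg=0.2, beta=4.0, persev=0.8,
                   p_reward=[0.7, 0.3])
print("P(better option) on meds:", (on_med == 0).mean(),
      "off meds:", (off_med == 0).mean())
```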
A crucial step in understanding the function of a neural circuit in visual processing is to know what stimulus features are represented in the spiking activity of the neurons. For neurons with complex, nonlinear response properties, characterization of feature representation requires measurement of their responses to a large ensemble of visual stimuli and an analysis technique that allows identification of relevant features in the stimuli. In the present study, we recorded the responses of complex cells in the primary visual cortex of the cat to spatiotemporal random-bar stimuli and applied spike-triggered correlation analysis of the stimulus ensemble. For each complex cell, we were able to isolate a small number of relevant features from a large number of null features in the random-bar stimuli. Using these features as visual stimuli, we found that each relevant feature excited the neuron effectively in isolation and contributed to the response additively when combined with other features. In contrast, the null features evoked little or no response in isolation and divisively suppressed the responses to relevant features. Thus, for each cortical complex cell, visual inputs can be decomposed into two distinct types of features (relevant and null), and additive and divisive interactions between these features may constitute the basic operations in visual cortical processing.
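The general technique can be illustrated with a toy spike-triggered covariance analysis (the stimulus dimensionality, hidden filter, and spiking model below are invented for the example, not taken from the study):

```python
import numpy as np

rng = np.random.default_rng(2)
n_frames, dim = 50_000, 16
stim = rng.standard_normal((n_frames, dim))  # white-noise "bar" stimulus

# Complex-cell-like toy neuron: spike probability follows the energy of
# the stimulus projection onto one hidden relevant feature.
feature = np.sin(np.linspace(0, np.pi, dim))
feature /= np.linalg.norm(feature)
drive = (stim @ feature) ** 2
spikes = rng.random(n_frames) < drive / drive.max()

# Difference between the spike-triggered and raw stimulus covariance:
# relevant features appear as eigenvectors with outlying eigenvalues,
# while null features cluster near zero.
stc = np.cov(stim[spikes], rowvar=False) - np.cov(stim, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(stc)
recovered = eigvecs[:, -1]                   # largest-eigenvalue axis
print(f"overlap with hidden feature: {abs(recovered @ feature):.2f}")
```

For a real complex cell, several eigenvectors would typically stand out, corresponding to the small set of relevant features described above.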
The orbitofrontal cortex (OFC) and amygdala are thought to participate in reversal learning, a process in which cue-outcome associations are switched. However, current theories disagree on whether the OFC directs reversal learning in the amygdala. Here, we show that during reversal of cues' associations with rewarding and aversive outcomes, neurons that respond preferentially to stimuli predicting aversive events update more quickly in the amygdala than in the OFC; meanwhile, OFC neurons that respond preferentially to reward-predicting stimuli update more quickly than those in the amygdala. After learning, however, the OFC consistently differentiates between impending reinforcements with a shorter latency than the amygdala. Finally, analysis of local field potentials (LFPs) reveals a disproportionate influence of the OFC on the amygdala that emerges after learning. We propose that reversal learning is supported by complex interactions between neural circuits spanning the amygdala and OFC, rather than directed by any single structure.
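One common way to quantify such directional influence from paired LFP recordings is a time-domain Granger-style measure; the sketch below is a generic illustration on simulated signals and does not reproduce the paper's actual LFP analysis or parameters.

```python
import numpy as np

def lagged(sig, order, start, end):
    """Design matrix of lags 1..order for samples start..end-1."""
    return np.column_stack(
        [sig[start - k:end - k] for k in range(1, order + 1)])

def resid_var(X, y):
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.var(y - X @ beta)

def granger(src, dst, order=5):
    """Log variance ratio: > 0 means src's past improves prediction
    of dst beyond dst's own past."""
    start, end = order, len(dst)
    y = dst[start:end]
    own = lagged(dst, order, start, end)
    both = np.column_stack([own, lagged(src, order, start, end)])
    return np.log(resid_var(own, y) / resid_var(both, y))

# Toy signals in which "ofc" drives "amy" at a one-sample lag.
rng = np.random.default_rng(3)
n = 5000
ofc = rng.standard_normal(n)
amy = np.zeros(n)
for t in range(1, n):
    amy[t] = (0.5 * amy[t - 1] + 0.4 * ofc[t - 1]
              + 0.3 * rng.standard_normal())

print(f"OFC -> amygdala: {granger(ofc, amy):.3f}")
print(f"amygdala -> OFC: {granger(amy, ofc):.3f}")
```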