Neural activity in the primary motor cortex (M1) is known to correlate with movement-related variables, including kinematics and dynamics. Our recent work, which we believe is part of a paradigm shift in sensorimotor research, has shown that in addition to these movement-related variables, activity in M1 and the primary somatosensory cortex (S1) is also modulated by context, such as value, during both active movement and movement observation. Here we expand on the investigation of reward modulation in M1, showing that reward level changes the neural tuning function of M1 units to both kinematic and dynamic variables. In addition, we show that this reward-modulated activity is present during brain-machine interface (BMI) control. We suggest that by taking into account these context dependencies of M1 modulation, we can produce more robust BMIs. Toward this goal, we demonstrate that we can classify reward expectation from M1 on a movement-by-movement basis under BMI control and use this classification to gate multiple linear BMI decoders, improving offline performance. These findings demonstrate that it is possible and meaningful to design a more accurate BMI decoder that takes reward and context into consideration. Our next step in this development will be to incorporate this gating system, or a continuous variant of it, into online BMI control.
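The gating idea described above can be illustrated with a minimal sketch on synthetic data. Everything here is an assumption made for illustration and is not taken from the paper: the simulated firing rates, the reward-dependent linear mapping, the nearest-centroid reward classifier, and the least-squares decoders. The sketch only shows the general pattern of classifying reward expectation per trial and then selecting a reward-specific linear decoder.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_bins, n_units = 200, 20, 15

# Synthetic data (assumption): reward shifts mean firing rates and changes
# the linear relationship between rates and 2-D cursor velocity.
reward = rng.integers(0, 2, size=n_trials)             # 0 = no reward, 1 = reward
rates = rng.normal(size=(n_trials, n_bins, n_units)) + 0.5 * reward[:, None, None]
weights = rng.normal(size=(2, n_units, 2))             # one mapping per reward level
velocity = np.einsum("tbu,tuk->tbk", rates, weights[reward])
velocity += 0.1 * rng.normal(size=velocity.shape)

def add_bias(x):
    """Append a constant column so the decoder can learn an offset."""
    return np.concatenate([x, np.ones((*x.shape[:-1], 1))], axis=-1)

def fit_linear(x, y):
    """Least-squares linear decoder from binned rates to velocity."""
    x2d = add_bias(x.reshape(-1, x.shape[-1]))
    y2d = y.reshape(-1, y.shape[-1])
    coef, *_ = np.linalg.lstsq(x2d, y2d, rcond=None)
    return coef

def predict(coef, x):
    return add_bias(x) @ coef

train, test = slice(0, 150), slice(150, None)

# Reward classifier (assumption): nearest centroid on trial-averaged rates.
trial_mean = rates.mean(axis=1)
centroids = np.stack(
    [trial_mean[train][reward[train] == c].mean(axis=0) for c in (0, 1)]
)
dists = np.linalg.norm(trial_mean[test][:, None, :] - centroids[None], axis=-1)
pred_reward = dists.argmin(axis=1)
reward_accuracy = np.mean(pred_reward == reward[test])

# One decoder per reward level, plus a single decoder fit on all trials.
decoders = [
    fit_linear(rates[train][reward[train] == c], velocity[train][reward[train] == c])
    for c in (0, 1)
]
single = fit_linear(rates[train], velocity[train])

# Gate: the predicted reward level selects which decoder is applied to each trial.
gated_pred = np.stack(
    [predict(decoders[c], x) for c, x in zip(pred_reward, rates[test])]
)
single_pred = predict(single, rates[test])

mse_gated = np.mean((gated_pred - velocity[test]) ** 2)
mse_single = np.mean((single_pred - velocity[test]) ** 2)
print(f"reward accuracy={reward_accuracy:.2f}, "
      f"gated MSE={mse_gated:.3f}, single MSE={mse_single:.3f}")
```

In this toy setting the gated decoders win because the simulated mapping differs strongly between reward levels. How large the benefit is on real M1 recordings depends on how much the tuning actually shifts with reward.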
Reward modulation in M1 could be exploited in developing an autonomously updating brain-computer interface (BCI) based on a reinforcement learning (RL) architecture. For an autonomously updating RL-based BCI system, we would need a reward prediction error or a state-value representation from the user’s neural activity, which the RL-BCI agent could use to update its BCI decoder. To understand the multifaceted effects of reward on M1 activity, we investigated how neural spiking, oscillatory activity, and their functional interactions are modulated by reward expectation elicited by conditioned stimuli. To do so, local field potentials (LFPs) and single/multi-unit activity were recorded simultaneously and bilaterally from M1 while four non-human primates (NHPs) either performed cued center-out reaching or grip force tasks manually using their right arm/hand or passively observed them. We found that reward expectation influenced the strength of α (8–14 Hz) power, α-γ comodulation, α spike-field coherence (SFC), and firing rates (FRs) in general in M1. Furthermore, we found that an increase in α-band power was correlated with a decrease in neural spiking activity, and that FRs were highest at the trough of the α-band cycle and lowest at its peak. These findings imply that α oscillations modulated by reward expectation influence spike FR and spike timing in M1 during both reaching and grasping tasks. These LFP, spike, and spike-field interactions could be used to follow the M1 neural state in order to enhance BCI decoding (An et al., 2018; Zhao et al., 2018).
Encoding of reward valence has been shown in various brain regions, including deep structures such as the substantia nigra as well as cortical structures such as the orbitofrontal cortex. While the correlation between these signals and reward valence has been shown in aggregated data comprising many trials, little work has been done investigating the feasibility of decoding reward valence on a single-trial basis. Towards this goal, one non-human primate (Macaca radiata) was trained to grip and hold a target level of force in order to earn zero, one, two, or three juice rewards. The animal was informed of the impending result before reward delivery by means of a visual cue. Neural data were recorded from primary somatosensory cortex (S1) during these experiments, and firing rate histograms were created following the appearance of the visual cue and used as input to a variety of classifiers. Reward valence was decoded with high levels of accuracy from S1 in both the post-cue and post-reward periods. Additionally, the proportion of units showing significant changes in their firing rates varied in a predictable way with reward valence. The existence of a signal within S1 that encodes reward valence could have utility for implementing reinforcement learning algorithms for brain-machine interfaces. The ability to decode this reward signal in real time with limited data is paramount to the usability of such a signal in practical applications.
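As a rough illustration of single-trial decoding from post-cue firing rate histograms, the sketch below simulates spike times, bins them into histograms, and classifies the reward level of each trial. The Poisson spike model, the bin width, the unit gains, and the nearest-centroid classifier are all assumptions chosen for simplicity. The abstract does not state which classifiers or histogram settings were used.

```python
import numpy as np

rng = np.random.default_rng(1)
n_trials, n_units, n_levels = 240, 20, 4    # reward levels: 0 to 3 juice rewards
window, n_bins = 0.5, 5                     # 0.5 s post-cue window, 100 ms bins (assumed)
edges = np.linspace(0.0, window, n_bins + 1)

levels = rng.integers(0, n_levels, size=n_trials)
base_rate = 12.0                                   # Hz, hypothetical baseline
gain = rng.uniform(-3.0, 6.0, size=n_units)        # Hz per reward level, hypothetical

def trial_histogram(level):
    """Simulate one trial and return concatenated per-unit rate histograms."""
    feats = []
    for u in range(n_units):
        rate = base_rate + gain[u] * level
        n_spikes = rng.poisson(rate * window)
        spike_times = rng.uniform(0.0, window, size=n_spikes)
        counts, _ = np.histogram(spike_times, bins=edges)
        feats.append(counts / (window / n_bins))   # convert counts to Hz
    return np.concatenate(feats)

features = np.stack([trial_histogram(level) for level in levels])

# Hold out the last 60 trials; each held-out trial is classified on its own.
n_train = 180
train_x, train_y = features[:n_train], levels[:n_train]
test_x, test_y = features[n_train:], levels[n_train:]

# Nearest-centroid classifier: one mean histogram per reward level.
centroids = np.stack(
    [train_x[train_y == c].mean(axis=0) for c in range(n_levels)]
)
dists = np.linalg.norm(test_x[:, None, :] - centroids[None], axis=-1)
predicted = dists.argmin(axis=1)
accuracy = np.mean(predicted == test_y)
print(f"single-trial accuracy: {accuracy:.2f} (chance = {1 / n_levels:.2f})")
```

Each held-out trial is classified from its own histogram alone, which is the sense of "single trial" here. A real analysis would typically use cross-validation and compare several classifiers.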