2010
DOI: 10.1523/jneurosci.6249-09.2010

Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity

Abstract: Recent experiments have shown that spike-timing-dependent plasticity is influenced by neuromodulation. We derive theoretical conditions for successful learning of reward-related behavior for a large class of learning rules where Hebbian synaptic plasticity is conditioned on a global modulatory factor signaling reward. We show that all learning rules in this class can be separated into a term that captures the covariance of neuronal firing and reward and a second term that presents the influence of unsupervised…

citations
Cited by 139 publications
(213 citation statements)
references
References 50 publications
“…If reinforcement signals (e.g., dopamine) were mediated via causal inverse models instead of being released nonspecifically as assumed in many computational models (46,47), then motor learning could be more efficient, in accordance with model-based reinforcement strategies (48). Indeed, simple reinforcement learning strategies can be enhanced with inverse models as a means to solve the structural credit assignment problem inherent in reinforcement learning (49).…”
Section: Discussion | Citation type: mentioning
confidence: 99%
“…In STDP, the synaptic weight increases if the input spike arrives before the output spike and decreases if the input spike arrives after the output spike. Exponential forgetting of the sensory history under the entropy-minimization rule (4.1) leads to modulation of the weight changes (3.3) by the values of the reinforcement signal [11], a phenomenon known as modulated STDP, which is also observed in biological neurons and has been studied in [37][38][39].…”
Section: Modulated reduction of information entropy | Citation type: unclassified
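The timing rule described in the statement above can be illustrated with a minimal sketch. It assumes a pair-based exponential STDP window and a scalar reinforcement signal that multiplies the weight change. The function names, the amplitudes `a_plus` and `a_minus`, the time constant `tau_ms`, and the learning rate are all hypothetical choices for illustration and are not taken from the cited works.

```python
import math


def stdp_window(dt_ms, a_plus=1.0, a_minus=1.0, tau_ms=20.0):
    """Pair-based STDP window (illustrative assumption, not the cited rule).

    dt_ms is t_post - t_pre in milliseconds.
    Positive dt (input spike before output spike) gives potentiation,
    negative dt (input spike after output spike) gives depression.
    """
    if dt_ms > 0:
        return a_plus * math.exp(-dt_ms / tau_ms)
    if dt_ms < 0:
        return -a_minus * math.exp(dt_ms / tau_ms)
    return 0.0


def modulated_weight_change(dt_ms, reinforcement, learning_rate=0.01):
    """Scale the STDP window by a global reinforcement signal (assumed scalar)."""
    return learning_rate * reinforcement * stdp_window(dt_ms)


if __name__ == "__main__":
    for dt in (-10.0, 10.0):
        for r in (-1.0, 0.0, 1.0):
            dw = modulated_weight_change(dt, r)
            print(f"dt={dt:+.0f} ms, reinforcement={r:+.0f}: dw={dw:+.5f}")
```

In this sketch a zero reinforcement signal suppresses the weight change entirely, and a negative signal reverses its sign. This is one simple way to express modulation and not necessarily the form used in the cited papers.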
“…As opposed to the algorithms proposed by Pfeiffer et al (2010), Legenstein et al (2010) and Frémaux et al (2010), the present work neither devises a learning rule for optimal weight tuning nor proposes a new reinforcement learning algorithm. In fact, while reinforcement learning by means of modulated spike-timing dependent plasticity (STDP) was demonstrated in Soula et al (2005), Florian (2007) and Frémaux et al (2010), the primary aim of this work is the exploitation of saturated weights and neural noise to achieve a simple bottom-up implementation of operant reward learning.…”
Section: Introduction | Citation type: mentioning
confidence: 98%
“…The change in modulatory activity has in fact been suggested to regulate the alternation of exploration and exploitation in Krichmar (2008). Thus, while the dynamics of modulated Hebbian plasticity and modulated spike-time-dependent plasticity (STDP) have been extensively investigated (Abbott, 1990; Montague et al, 1996; Florian, 2007; Porr and Wörgötter, 2007; Frémaux et al, 2010; Pfeiffer et al, 2010), the novelty of this work is their extension by means of saturation and noise, resulting in a simpler and more fundamental connection between local changes and higher-level simulated behavior. The fundamental properties of the new plasticity model are tested in behavioral tasks employing first a single-neuron model, and later extended to multi-neuron networks.…”
Section: Introduction | Citation type: mentioning
confidence: 99%