Functional Requirements for Reward-Modulated Spike-Timing-Dependent Plasticity

Frémaux, Nicolas; Sprekeler, Henning; Gerstner, Wulfram

doi:10.1523/jneurosci.6249-09.2010

Cited by 139 publications

(213 citation statements)

References 50 publications

Supporting

Mentioning: 205

Contrasting

Unclassified


“…If reinforcement signals (e.g., dopamine) were mediated via causal inverse models instead of being released nonspecifically as assumed in many computational models (46,47), then motor learning could be more efficient, in accordance with model-based reinforcement strategies (48). Indeed, simple reinforcement learning strategies can be enhanced with inverse models as a means to solve the structural credit assignment problem inherent in reinforcement learning (49).…”

Section: Discussion (mentioning)

confidence: 99%

Evidence for a causal inverse model in an avian cortico-basal ganglia circuit

Giret, Kornfeld, Ganguli, et al. 2014

Proc. Natl. Acad. Sci. U.S.A.


Learning by imitation is fundamental to both communication and social behavior and requires the conversion of complex, nonlinear sensory codes for perception into similarly complex motor codes for generating action. To understand the neural substrates underlying this conversion, we study sensorimotor transformations in songbird cortical output neurons of a basal-ganglia pathway involved in song learning. Despite the complexity of sensory and motor codes, we find a simple, temporally specific, causal correspondence between them. Sensory neural responses to song playback mirror motor-related activity recorded during singing, with a temporal offset of roughly 40 ms, in agreement with short feedback loop delays estimated using electrical and auditory stimulation. Such matching of mirroring offsets and loop delays is consistent with a recent Hebbian theory of motor learning and suggests that cortico-basal ganglia pathways could support motor control via causal inverse models that can invert the rich correspondence between motor exploration and sensory feedback.

Keywords: lateral magnocellular nucleus of the anterior nidopallium | Hebbian learning | mirror neuron


“…In STDP, the synaptic weight increases if the input spike arrived before the output spike, and decreases if the input spike arrived after the output spike. Exponential forgetting of the sensory history when applying the entropy-minimization rule (4.1) leads to modulation of the weight changes (3.3) by the values of the reinforcement signal [11], the phenomenon of modulated STDP, which is also observed in biological neurons and has been studied in [37][38][39].…”

Section: Modulated reduction of information entropy (unclassified)

Reinforcement learning of a spiking neural network in the task of control of an agent in a virtual discrete environment

Sinyavskiy, Kobrin 2011

Nelin. Dinam.
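
The citation statement above describes the basic STDP sign rule and its modulation by a reinforcement signal. A minimal sketch of that idea follows, assuming a pair-based exponential STDP window and a scalar reward that multiplies the summed pair contributions. The function names, the time constant `tau`, the amplitudes, and the learning rate `lr` are illustrative assumptions; this is not the entropy-minimization rule of the cited paper, nor the exact model analysed by Frémaux et al.

```python
import math


def stdp_window(dt, a_plus=1.0, a_minus=1.0, tau=20.0):
    """Pair-based STDP window (hypothetical parameters).

    dt = t_post - t_pre in ms. Positive dt (input spike before output
    spike) gives potentiation; negative dt gives depression.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau)
    if dt < 0:
        return -a_minus * math.exp(dt / tau)
    return 0.0


def modulated_weight_change(pre_times, post_times, reward, lr=0.01):
    """Sum the STDP window over all spike pairs, then scale by the reward.

    The unmodulated sum plays the role of an eligibility term; the scalar
    reinforcement signal sets the sign and size of the final change.
    """
    eligibility = sum(
        stdp_window(t_post - t_pre)
        for t_pre in pre_times
        for t_post in post_times
    )
    return lr * reward * eligibility


if __name__ == "__main__":
    # Input spike at 0 ms, output spike at 10 ms: a causal pairing.
    print(modulated_weight_change([0.0], [10.0], reward=1.0))
    print(modulated_weight_change([0.0], [10.0], reward=-1.0))
```

With a causal pairing, a positive reward strengthens the synapse and a negative reward weakens it; a reward of zero leaves the weight unchanged, which is the sense in which the reinforcement signal gates plasticity in this sketch.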


“…As opposed to the algorithms proposed by Pfeiffer et al (2010), Legenstein et al (2010) and Frémaux et al (2010), the present work neither devises a learning rule for optimal weight tuning nor proposes a new reinforcement learning algorithm. In fact, while reinforcement learning by means of modulated spike-timing dependent plasticity (STDP) was demonstrated in Soula et al (2005), Florian (2007) and Frémaux et al (2010), the primary aim of this work is the exploitation of saturated weights and neural noise to achieve a simple bottom-up implementation of operant reward learning.…”

Section: Introduction (mentioning)

confidence: 98%

“…The change in modulatory activity has in fact been suggested to regulate the alternation of exploration and exploitation in Krichmar (2008). Thus, while the dynamics of modulated Hebbian plasticity and modulated spike-time-dependent plasticity (STDP) have been extensively investigated (Abbott, 1990; Montague et al, 1996; Florian, 2007; Porr and Wörgötter, 2007; Frémaux et al, 2010; Pfeiffer et al, 2010), the novelty of this work is their extension by means of saturation and noise, resulting in a simpler and more fundamental connection between local changes and higher-level simulated behavior. The fundamental properties of the new plasticity model are tested in behavioral tasks employing first a single-neuron model, and later extended to multi-neuron networks.…”

Section: Introduction (mentioning)

confidence: 99%

“…In fact, while reinforcement learning by means of modulated spike-timing dependent plasticity (STDP) was demonstrated in Soula et al (2005), Florian (2007) and Frémaux et al (2010), the primary aim of this work is the exploitation of saturated weights and neural noise to achieve a simple bottom-up implementation of operant reward learning. Furthermore, in contrast to Pfeiffer et al (2010), the current algorithm does not require a decay function, input signal preprocessing nor winner-take-all action selection.…”

Section: Introduction (mentioning)

confidence: 99%


From modulated Hebbian plasticity to simple behavior learning through noise and weight saturation

Soltoggio, Stanley 2012

Neural Networks


Citation: SOLTOGGIO, A. and STANLEY, K.O., 2012. From modulated Hebbian plasticity to simple behavior learning through noise and weight saturation. Neural Networks, 34 pp. 28-41.

Additional Information: NOTICE: this is the author's version of a work that was accepted for publication in Neural Networks. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neural Networks, vol. 34 (2012)

Abstract: Synaptic plasticity is a major mechanism for adaptation, learning and memory. Yet current models struggle to link local synaptic changes to the acquisition of behaviors. The aim of this paper is to demonstrate a computational relationship between local Hebbian plasticity and behavior learning by exploiting two traditionally unwanted features: neural noise and synaptic weight saturation. A modulation signal is employed to arbitrate the sign of plasticity: when the modulation is positive, the synaptic weights saturate to express exploitative behavior; when it is negative, the weights converge to average values and neural noise reconfigures the network's functionality. This process is demonstrated through simulating neural dynamics in the autonomous emergence of fearful and aggressive navigating behaviors and in the solution to reward-based problems. The neural model learns, memorizes and modifies different behaviors that lead to positive modulation in a variety of settings. The algorithm establishes a simple relationship between local plasticity and behavior learning by demonstrating the utility of noise and weight saturation. Moreover it provides a new tool to simulate adaptive behavior and contributes to bridging the gap between synaptic changes and behavior in neural computation.
