2008
DOI: 10.1371/journal.pcbi.1000180

A Learning Theory for Reward-Modulated Spike-Timing-Dependent Plasticity with Application to Biofeedback

Abstract: Reward-modulated spike-timing-dependent plasticity (STDP) has recently emerged as a candidate for a learning rule that could explain how behaviorally relevant adaptive changes in complex networks of spiking neurons could be achieved in a self-organizing manner through local synaptic plasticity. However, the capabilities and limitations of this learning rule could so far only be tested through computer simulations. This article provides tools for an analytic treatment of reward-modulated STDP, which allows us t…

Cited by 234 publications (235 citation statements)
References 51 publications
“…The abstract idea of this concept is that a complex network of calculating identities (e.g., neurons) is so diverse that each task is solved somewhere within the network (Maass et al 2002; Buonomano and Maass 2009; Maass 2010). However, one problem with this approach is the capacity, which depends sublinearly on the number of neurons (Ganguli et al 2008); another problem is the read-out of the task-specific information from the network (Maass et al 2007; Legenstein et al 2008). …”
Section: Physiological Mechanism (mentioning)
confidence: 99%
“…Thus DA does not further potentiate synaptic plasticity as modeled, but prevents depotentiation leading to a similar overall result of having specific potentiated synapses. The applicability of the reward-modulated model was further tested with computer simulations using networks of leaky integrate-and-fire (LIF) neurons by Legenstein et al (2008). The simulations showed the ability of this type of rule to predict spike times (rather than stimulus delivery times).…”
Section: Models Of the Effects Of Dopamine Release (mentioning)
confidence: 99%
“…When the LTD-part in the STDP window is suppressed and the remaining R-STDP is bias-corrected, the learning speed for standard association tasks comes close to the one for gradient-based spike reinforcement (Frémaux et al, 2010). An elegant solution to solve the reward-bias problem is to assume that the internal reward signal is shaped by a temporal kernel that sums up to zero across time, $\int R(t)\,dt = 0$, and hence a positive internal reward signal must be followed or preceded by a negative one (Legenstein et al, 2008). What appears as a computational trick is reminiscent to the observed relieve from pain in fruit flies (Tanimoto et al, 2004), or the reward baseline adaptation in rodents (Schultz et al, 1997).…”
Section: Detailed Description (mentioning)
confidence: 99%
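
The zero-integral reward kernel mentioned in the last citation statement can be illustrated with a small sketch. The biphasic shape used here (a fast positive exponential minus a scaled slow exponential), the time constants, and the function names `zero_sum_kernel` and `filtered_reward` are assumptions chosen for illustration; they are not taken from Legenstein et al. (2008) or from the citing paper. The sketch only shows the stated property: when the kernel sums to zero, a single raw reward event yields an internal reward signal whose positive phase is balanced by a negative one, so its total over time is zero.

```python
import math


def zero_sum_kernel(n_steps=200, dt=1.0, tau_fast=10.0, tau_slow=40.0):
    """Hypothetical biphasic kernel whose discrete sum is (numerically) zero.

    A fast positive lobe is followed by a slower negative lobe; the negative
    lobe is scaled so both lobes carry the same total weight.
    """
    fast = [math.exp(-i * dt / tau_fast) for i in range(n_steps)]
    slow = [math.exp(-i * dt / tau_slow) for i in range(n_steps)]
    scale = sum(fast) / sum(slow)
    return [f - scale * s for f, s in zip(fast, slow)]


def filtered_reward(events, kernel):
    """Causal convolution of raw reward events with the kernel."""
    out = [0.0] * len(events)
    for t, r in enumerate(events):
        if r == 0.0:
            continue
        for k, w in enumerate(kernel):
            if t + k < len(out):
                out[t + k] += r * w
    return out


kernel = zero_sum_kernel()
kernel_integral = sum(kernel)

# One raw reward event at step 100 in a 1000-step trial (assumed values).
events = [0.0] * 1000
events[100] = 1.0
internal = filtered_reward(events, kernel)
total_internal_reward = sum(internal)

print(f"kernel integral:       {kernel_integral:.2e}")
print(f"total internal reward: {total_internal_reward:.2e}")
print(f"peak / trough:         {max(internal):.3f} / {min(internal):.3f}")
```

Because the total internal reward is zero, reward that is uncorrelated with pre- and postsynaptic activity contributes no net drift to the weights under this kind of rule, which is the bias the citing statement refers to.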