2017
DOI: 10.1101/129619
Preprint

Optimal structure of metaplasticity for adaptive learning

Abstract: Learning from reward feedback in a changing environment requires a high degree of adaptability, yet the precise estimation of reward information demands slow updates. We show that this tradeoff between adaptability and precision, which is present in standard reinforcement-learning models, can be substantially overcome via reward-dependent metaplasticity (reward-dependent synaptic changes that do not always alter synaptic efficacy). Metaplastic synapses achieve both adaptability and precision by forming two separ…
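The adaptability-precision tradeoff described in the abstract can be illustrated with a minimal delta-rule learner. This is a hedged sketch, not the paper's metaplastic model: the reward probabilities, reversal time, and learning rates below are arbitrary choices for illustration.

```python
import random

def track(learning_rate, seed=0):
    """Delta-rule estimate of a Bernoulli reward probability that reverses mid-run."""
    rng = random.Random(seed)
    v = 0.5                                # initial value estimate
    estimates = []
    for t in range(400):
        p = 0.8 if t < 200 else 0.2        # environment reverses at t = 200
        r = 1.0 if rng.random() < p else 0.0
        v += learning_rate * (r - v)       # standard delta rule
        estimates.append(v)
    return estimates

fast = track(0.5)    # high learning rate: adaptable but noisy
slow = track(0.05)   # low learning rate: precise but slow to adapt

# mean absolute error of the estimate against the true probability
mae = lambda xs, p: sum(abs(x - p) for x in xs) / len(xs)

# Precision before the reversal: the fast learner's estimate is noisier.
print(mae(fast[100:200], 0.8) > mae(slow[100:200], 0.8))
# Adaptability after the reversal: the fast learner re-tracks sooner.
print(mae(fast[200:220], 0.2) < mae(slow[200:220], 0.2))
```

A fixed learning rate can only pick a point on this tradeoff curve; metaplasticity, as the abstract argues, aims to largely escape it.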


Cited by 9 publications (14 citation statements)
References 31 publications
“…The APT sets an important constraint on learning reward values in a dynamic environment where they change over time. One solution to mitigate the APT is to adjust learning over time via metaplasticity 14,15. Nevertheless, even with adjustable learning, the APT still persists and becomes more critical in multi-dimensional environments, since the learner may never receive reward feedback on many unchosen options and feedback on chosen options is limited.…”
Section: Discussion
confidence: 99%
“…This makes feature-based learning faster and more adaptable, without being noisier, than object-based learning. This is important because simply increasing the learning rates in object-based learning can improve adaptability but also adds noise in the estimation of reward values, which we refer to as the adaptability-precision tradeoff 14,15. Therefore, the main advantage of heuristic feature-based learning might be to mitigate the adaptability-precision tradeoff.…”
Section: Introduction
confidence: 99%
“…update slowly after each feedback to be more accurate), which we refer to as the adaptability-precision tradeoff [1]. There are mechanisms to improve this tradeoff [2,3] and one such way is to increase the rate of learning after unexpected events and decrease it when the world is stable.…”
Section: Introduction
confidence: 99%
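The mechanism mentioned in the statement above (raising the learning rate after unexpected events and lowering it when the world is stable) can be sketched with a Pearce-Hall-style associability update. This is a simplified illustration only; the constants `eta` and `alpha0` and the exact update form are assumptions, not parameters from the cited models.

```python
def adaptive_delta_rule(rewards, eta=0.3, alpha0=0.5):
    """Delta rule whose learning rate tracks recent surprise (Pearce-Hall style).

    After a large prediction error the learning rate rises (fast adaptation);
    during a stable stretch it decays toward zero (precise estimates).
    """
    v, alpha = 0.5, alpha0
    values, rates = [], []
    for r in rewards:
        delta = r - v                          # reward prediction error
        v += alpha * delta                     # value update scaled by current rate
        alpha = (1 - eta) * alpha + eta * abs(delta)  # associability tracks |surprise|
        values.append(v)
        rates.append(alpha)
    return values, rates

# A stable run of rewards followed by a sudden reversal:
rewards = [1.0] * 30 + [0.0] * 30
values, rates = adaptive_delta_rule(rewards)

print(rates[29] < 0.1)        # rate has decayed during the stable block
print(rates[31] > rates[29])  # surprise at the reversal raises it again
```

The learning rate itself is learned from prediction errors, so the model updates quickly right after the reversal yet settles into slow, precise updates once the environment is stable again.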
“…Our work builds directly on a rich line of theoretical and experimental work on the relationship between volatility and learning rates (Behrens et al, 2007;de Berker et al, 2016;Browning et al, 2015;Diaconescu et al, 2014;Farashahi et al, 2017;Iglesias et al, 2013;Khorsand and Soltani, 2017). There have been numerous reports of volatility effects on healthy and disordered behavioral and neural responses, often using a two-level manipulation of volatility like that from Figure 2 (Behrens et al, 2007;Brazil et al, 2017;Browning et al, 2015;Cole et al, 2020;Deserno et al, 2020;Diaconescu et al, 2020;Farashahi et al, 2017;Iglesias et al, 2013;Katthagen et al, 2018;Lawson et al, 2017;Paliwal et al, 2019;Piray et al, 2019;Powers et al, 2017;Pulcu and Browning, 2017;Soltani and Izquierdo, 2019).…”
Section: Discussion
confidence: 97%