2020
DOI: 10.1007/978-3-030-63000-3_1

Adaptive Coordination of Multiple Learning Strategies in Brains and Robots

Abstract: Engineering approaches to machine learning (including robot learning) typically seek the best learning algorithm for a particular problem, or a set of problems. In contrast, the mammalian brain appears as a toolbox of different learning strategies, so that any newly encountered situation can be autonomously learned by an animal with a combination of existing learning strategies. For example, when facing a new navigation problem, a rat can either learn a map of the environment and then plan to find a path to i…

Cited by 4 publications (5 citation statements)
References 62 publications
“…OFC dysfunction has been successfully modeled as an impairment in MB inferences resulting from disruption of the formation of latent states necessary for a detailed cognitive map of task space (Wilson et al, 2014). While this account effectively captures diverse experimental findings relating to the overall function of the OFC, we have recently found that dysfunction specific to the rodent lateral OFC causes a complex pattern of deficits in simple acquisition and extinction learning that is not clearly predicted by these RL theories (Panayi & Killcross, 2014, 2020). Here, we propose modifications to these RL models that can account for these findings.…”
Section: Discussion
confidence: 99%
“…This computational distinction also recently turned out to be useful in understanding how the balance between different learning strategies evolves through development (Decker et al, 2016), or how it varies between individuals in Pavlovian conditioning paradigms, such as sign- versus goal-tracking behaviors (Cinotti, Marchand, et al, 2019; Lesaint et al, 2015). Finally, it is worth noting that this distinction is currently also a hot topic in machine learning and robotics (Khamassi, 2020; Kober et al, 2014; Wang et al, 2019), so that upcoming breakthroughs in these disciplines can later fertilize computational neuroscience models of learning and decision-making.…”
Section: Reinforcement Learning Systems
confidence: 99%
“…It should be noted that this model-free strategy requires a long phase of trial-and-error learning and does not subsequently allow the robot to predict the consequences of a given action. Its advantage, however, is that a decision can then be made quickly, sparing the robot the lengthy computations required if it had to manipulate an internal model: the robot simply selects the movement with the highest value at a given moment, reactively, in response to recognizing the state of the task at the time of each decision (for example, upon recognizing a stimulus) (Khamassi, 2020a).…”
Section: What Is an Internal Model?
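The reactive, model-free scheme described in the statement above can be illustrated with a minimal Python sketch. All names, sizes, and hyperparameters below are assumptions for illustration, not the implementation from the cited work: the agent stores one value per (state, action) pair, updates it by trial and error, and acts by simply comparing the stored values for the recognized task state.

```python
import numpy as np

# Minimal illustrative sketch of a model-free learner (assumed sizes and
# hyperparameters; not the implementation from the cited work).
n_states, n_actions = 10, 4
Q = np.zeros((n_states, n_actions))   # one value per (state, action), learned by trial and error
alpha, gamma = 0.1, 0.95              # assumed learning rate and discount factor

def update(state, action, reward, next_state):
    """One temporal-difference (Q-learning) update after observing an outcome."""
    td_error = reward + gamma * Q[next_state].max() - Q[state, action]
    Q[state, action] += alpha * td_error

def act(state):
    """Reactive decision: no internal model, just pick the highest stored value."""
    return int(np.argmax(Q[state]))
```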
“…A third property is worth noting, because it connects with the deliberation time invoked by Bergson: (3) a model-free system makes a decision very quickly, since it only needs to compare a finite set of values (those of the actions competing in the given situation) to choose the action with the highest (subjective) value. Conversely, a model-based system makes decisions all the more slowly as it must estimate all the possible consequences of a single action several moves ahead (for example, when playing chess) and as there are many possible actions (Khamassi, 2020a).…”
Section: Habit as Behavior Decided Without a Model (and Without Mé...
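The deliberation-time contrast described in that statement can be made concrete with a short sketch: the model-free choice is a single comparison over stored values (as in the previous snippet), whereas a model-based choice must roll an internal model forward several steps, so its cost grows with the number of actions and the planning depth. The interface `model(state, action) -> (next_state, reward)`, the depth of 3, and the discount factor are assumptions for illustration only.

```python
# Minimal illustrative sketch of model-based lookahead (assumed interface:
# model(state, action) -> (next_state, reward); depth and gamma are arbitrary).
def plan_value(model, state, actions, depth, gamma=0.95):
    """Depth-limited lookahead: best estimated return reachable from `state`.

    Cost grows as O(len(actions) ** depth), which is why deliberation is slow
    when there are many possible actions and many moves to anticipate.
    """
    if depth == 0:
        return 0.0
    values = []
    for a in actions:
        next_state, reward = model(state, a)   # simulate the consequence of taking `a`
        values.append(reward + gamma * plan_value(model, next_state, actions, depth - 1, gamma))
    return max(values)

def act_model_based(model, state, actions, depth=3, gamma=0.95):
    """Deliberative decision: pick the action whose simulated future looks best."""
    def lookahead_value(a):
        next_state, reward = model(state, a)
        return reward + gamma * plan_value(model, next_state, actions, depth - 1, gamma)
    return max(actions, key=lookahead_value)
```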
“…Studying the effects of imposing constraints on the memory capacity of episodic reinforcement learning models is something that previous approaches have not considered (Ramani [2019]), with the recent exception of Yalnizyan-Carson and Richards [2021]. The efficient use of a limited memory capacity is especially critical when embedding such algorithms in embodied systems such as robots that face strict computational and storage limitations (Khamassi [2020]), a problem that human cognition also had to deal with (Lisman and Idiart [1995], Jensen and Lisman [2001]).…”
Section: Introduction
confidence: 99%
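To make the storage constraint mentioned in that statement concrete, here is a minimal sketch of a capacity-limited episodic memory; the class name, capacity, and sampling scheme are assumptions for illustration, not the mechanism of any cited model. Once the budget is exceeded, the oldest episodes are simply evicted.

```python
import random
from collections import deque

# Minimal illustrative sketch of a capacity-limited episodic memory
# (names, capacity, and eviction rule are assumptions, not the cited models).
class EpisodicMemory:
    def __init__(self, capacity=1000):
        # Bounded storage: appending beyond `capacity` evicts the oldest episode.
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state):
        self.buffer.append((state, action, reward, next_state))

    def sample(self, batch_size=32):
        """Draw past episodes, e.g. to replay them or to guide value estimates."""
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))
```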