Published: 2010
DOI: 10.1177/1059712309359948
Modeling Behavior Cycles as a Value System for Developmental Robots

Abstract: The behavior of natural systems is governed by rhythmic behavior cycles at the biological, cognitive, and social levels. These cycles permit natural organisms to adapt their behavior to their environment for survival, behavioral efficiency, or evolutionary advantage. This article proposes a model of behavior cycles as the basis for motivated reinforcement learning in developmental robots. Motivated reinforcement learning is a machine learning technique that incorporates a value system with a trial-and-error learning…
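The abstract describes motivated reinforcement learning only at a high level. As a rough illustration of the general idea (an internally generated motivation signal combined with external reward inside a standard trial-and-error learning loop), the sketch below adds a simple novelty-style motivation term to tabular Q-learning. The toy environment, the `motivation` helper, and all constants are illustrative assumptions and are not the model proposed in the paper.

```python
# Minimal sketch of motivated reinforcement learning: a tabular Q-learning
# agent whose reward combines an external signal with an internally generated
# motivation term (here a simple novelty bonus based on visit counts).
# The toy ring environment and all constants are illustrative assumptions,
# not the behavior-cycle model proposed in the paper.
import random
from collections import defaultdict

N_STATES, N_ACTIONS = 10, 2
ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

Q = defaultdict(float)      # Q[(state, action)] -> value estimate
visits = defaultdict(int)   # visit counts used by the novelty bonus


def step(state, action):
    """Toy environment: a ring of states with a sparse external reward."""
    next_state = (state + (1 if action == 1 else -1)) % N_STATES
    extrinsic = 1.0 if next_state == N_STATES - 1 else 0.0
    return next_state, extrinsic


def motivation(state):
    """Hypothetical novelty-style motivation: rarely visited states score higher."""
    return 1.0 / (1.0 + visits[state])


def choose_action(state):
    if random.random() < EPSILON:
        return random.randrange(N_ACTIONS)
    return max(range(N_ACTIONS), key=lambda a: Q[(state, a)])


state = 0
for _ in range(5000):
    action = choose_action(state)
    next_state, extrinsic = step(state, action)
    visits[next_state] += 1
    # The agent learns from the combined (extrinsic + motivation) reward.
    reward = extrinsic + motivation(next_state)
    best_next = max(Q[(next_state, a)] for a in range(N_ACTIONS))
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
    state = next_state

print("Learned Q-values:", dict(Q))
```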



Cited by 12 publications (7 citation statements). References 24 publications.
“…3) Competence Merrick [19] proposes the following signal to motivate the acquisition of competence: (4) According to this motivation signal, highly motivating tasks must be repeatable and repetition must cause the agent to learn.…”
Section: Novelty
Citation type: mentioning
Confidence: 99%
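The extracted quote above refers to an equation numbered (4) that was not captured, so it cannot be reproduced here. As a loose illustration of the idea stated in words (a task is highly motivating when it is repeatable and repetition causes the agent to learn), the sketch below computes a generic learning-progress style competence signal from a window of recent prediction errors. The windowing scheme and error measure are assumptions, not Merrick's formulation.

```python
# Illustrative competence-style motivation signal: motivation is high when
# repeating a task reduces the agent's prediction error, i.e. repetition
# causes learning. This is a generic learning-progress measure, NOT the
# equation (4) referenced above, which is missing from the extracted text.
from collections import deque


class CompetenceMotivation:
    def __init__(self, window: int = 10):
        # Recent prediction errors for one task, oldest first.
        self.errors = deque(maxlen=2 * window)
        self.window = window

    def update(self, prediction_error: float) -> float:
        """Record the latest error and return the current motivation value."""
        self.errors.append(prediction_error)
        if len(self.errors) < 2 * self.window:
            return 0.0  # not enough repetitions of the task yet
        old = list(self.errors)[: self.window]
        new = list(self.errors)[self.window:]
        # Positive when errors are shrinking, i.e. repetition causes learning.
        progress = sum(old) / self.window - sum(new) / self.window
        return max(0.0, progress)


# Example: a task whose error decays with practice is highly motivating.
motivation = CompetenceMotivation(window=5)
for error in [1.0 / (1 + k) for k in range(20)]:
    m = motivation.update(error)
print("final motivation:", m)
```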
“…Previous studies have investigated several principles related to the notion of value to devise increasingly flexible strategies of adaptation for mobile robotics: behavioral cycles to interact with the environment (Ahlgren & Halberg, 1990; McFarland & Spier, 1997), new algorithms to incorporate novelty (Huang & Weng, 2002), hierarchical RL architectures for skill learning (Baldassarre, 2002; G. Konidaris, Kuindersma, Barto, & Grupen, 2010), or the use of algorithms to select learning goals autonomously using a value system in a RL context (Merrick, 2010). The operation of each of these approaches is to some extent based on exploiting the interaction with the environment as a guide to structure behavioral responses.…”
Section: Discussion
Citation type: mentioning
Confidence: 99%
“…This type of learning has been extensively used with success in many problems where the expected behavior is not known [2], [3]. These algorithms, however, suffer from the curse of dimensionality [4], where the search space grows exponentially as the number of states and actions increases [5].…”
Section: Introduction
Citation type: mentioning
Confidence: 99%
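To make the quoted point about dimensionality concrete, the short calculation below shows how the size of a tabular value function grows when states are formed from several discretised variables. The bin and action counts are arbitrary examples, not figures from any of the cited works.

```python
# Illustration of the curse of dimensionality for tabular value functions:
# the number of table entries grows exponentially with the number of
# discretised state variables. The specific sizes are arbitrary examples.
bins_per_variable = 10   # each state variable discretised into 10 bins
n_actions = 4

for n_variables in (1, 2, 4, 8):
    n_states = bins_per_variable ** n_variables
    table_entries = n_states * n_actions
    print(f"{n_variables} state variables -> {n_states:>12,} states, "
          f"{table_entries:>12,} Q-table entries")
```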