2016
DOI: 10.3389/frobt.2016.00008
Behavioral Diversity Generation in Autonomous Exploration through Reuse of Past Experience

Abstract: The production of behavioral diversity - producing a diversity of effects - is an essential strategy for robots exploring the world when facing situations where interaction possibilities are unknown or non-obvious. It allows the robot to discover new aspects of the environment that cannot be inferred or deduced from available knowledge. However, creating behavioral diversity in the situations where it is most crucial - new and unknown ones - is far from trivial. In particular, in large and redundant sensorimotor spaces, only sma…

Cited by 20 publications (9 citation statements)
References 46 publications
“…The concepts introduced with BR-Evolution have also later been employed in the Novelty-based Evolutionary Babbling (Nov-EB) [27] that allows a robot to autonomously discover the possible interactions with objects in its environment. This work draws a first link between the QD-algorithms and the domain of developmental robotics, which is also studied in several other works (see [28] for overview).…”
Section: Gathering and Improving These Solutions Into Collectionsmentioning
confidence: 99%
“…Indeed, in many contexts, learning a single pre-defined skill can be difficult, as it amounts to searching for (the parameters of) a solution with very rare feedback until one is very close to the solution, or with deceptive feedback due to the phenomenon of local minima. A strategy to address these issues is to direct exploration with intrinsic rewards, leading the system to explore a diversity of skills and contingencies. This often results in the discovery of new sub-spaces/areas in the problem space, or in mutual skill improvement when exploring one goal/skill provides data that can be used to improve other goals/skills, such as in goal babbling (Baranes and Oudeyer, 2013; Benureau and Oudeyer, 2016) or off-policy reinforcement learning (see the Horde architecture, Sutton et al., 2011). For example, Lehman and Stanley (2011) showed that searching for pure novelty in the behavioural space enabled a robot to find a reward in a maze more efficiently than if it had been searching for behavioural parameters that directly optimized the reward.…”
Section: Intrinsically Motivated Exploration Scaffolds Efficient Multmentioning
confidence: 99%
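The novelty-search idea cited above (Lehman and Stanley, 2011) can be sketched minimally: a behaviour's novelty is its mean distance to its k nearest neighbours in an archive of previously seen behaviours, and behaviours are added to the archive when sufficiently novel. The function names, descriptor dimensionality, and threshold below are illustrative assumptions, not taken from the cited work.

```python
import numpy as np

def novelty(behavior, archive, k=3):
    """Mean Euclidean distance to the k nearest neighbours in the archive."""
    if not archive:
        return float("inf")  # the first behaviour is maximally novel
    dists = sorted(np.linalg.norm(behavior - b) for b in archive)
    return float(np.mean(dists[:k]))

# Toy loop: archive behaviours whose novelty exceeds a threshold.
rng = np.random.default_rng(0)
archive = []
for _ in range(50):
    b = rng.uniform(-1, 1, size=2)  # a 2-D behaviour descriptor
    if novelty(b, archive) > 0.3:
        archive.append(b)
```

Search then proceeds by selecting or mutating the behaviours with the highest novelty score, with no reference to the task reward at all.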
“…At execution time, for a given goal τ , a loss function is defined over the parameterization space through L(θ) = C(τ, D(θ, c)). A black-box optimization algorithm, such as L-BFGS, is then used to optimize this function and find the optimal set of parameters θ (see [3,32,33] for examples of such meta-policy implementations in the IMGEP framework).…”
Section: Meta-policy Mechanismmentioning
confidence: 99%
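The meta-policy mechanism described above - minimizing L(θ) = C(τ, D(θ, c)) with an optimizer such as L-BFGS - can be sketched with SciPy's L-BFGS-B routine. The forward model D, cost C, and goal τ below are toy stand-ins for illustration, not the implementation used in the cited works.

```python
import numpy as np
from scipy.optimize import minimize

def D(theta, context):
    """Toy forward model: maps policy parameters (and context) to an outcome."""
    return np.tanh(theta) + context

def C(tau, outcome):
    """Cost between the goal tau and the achieved outcome (squared distance)."""
    return float(np.sum((tau - outcome) ** 2))

def meta_policy(tau, context, theta0):
    """Find theta minimizing L(theta) = C(tau, D(theta, context)) via L-BFGS-B."""
    loss = lambda theta: C(tau, D(theta, context))
    result = minimize(loss, theta0, method="L-BFGS-B")
    return result.x

theta = meta_policy(tau=np.array([0.5]), context=np.array([0.0]),
                    theta0=np.array([0.0]))
```

Because the loss is defined only through evaluations of D, the same scheme works whether D is an analytic model, a learned regressor, or the robot itself.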