2018 IEEE International Conference on Simulation, Modeling, and Programming for Autonomous Robots (SIMPAR) 2018
DOI: 10.1109/simpar.2018.8376271

Learning from outside the viability kernel: Why we should build robots that can fall with grace

Abstract: Despite impressive results using reinforcement learning to solve complex problems from scratch, in robotics this has still been largely limited to model-based learning with very informative reward functions. One of the major challenges is that the reward landscape often has large patches with no gradient, making it difficult to sample gradients effectively. We show here that the robot state-initialization can have a more important effect on the reward landscape than is generally expected. In particular, we sho…


Citations: cited by 1 publication (1 citation statement)
References: 41 publications (49 reference statements)
“…Since all state-action pairs (s, a) ∈ Q_N result in at least a second step, all s ∈ S_N have at least one failure-preventing action available. However, it is possible for a non-failing state-action pair to reach a state from which all solutions eventually reach a failed state, as was examined in [64]. In other words, there can be states from which immediate failure can be avoided, but from which the system will fail within some finite time.…”
Section: Viable Sets
Citation type: mentioning
confidence: 99%
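
The distinction drawn in the citation statement, between states that can avoid immediate failure and states that can avoid failure indefinitely, can be illustrated on a toy example. The following is a minimal sketch, not the method of either paper: it assumes a small, finite, deterministic transition system, and the names `TRANSITIONS`, `FAILED`, `immediately_safe`, and `viability_kernel` are hypothetical, introduced only for illustration.

```python
# Hypothetical toy system: state -> {action: next_state}.
# "F" is the failed state. From "A" the system can stay in "A" forever;
# "B" and "C" lie on a one-way path that ends in failure.
TRANSITIONS = {
    "A": {"stay": "A", "go": "B"},
    "B": {"go": "C"},
    "C": {"go": "F"},
    "F": {},
}
FAILED = {"F"}


def immediately_safe(transitions, failed):
    """States with at least one action that does not fail in one step."""
    return {
        s
        for s, actions in transitions.items()
        if s not in failed and any(nxt not in failed for nxt in actions.values())
    }


def viability_kernel(transitions, failed):
    """States from which failure can be avoided for all time.

    Computed as a greatest fixed point: start from all non-failed states
    and repeatedly drop any state with no action that stays inside the set.
    """
    viable = {s for s in transitions if s not in failed}
    changed = True
    while changed:
        changed = False
        for s in list(viable):
            if not any(nxt in viable for nxt in transitions[s].values()):
                viable.discard(s)
                changed = True
    return viable


safe_now = immediately_safe(TRANSITIONS, FAILED)
kernel = viability_kernel(TRANSITIONS, FAILED)
# States that can avoid failure for one step but not forever.
doomed = safe_now - kernel
print(sorted(safe_now), sorted(kernel), sorted(doomed))
```

In this toy system, state "B" has a non-failing action available, yet every path from it reaches "F" in finite time, so it lies outside the viability kernel.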