2019 International Conference on Robotics and Automation (ICRA)
DOI: 10.1109/icra.2019.8793650
Adaptive Variance for Changing Sparse-Reward Environments

Abstract: Robots that are trained to perform a task in a fixed environment often fail when facing unexpected changes to the environment due to a lack of exploration. We propose a principled way to adapt the policy for better exploration in changing sparse-reward environments. Unlike previous works which explicitly model environmental changes, we analyze the relationship between the value function and the optimal exploration for a Gaussian-parameterized policy and show that our theory leads to an effective strategy for a…
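The abstract is truncated, so the paper's exact adaptation rule is not visible here. As an illustration only of the general idea — widening the exploration variance of a Gaussian policy when the value estimate drops (e.g., after an environment change) — one could sketch something like the following. The linear mapping, the function names, and all parameter values are assumptions for this sketch, not the authors' method:

```python
import numpy as np

def adaptive_std(value_estimate, sigma_min=0.05, sigma_max=1.0, v_ref=1.0):
    """Map a state-value estimate to an exploration std.

    A low value estimate (e.g., after an unexpected environment change
    makes the old policy fail) yields a wide std; a high value estimate
    yields near-greedy behavior. Linear interpolation is an assumption.
    """
    frac = np.clip(value_estimate / v_ref, 0.0, 1.0)
    return sigma_max - frac * (sigma_max - sigma_min)

def sample_action(mean_action, value_estimate, rng):
    """Sample from a Gaussian policy N(mean, sigma^2 I) with adaptive sigma."""
    sigma = adaptive_std(value_estimate)
    return rng.normal(mean_action, sigma)

rng = np.random.default_rng(0)
mean_action = np.zeros(2)
a_confident = sample_action(mean_action, value_estimate=1.0, rng=rng)
a_exploring = sample_action(mean_action, value_estimate=0.0, rng=rng)
```

Here a drop in the value function directly inflates exploration noise, which is one plausible reading of coupling the value function to the policy variance; the paper's analysis would determine the precise coupling.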

Cited by 4 publications (12 citation statements)
References 28 publications
“…Robots are now being utilized in outer space missions, medical surgery, meal delivery in hospitals [49], and so on. However, robots often need to adapt to non-stationary operating conditions; for example, a ground robot/rover must adapt its walking gait to changing terrain conditions [44] or surface friction coefficients [41].…”
Section: Robotics
confidence: 99%
“…Robotic environments characterized by changing conditions and sparse rewards are particularly hard to learn because the reinforcement to the RL agent is often a small value obtained only at the end of the task. Reference [41] focuses on learning with robotic arms, where object manipulation is characterized by sparse-reward environments. The robotic arm is tasked with moving or manipulating objects that are placed at fixed positions on a table.…”
Section: Robotics
confidence: 99%