Proceedings of the 7th ACM IKDD CoDS and 25th COMAD 2020
DOI: 10.1145/3371158.3371168
Deep Reinforcement Learning for Single-Shot Diagnosis and Adaptation in Damaged Robots

Abstract: Robotics has proved to be an indispensable tool in many industrial as well as social applications, such as warehouse automation, manufacturing, disaster robotics, etc. In most of these scenarios, damage to the agent while accomplishing mission-critical tasks can result in failure. To enable robotic adaptation in such situations, the agent needs to adopt policies which are robust to a diverse set of damages and must do so with minimum computational complexity. We thus propose a damage aware control architecture…

Cited by 5 publications (2 citation statements)
References 21 publications
“…Nonetheless, the manual adjustment of the neural oscillators or the SNN is considered a disadvantage, due to being a time-consuming process. Therefore, some research published in the past five years studied the implementation of RL to self-learn how to generate locomotion based on the interactions of the robot with the environment [7,[60][61][62][63][64][65]. The advantage of this approach is not requiring previous knowledge about the robot or its surroundings, since it learns how to generate gaits through a trial-and-error process.…”
Section: Discussion
confidence: 99%
“…The reward function of the algorithm penalized high energy consumption but rewarded high heading velocity values. This method converged to a solution after 1400 episodes, and the robot could successfully adjust its locomotion to different surfaces with different values for the coefficient of friction; • Damage recovery: Verma et al [63] proposed a method based on the proximal policy optimization for damage recovery using a supervised learning NN for the selfdiagnosis of the damages. This algorithm could find a gait policy when the hexapod had one or two limbs injured.…”
confidence: 99%
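The citation statement above describes a reward function that rewards high heading velocity while penalizing energy consumption. A minimal sketch of a reward with that shape is given below; the weights and the energy proxy (sum of squared joint torques) are illustrative assumptions, not values taken from the cited work.

```python
def locomotion_reward(heading_velocity, joint_torques,
                      w_vel=1.0, w_energy=0.005):
    """Illustrative reward: reward forward heading velocity,
    penalize energy consumption.

    heading_velocity: robot speed along the desired heading (m/s)
    joint_torques:    iterable of per-joint torques for this step
    w_vel, w_energy:  assumed trade-off weights (not from the paper)
    """
    # Proxy for energy use: sum of squared joint torques.
    energy = sum(t * t for t in joint_torques)
    return w_vel * heading_velocity - w_energy * energy


# Example: fast gait with moderate torques yields a positive reward.
r = locomotion_reward(0.8, [1.0, -2.0, 0.5])
```

In practice such a reward would be evaluated once per control step inside the RL environment loop, with the weights tuned so that the energy penalty does not dominate the velocity term.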