2010
DOI: 10.1561/2300000012

Tactile Guidance for Policy Adaptation

Abstract: Demonstration learning is a powerful and practical technique for developing robot behaviors. Even so, development remains a challenge, and possible demonstration limitations, such as correspondence issues between the robot and demonstrator, can degrade policy performance. This work presents an approach for policy improvement through a tactile interface located on the body of the robot. We introduce the Tactile Policy Correction (TPC) algorithm, which employs tactile feedback for the refinement of a demonstrated …

Cited by 17 publications (11 citation statements)
References 22 publications
“…For example, a human teacher might supervise the learning process, by modifying targets learned from demonstration [9] or resolving ambiguities in goal representations [10]. Datasets are iteratively built by providing new demonstrations in areas of low policy prediction confidence [40], [41], by providing explicit corrections on policy predictions to generate new data [40], [12] and by physically touching a robot during execution to provide kinesthetic corrections [11], [42], [13].…”
Section: Robot Learning (mentioning)
confidence: 99%
“…Furthermore, our executions do not depend on time (unlike [11], [42], [13]), as our goal is not to execute a trajectory but rather to respond online to changes in contact with an object.…”
Section: Robot Learning (mentioning)
confidence: 99%
“…The Tactile Policy Correction (TPC) algorithm offers an approach for the adaptation of a demonstrated policy, using tactile feedback from a human teacher [3]. Corrections are provided in order to accomplish two goals (Fig.…”
Section: Algorithm Overview (mentioning)
confidence: 99%
“…For our initial empirical validations of the TPC algorithm [3], the tactile correction interface consisted of Ergonomic Touchpads encircling the wrist of a manipulator arm, with validation on grasp positioning tasks. Comparisons to policies derived from solely teleoperation demonstration confirmed policy reuse to be an effective mechanism for transferring domain knowledge, and policy refinement to be more successful at improving performance.…”
Section: Which First Encodes Demonstrations In a Gaussian Mixture Mod… (mentioning)
confidence: 99%
“…Tactile feedback is used to assist in both policy refinement and the reuse of a demonstrated policy when developing a different policy; effectively using the demonstrated policy as prior knowledge for a new behavior. Empirical validation has included grasp positioning on the iCub humanoid [4], as well as grasp adaptation in response to changes in fingertip contact [13].…”
Section: A High-dof Humanoid (mentioning)
confidence: 99%