Proceedings of the 2018 ACM/IEEE International Conference on Human-Robot Interaction
DOI: 10.1145/3171221.3171267
Learning from Physical Human Corrections, One Feature at a Time

Abstract: We focus on learning robot objective functions from human guidance: specifically, from physical corrections provided by the person while the robot is acting. Objective functions are typically parametrized in terms of features, which capture aspects of the task that might be important. When the person intervenes to correct the robot's behavior, the robot should update its understanding of which features matter, how much, and in what way. Unfortunately, real users do not provide optimal corrections that isolate …
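In the paper's framing, "updating its understanding of which features matter, how much, and in what way" amounts to adjusting a weight vector over the task features. The following is a minimal sketch of such an online update; the names `phi` (feature map) and `alpha` (step size) are illustrative assumptions, not identifiers from the paper.

```python
import numpy as np

def update_weights(theta, phi, traj_robot, traj_corrected, alpha=0.1):
    """Shift feature weights toward a physical human correction.

    theta: weight vector with one entry per task feature.
    phi:   feature map, phi(trajectory) -> np.ndarray of feature values.
    The weights move in the direction of the difference between the
    features of the corrected trajectory and of the robot's planned one.
    """
    feature_diff = phi(traj_corrected) - phi(traj_robot)
    return theta + alpha * feature_diff
```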

Cited by 87 publications (103 citation statements)
References 18 publications

“…Replanning optimal trajectories. If we treat physical human interactions as observations about an underlying objective function, then the robot updates its objective after each interaction and replans an optimal trajectory with respect to this new objective [4,5,19]. Jain et al. [19] present one such formulation, where the human physically corrects a robot waypoint offline, and the robot solves for its optimal trajectory at the next trial.…”
Section: Trajectory Updates From Physical Human Interaction
confidence: 99%
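The quoted update-then-replan scheme can be sketched as a simple loop: interpret each interaction as evidence about the objective, update the weights, and replan. A hedged sketch, where `plan_optimal` and `get_correction` are hypothetical placeholders for a trajectory optimizer and an interaction detector:

```python
def interaction_loop(theta, phi, plan_optimal, get_correction, alpha=0.1):
    """Update-then-replan loop sketched from the quoted description.

    plan_optimal(theta)  -> trajectory that is optimal under weights theta.
    get_correction(traj) -> human-corrected trajectory, or None if the
                            person did not intervene.
    All function and parameter names here are illustrative placeholders.
    """
    traj = plan_optimal(theta)
    while True:
        corrected = get_correction(traj)
        if corrected is None:
            return theta, traj
        # Treat the physical interaction as an observation about the
        # underlying objective: shift the weights toward the correction,
        theta = theta + alpha * (phi(corrected) - phi(traj))
        # then replan an optimal trajectory under the updated objective.
        traj = plan_optimal(theta)
```

As the next statement notes, the replanning step is the bottleneck: solving for an optimal trajectory in complex settings can take well over a second.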
“…Jain et al. [19] present one such formulation, where the human physically corrects a robot waypoint offline, and the robot solves for its optimal trajectory at the next trial. Although we have extended this method for real-time implementation [4,5], the approach is practically limited to only a few features and simple tasks, since optimal trajectory replanning for more complex settings cannot be performed in less than a second [21,39]. Accordingly, we treat trajectory replanning as an optimal offline solution and assess how much performance the robot loses using our real-time (i.e., sub-millisecond) approach.…”
Section: Trajectory Updates From Physical Human Interaction
confidence: 99%
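The paper's "one feature at a time" idea can be illustrated by restricting the weight update to a single feature per interaction. The selection rule below (largest absolute feature change) is a deliberate simplification for illustration, not necessarily the paper's exact criterion for inferring which feature the person intended to correct:

```python
import numpy as np

def update_one_feature(theta, phi, traj_robot, traj_corrected, alpha=0.1):
    """Update only the feature the correction appears to target.

    Simplified selection rule: pick the feature whose value changed most
    under the correction and update only that weight, leaving the rest
    unchanged.
    """
    feature_diff = phi(traj_corrected) - phi(traj_robot)
    k = int(np.argmax(np.abs(feature_diff)))  # most-changed feature index
    theta = theta.copy()
    theta[k] += alpha * feature_diff[k]
    return theta
```

An update of this form costs only a vector subtraction and an argmax, which is consistent with the sub-millisecond updates contrasted against full replanning above.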
“…With this in mind, this work avoids demonstrating the task itself and, instead, teaches the robot the involved primitive skills. This task factorisation provides benefits similar to those of (Bajcsy et al. 2018): it allows the user to teach one feature of the task at a time and, if required, to correct each feature individually.…”
Section: Learning For a Dual-arm Manipulator
confidence: 98%
“…the issue of identifying a mapping between the teacher and the learner that allows information to be transferred from one to the other (Dautenhahn and Nehaniv 2002). Moreover, complex motions involve a mixture of human intentions, which are difficult to learn accurately under an all-at-once learning baseline (Bajcsy et al. 2018). On top of that, teaching a dual-arm system can pose a considerable challenge for non-robotics experts (Akgun et al. 2012).…”
Section: Introduction
confidence: 99%
“…5) Incremental Learning from Demonstration (ILfD): ILfD refers to a special case where the demonstrations are provided during the learning process. ILfD has similar needs to LfD, but may require demonstrations that are responsive to problems identified by the user (e.g., [20]) or partial task demonstrations.…”
Section: Learning From Demonstration (LfD)
confidence: 99%