2008
DOI: 10.1016/j.artint.2007.09.009

Teachable robots: Understanding human teaching behavior to build more effective robot learners

Abstract: This article was published in an Elsevier journal. The attached copy is furnished to the author for non-commercial research and education use, including for instruction at the author's institution, sharing with colleagues and providing to institution administration. Other uses, including reproduction and distribution, or selling or licensing copies, or posting to personal, institutional or third party websites are prohibited. In most cases authors are permitted to post their version of the article (e.g. in Wor…

Cited by 316 publications (256 citation statements)
References: 23 publications
“…The human's feedback is the only source of feedback or evaluation that the agent receives. However, TAMER and other methods for learning from human reward can be useful even when other evaluative information is available, as has been shown previously [21,5,17,11]. The TAMER algorithm described below has additionally been extended to learn in continuous action spaces through an actor-critic algorithm [22] and to provide additional information to the trainer (either action confidence or summaries of past performance), creating changes in the quantity of reward instances given and in learned performance [14]. Motivation and philosophy of TAMER: The TAMER framework is designed around two insights.…”
Section: Background on TAMER (mentioning)
confidence: 99%
“…Accordingly, other algorithms for learning from human reward [4,21,20,16,18,13] do not directly account for delay, do not model human reward explicitly, and are not fully myopic (i.e., they employ discount factors greater than 0).…”
Section: Background on TAMER (mentioning)
confidence: 99%
“…Our HRI studies with an interactive RL agent revealed that people use the reward signal not only to provide feedback on past actions (what is commonly assumed in the design of RL algorithms) but also to guide future action (Thomaz & Breazeal 2008). Further, we discovered a strong bias of positive over negative feedback over the entire duration of the training, even in the beginning when the agent was doing many things wrong (Thomaz & Breazeal 2008). This suggests that people were using the feedback channel to motivate and encourage the robot.…”
Section: (D) Challenge (mentioning)
confidence: 99%
“…In a series of human participant studies where human teachers guide a robot to perform a simple task (learning to operate a control panel with a lever, toggle and button), we have found that humans readily coordinate their teaching behaviour with the robot's gaze behaviour: waiting until the robot re-establishes eye contact before offering their next guidance cue, adaptively re-orienting their guidance cue to be in alignment with the robot's current visual focus, actively trying to re-direct the robot's gaze through deictic cues or offering more guidance if the robot's gaze behaviour conveys uncertainty in what to do next (e.g. looking back and forth among several possible alternatives) (Breazeal & Thomaz 2008a; Thomaz & Breazeal 2008). These findings suggest that people read the robot's gaze as an indicator of its internal state of attention as well as solicitations for help, and intuitively coordinate their teaching acts to support the robot's learning process.…”
Section: Expression In Social Robots (mentioning)
confidence: 99%