2020 International SAUPEC/RobMech/PRASA Conference 2020
DOI: 10.1109/saupec/robmech/prasa48453.2020.9040988

Learning Options from Demonstration using Skill Segmentation

Abstract: We present a method for learning options from segmented demonstration trajectories. The trajectories are first segmented into skills using nonparametric Bayesian clustering and a reward function for each segment is then learned using inverse reinforcement learning. From this, a set of inferred trajectories for the demonstration are generated. Option initiation sets and termination conditions are learned from these trajectories using the one-class support vector machine clustering algorithm. We demonstrate our …

Cited by 2 publications (2 citation statements)
References 14 publications

“…Agents can utilize observations of exemplar behavior in order to learn aspects of the desired model. For instance, agents can learn actions or skills from observed imagery via the inclusion of a learned cost function [24,124] or by clustering observations into skills [22]. Other recent results have focused on analytical/theoretical aspects of the problem relating to reward function search or aspects of the behavior policy providing the demonstrations [10,50,79,100].…”
Section: Applications and Recent Results (mentioning)
confidence: 99%
“…For a skill to be obtained autonomously, the components of the triplet ⟨I, π, β⟩ are needed. Therefore, there are many HRL studies that focus on detecting subgoals and/or defining initiation sets (Xu et al 2018; Simsek and Barreto 2008; Davoodabadi and Beigy 2011a, b; Shoeleh and Asadpour 2017; Kazemitabar et al 2018; Farahani and Mozayani 2019; Machado et al 2017; McGovern and Barto 2001; Stolle and Precup 2002; Şimşek and Barto 2004a; Ghafoorian et al 2013; Daniel et al 2016; Cockcroft et al 2020). Graph theoretic approaches are widely used for subgoal detection (Xu et al 2018; Simsek and Barreto 2008; Davoodabadi and Beigy 2011a, b; Shoeleh and Asadpour 2017; Kazemitabar et al 2018; Farahani and Mozayani 2019; Machado et al 2017).…”
Section: Related Work (mentioning)
confidence: 99%
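
The abstract and the last citation statement both refer to the components of an option: an initiation set, a policy, and a termination condition, commonly written as the triplet ⟨I, π, β⟩ in the options framework. As a rough illustration only, the Python sketch below shows one way such a triplet could be represented and executed. The corridor environment and the names `Option`, `run_option`, `step`, and `go_to_door` are hypothetical and are not taken from the paper, which learns these components from demonstrations rather than hand-coding them.

```python
import random
from dataclasses import dataclass
from typing import Callable, Set

State = int   # hypothetical: a cell index in a 1-D corridor, 0..5
Action = int  # hypothetical: +1 moves right, -1 moves left


@dataclass
class Option:
    """An option as a triplet <I, pi, beta> (illustrative, hand-coded)."""
    initiation_set: Set[State]              # I: states where the option may start
    policy: Callable[[State], Action]       # pi: action chosen in each state
    termination: Callable[[State], float]   # beta: probability of stopping in a state

    def can_start(self, state: State) -> bool:
        return state in self.initiation_set


def step(state: State, action: Action) -> State:
    """Toy corridor dynamics: move and clip to the cells 0..5."""
    return max(0, min(5, state + action))


def run_option(option: Option, state: State, rng: random.Random,
               max_steps: int = 100) -> State:
    """Follow the option's policy until beta says stop or max_steps is reached."""
    if not option.can_start(state):
        raise ValueError(f"state {state} is outside the initiation set")
    for _ in range(max_steps):
        state = step(state, option.policy(state))
        if rng.random() < option.termination(state):
            break
    return state


# Hand-coded example: "walk right until the door at cell 5".
go_to_door = Option(
    initiation_set={0, 1, 2, 3, 4},
    policy=lambda s: +1,
    termination=lambda s: 1.0 if s == 5 else 0.0,
)

final_state = run_option(go_to_door, 2, random.Random(0))
```

In the setting the abstract describes, the hand-written `initiation_set` and `termination` would instead be classifiers fitted to states drawn from the inferred trajectories (the abstract names a one-class support vector machine for this), and the policy would come from the reward function learned for each segment.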