Proceedings of the 2020 ACM/IEEE International Conference on Human-Robot Interaction
DOI: 10.1145/3319502.3374791

Joint Goal and Strategy Inference across Heterogeneous Demonstrators via Reward Network Distillation

Abstract: Reinforcement learning (RL) has achieved tremendous success as a general framework for learning how to make decisions. However, this success relies on the interactive hand-tuning of a reward function by RL experts. In contrast, inverse reinforcement learning (IRL) seeks to learn a reward function from readily obtained human demonstrations. Yet, IRL suffers from two major limitations: 1) reward ambiguity: there are infinitely many possible reward functions that could explain an expert's demonstration …
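To make the goal/strategy decomposition in the title concrete, below is a minimal PyTorch sketch of a reward network split into one shared task (goal) term and per-demonstrator strategy terms. This is an illustration of the general idea, not the authors' implementation: the class names, the placeholder regression loss, and the strategy-penalty regularizer standing in for reward distillation are all assumptions.

```python
import torch
import torch.nn as nn

class RewardNetwork(nn.Module):
    """Per-demonstrator reward r_i(s) = r_task(s) + r_strategy_i(s):
    a shared task (goal) term plus a demonstrator-specific strategy term."""
    def __init__(self, state_dim: int, n_demonstrators: int, hidden: int = 32):
        super().__init__()
        self.task = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.strategies = nn.ModuleList(
            nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU(),
                          nn.Linear(hidden, 1))
            for _ in range(n_demonstrators))

    def forward(self, states: torch.Tensor, demo_idx: int) -> torch.Tensor:
        return self.task(states) + self.strategies[demo_idx](states)

# Toy training step with placeholder data. In the paper's setting the fit
# term would be an IRL objective (e.g., maximum-entropy) rather than a
# regression to synthetic targets.
net = RewardNetwork(state_dim=4, n_demonstrators=3)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
states = torch.randn(8, 4)                       # placeholder state batch
targets = [torch.randn(8, 1) for _ in range(3)]  # placeholder reward targets

for i in range(3):
    pred = net(states, demo_idx=i)
    fit_loss = ((pred - targets[i]) ** 2).mean()
    # Distillation-style regularizer (an assumption here): penalizing the
    # strategy heads pushes structure shared by all demonstrators into the
    # common task network.
    loss = fit_loss + 0.1 * (net.strategies[i](states) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The design intuition: because every demonstrator's reward passes through the same task network, structure common to all demonstrations (the goal) accumulates there, while idiosyncratic behavior (the strategy) is absorbed by the per-demonstrator heads.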


Cited by 20 publications (15 citation statements)
References 17 publications
“…Recent advances in learning from demonstrations have introduced methods to capture heterogeneity from expert demonstrations. Prior work has showcased the ability to infer discrete modes of operation [28], multiple visual intentions [29], and, most commonly, diverse user preferences [30]-[33] from heterogeneous demonstrations. Existing approaches that embrace heterogeneity in demonstrations focus on inferring and encoding the distinct characteristics.…”
Section: Heterogeneous Learning from Demonstrations (mentioning; confidence: 99%)
“…Unlike these works, we consider the problem of learning multiple rewards from a dataset containing several unlabeled behaviour intents [3]. We highlight that a related but distinct problem is that of Meta-IRL, where the goal is to learn a good meta-reward prior [14,39] or shared base reward [6] that can be used to learn new rewards efficiently.…”
Section: Related Work (mentioning; confidence: 99%)
“…Missions and games in FireCommander do not have unique solutions, and each category of agents must learn both a local team objective (e.g., perform sensing or acting) and a global composite objective (e.g., fight the fire and protect the facilities). Accordingly, the game objectives can be interpreted and interacted with in various ways by a user; thus, FireCommander can be a perfect environment for developing learning-from-heterogeneous-demonstrations algorithms [20,21,22,23,24], particularly for multi-agent coordination.…”
Section: Stochastic and Probabilistic Environment (mentioning; confidence: 99%)
“…• Multi-agent Learning from Heterogeneous Demonstrations (MA-LfHD): FireCommander inherently includes two categories of heterogeneous robots, (1) perception (e.g., sensing) agents and (2) action (e.g., manipulator) agents, and is a multi-objective game. Missions and games in FireCommander do not have unique solutions and thus can be interpreted and interacted with in various ways by a user, making it a perfect test-bed for developing learning-from-heterogeneous-demonstrations algorithms [69,20,70,21,22,23,24,71].…”
Section: Multi-Agent Learning from Demonstrations: Inverse RL (mentioning; confidence: 99%)