2023
DOI: 10.48550/arxiv.2302.01928
Preprint
Aligning Robot and Human Representations

Abstract: To act in the world, robots rely on a representation of salient task aspects: for example, to carry a cup of coffee, a robot must consider movement efficiency and cup orientation in its behaviour. However, if we want robots to act for and with people, their representations must not be just functional but also reflective of what humans care about, i.e. their representations must be aligned with humans'. In this survey, we pose that current reward and imitation learning approaches suffer from representation misalignment…

Cited by 2 publications (3 citation statements)
References 109 publications
“…There are also works that actively learn from human teachers, where the emphasis is on generating actions or queries that are maximally informative for the human to label [6,12]. Unfortunately, these approaches all are limited by the fact that the feedback asked of the human is overfit to specific failures or desired data points, and rarely scale well relative to human time or effort [7].…”
Section: Related Work
confidence: 99%
“…Recent research has pointed out the potential misalignment issue between human values and robotic objectives. Studies to address this issue include bidirectional human-robot communication in group settings [83], evaluation of task accomplishment [6], and disentangled representation learning (DRL) [77]. Particularly, Reinforcement Learning from Human Preference (RLHP) [1,14,49,50,84] emerges as a new trend to offer a flexible and adaptable way to fine-tune an agent's behavior based on human preference.…”
Section: Related Work (2.1 Human Preference Learning in Robot Manipulation)
confidence: 99%
“…However, these delicate reward functions may not accurately reflect humans' true values [18] due to generalization errors [57], task misspecifications [14], etc. The human-robot value alignment can not only improve robot performances according to humans' preference but also avoid undesired robot behavior and even safety issues [6,68,83]. Learning a reward model from human preferences hence emerges [18], which leverages the computationally efficient and user-friendly pairwise comparison to collect human preferences [8,14,38,39].…”
Section: Introduction
confidence: 99%
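
For readers unfamiliar with the pairwise-comparison setup referenced in the last citation statement, below is a minimal illustrative sketch of learning a reward model from human preferences with the Bradley-Terry likelihood, P(a preferred over b) = sigmoid(R(a) - R(b)). This is not code from the surveyed paper or the citing works; the network architecture, trajectory feature representation, and training loop are assumptions made purely for illustration.

# Minimal sketch (illustrative assumptions only): reward learning from
# pairwise human preferences via the Bradley-Terry model, in PyTorch.
import torch
import torch.nn as nn

class RewardNet(nn.Module):
    """Maps a trajectory feature vector to a scalar reward."""
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.net(feats).squeeze(-1)

def preference_loss(reward_net, feats_a, feats_b, prefers_a):
    """Bradley-Terry negative log-likelihood for pairwise comparisons:
    P(a preferred over b) = sigmoid(R(a) - R(b))."""
    logits = reward_net(feats_a) - reward_net(feats_b)
    return nn.functional.binary_cross_entropy_with_logits(
        logits, prefers_a.float()
    )

# Toy usage: random trajectory features and random preference labels
# stand in for human comparisons of robot behaviours.
feat_dim = 8
reward_net = RewardNet(feat_dim)
opt = torch.optim.Adam(reward_net.parameters(), lr=1e-3)
feats_a, feats_b = torch.randn(32, feat_dim), torch.randn(32, feat_dim)
prefers_a = torch.randint(0, 2, (32,))
for _ in range(100):
    opt.zero_grad()
    loss = preference_loss(reward_net, feats_a, feats_b, prefers_a)
    loss.backward()
    opt.step()

The appeal of this protocol, as the citing work notes, is that each human label is a single comparison between two behaviours, which is cheap to give and easy to aggregate into a reward model.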