2021
DOI: 10.48550/arxiv.2111.06956
Preprint

Human irrationality: both bad and good for reward inference

Abstract: Assuming humans are (approximately) rational enables robots to infer reward functions by observing human behavior. But people exhibit a wide array of irrationalities, and our goal with this work is to better understand the effect they can have on reward inference. The challenge with studying this effect is that there are many types of irrationality, with varying degrees of mathematical formalization. We thus operationalize irrationality in the language of MDPs, by altering the Bellman optimality equation, and …
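For context, the standard Bellman optimality equation that the abstract refers to is written below; the altered form beneath it (an extra myopic discount factor β) is only an illustrative guess at the kind of modification meant, not an equation taken from the paper.

```latex
% Standard Bellman optimality equation for an MDP (S, A, T, R, \gamma):
V^*(s) = \max_{a \in A} \Big[ R(s, a) + \gamma \sum_{s'} T(s' \mid s, a)\, V^*(s') \Big]

% Illustrative (assumed) alteration: an extra myopic discount \beta \in [0, 1]
% applied on top of \gamma, one simple way a bias could enter the update:
V^{\beta}(s) = \max_{a \in A} \Big[ R(s, a) + \beta \gamma \sum_{s'} T(s' \mid s, a)\, V^{\beta}(s') \Big]
```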

Cited by 6 publications (6 citation statements)
References 27 publications
“…Prospect theory, despite being highly influential in behavioral economics, has had a fairly muted impact in machine learning, with work concentrated in human-robot interaction (Kwon et al., 2020; Sun et al., 2019; Chan et al., 2021). Learning from sparse binary feedback is a staple of information retrieval and recommender systems (He et al., 2017; Koren et al., 2009), although to our knowledge it has not been used to generate open-ended text.…”
Section: Related Work
confidence: 99%
“…We study several different models of human biases. Following (Chan, Critch, and Dragan 2021), we formalize each bias as a particular modification to the standard Bellman update, resulting in a modified value function which we use to determine the resulting policy and simulate choices from a biased human. We assume that the person is Boltzmann rational under their biased value function.…”
Section: Learning From Simulated Biased Feedback
confidence: 99%
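The recipe in the excerpt above (a bias expressed as a modification to the Bellman update, with choices then sampled from a Boltzmann-rational policy over the biased values) can be sketched in a few lines of tabular code. The sketch below is an editorial illustration rather than code from any of the cited papers: the particular bias (an extra myopic discount), the random MDP, and all parameter values are assumptions chosen only to show the structure of the simulation.

```python
# Sketch (not from the cited papers): a biased value function obtained from a
# modified Bellman update, then Boltzmann-rational choices sampled under it.
import numpy as np

def biased_value_iteration(T, R, gamma, myopia=1.0, n_iters=500):
    """Value iteration with a modified Bellman update.

    T: transition tensor, shape (S, A, S); R: reward matrix, shape (S, A).
    myopia=1.0 recovers the standard update; myopia < 1.0 over-discounts the
    future, one simple example of a bias expressed as a Bellman modification.
    """
    S, A, _ = T.shape
    V = np.zeros(S)
    for _ in range(n_iters):
        Q = R + myopia * gamma * T @ V           # Q[s, a] under the biased update
        V = Q.max(axis=1)
    return Q, V

def boltzmann_policy(Q, beta=2.0):
    """Boltzmann-rational policy: P(a | s) proportional to exp(beta * Q[s, a])."""
    logits = beta * Q
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    S, A = 5, 2
    T = rng.dirichlet(np.ones(S), size=(S, A))   # random transition dynamics
    R = rng.normal(size=(S, A))                  # random rewards
    Q, _ = biased_value_iteration(T, R, gamma=0.95, myopia=0.5)
    policy = boltzmann_policy(Q, beta=2.0)
    # Sample one simulated "human" choice per state under the biased policy.
    choices = [rng.choice(A, p=policy[s]) for s in range(S)]
    print(choices)
```

Other biases would slot in by changing only the backup line inside biased_value_iteration; everything downstream (the Boltzmann policy and the sampled choices) stays the same.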
“…A subset of this literature has looked specifically into the feasibility and utility of learning both agents' reward functions and biases (Evans and Goodman 2015; Evans, Stuhlmueller, and Goodman 2016). Chan, Critch, and Dragan (2021) found that biases can make agents' behaviour more informative of their reward function, and that incorrectly modeling biases can result in poor reward inference. Armstrong and Mindermann (2018), however, show that jointly identifying biases and reward from observations is not always possible.…”
Section: Related Work
confidence: 99%