2021
DOI: 10.48550/arxiv.2103.04289
Preprint

Learning Human Rewards by Inferring Their Latent Intelligence Levels in Multi-Agent Games: A Theory-of-Mind Approach with Application to Driving Data

Abstract: The reward function, as an incentive representation that recognizes humans' agency and rationalizes their actions, is particularly appealing for modeling human behavior in human-robot interaction. Inverse Reinforcement Learning (IRL) is an effective way to retrieve reward functions from demonstrations. However, applying it to multi-agent settings has always been challenging, since the mutual influence between agents must be appropriately modeled. To tackle this challenge, previous work either exploits equilib…
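
For context on the IRL framing in the abstract, here is a minimal sketch of the standard single-agent maximum-entropy IRL objective that such methods build on; the notation is generic background and not taken from this paper. Demonstrated trajectories \xi \in \mathcal{D} are modeled as exponentially more likely the higher their parameterized reward R_\theta, and \theta is fit by maximum likelihood:

P(\xi \mid \theta) = \frac{\exp\big(R_\theta(\xi)\big)}{Z(\theta)}, \qquad Z(\theta) = \sum_{\xi'} \exp\big(R_\theta(\xi')\big), \qquad \hat{\theta} = \arg\max_{\theta} \sum_{\xi \in \mathcal{D}} \log P(\xi \mid \theta).

In a multi-agent game, each agent's trajectory distribution (and hence Z(\theta)) depends on the other agents' policies, which is the mutual influence the abstract says must be modeled appropriately.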

Citations: Cited by 1 publication (2 citation statements)
References: 22 publications
“…While Max-Ent MIRL is useful for addressing deliberate randomization in demonstrations, other methods may be able to capture specific biases and heuristics in more detail. [86] find that reasoning about an agent's intelligence level using Theory of Mind (ToM) can capture bounded intelligence and enhance the flexibility of MIRL methods in multi-agent settings. ToM is an aspect of human social cognition, referring to our ability to explain and predict others' behaviour by attributing it to independent mental states, including beliefs and desires, and is supported by neuroscientific evidence [36].…”
Section: MaxEnt MIRL for Handling Biases and Heuristics
confidence: 99%
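
As a concrete illustration of the "intelligence level" idea in the quoted statement, the following is a minimal level-k (iterated best-response) sketch in Python; the single-shot matrix-game setting, the payoff matrices, and the uniform level-0 model are illustrative assumptions, not the method of [86].

import numpy as np

def best_response(own_reward, belief_over_other):
    # Expected reward of each own action against a belief over the other
    # agent's actions (rows: own action, cols: other agent's action).
    return int(np.argmax(own_reward @ belief_over_other))

def level_k_action(own_reward, other_reward, k, n_actions):
    # Level-1 best-responds to a uniform (non-strategic, "level-0") opponent;
    # level-k best-responds to a point-mass model of a level-(k-1) opponent.
    if k == 1:
        belief = np.full(n_actions, 1.0 / n_actions)
    else:
        other_action = level_k_action(other_reward, own_reward, k - 1, n_actions)
        belief = np.zeros(n_actions)
        belief[other_action] = 1.0
    return best_response(own_reward, belief)

# Toy 3-action matrix game: a level-2 agent models the other agent as level-1.
R_self = np.array([[2., 0., 1.], [1., 3., 0.], [0., 1., 2.]])
R_other = R_self.T
print(level_k_action(R_self, R_other, k=2, n_actions=3))

In the setting the quoted statement describes, the level k itself is treated as a latent variable to be inferred from observed behaviour alongside the reward, which is how bounded intelligence enters the inference.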
“…Both of the methods proposed by [86] and [88] intentionally do not decouple the agents from one another. Decoupling reward-function inference into agent-level subproblems is common when using an NE solution concept, which inherently assumes rational decision-making by individual agents.…”
Section: Handling Specific Biases and Heuristics
confidence: 99%
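
To make the decoupling point concrete, here is a hedged sketch of the per-agent condition that a Nash-equilibrium (NE) solution concept imposes; the notation is standard game-theoretic shorthand, not drawn from [86] or [88]:

\pi_i^\ast \in \arg\max_{\pi_i} \; \mathbb{E}_{\pi_i,\, \pi_{-i}^\ast}\!\Big[\textstyle\sum_t \gamma^t R_i(s_t, a_{i,t}, a_{-i,t})\Big] \quad \text{for every agent } i.

With the other agents' equilibrium policies \pi_{-i}^\ast held fixed, inferring R_i reduces to a single-agent IRL problem for each agent; the quoted statement notes that [86] and [88] deliberately keep the agents coupled because this decoupling presupposes individually rational agents.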