2018
DOI: 10.1007/s10994-018-5730-4

Inverse reinforcement learning from summary data

Abstract: Inverse reinforcement learning (IRL) aims to explain observed strategic behavior by fitting reinforcement learning models to behavioral data. However, traditional IRL methods are only applicable when the observations are in the form of state-action paths. This assumption may not hold in many real-world modeling settings, where only partial or summarized observations are available. In general, we may assume that there is a summarizing function σ, which acts as a filter between us and the true state-action paths…
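To make the summary-data setting concrete, here is a minimal Monte Carlo sketch under stated assumptions: the summarizing function sigma below (which keeps only the last recorded state of a path) and the names simulate_path and summary_likelihood are hypothetical illustrations, not the estimators from the paper, whose abstract is truncated above.

```python
import random

# Hypothetical summarizing function sigma: the observer sees only the
# last recorded state of each path, never the full state-action sequence.
def sigma(path):
    last_state, _last_action = path[-1]
    return last_state

def simulate_path(policy, transition, start_state, horizon, rng):
    """Roll out one state-action path under a stochastic policy."""
    path, state = [], start_state
    for _ in range(horizon):
        action = policy(state, rng)
        path.append((state, action))
        state = transition(state, action, rng)
    return path

def summary_likelihood(observed_summary, policy, transition,
                       start_state, horizon, n_sim, rng):
    """Monte Carlo estimate of P(sigma(path) == observed_summary | policy).

    A candidate reward is scored through the policy it induces: rewards
    whose policies often reproduce the observed summary score highly.
    """
    hits = sum(
        sigma(simulate_path(policy, transition, start_state, horizon, rng))
        == observed_summary
        for _ in range(n_sim)
    )
    return hits / n_sim

# Toy usage: a two-state chain where action 1 moves right with prob. 0.9.
if __name__ == "__main__":
    rng = random.Random(0)
    transition = lambda s, a, r: min(s + a, 1) if r.random() < 0.9 else s
    greedy = lambda s, r: 1  # policy induced by a "move right" reward
    lazy = lambda s, r: 0    # policy induced by a "stay put" reward
    for name, pi in [("greedy", greedy), ("lazy", lazy)]:
        p = summary_likelihood(1, pi, transition, 0, horizon=3,
                               n_sim=1000, rng=rng)
        print(name, p)
```

In a full IRL-from-summary-data loop, one would refit the policy for each candidate reward and maximize this likelihood over reward parameters; the paper's own estimators may differ from this sketch.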

Cited by 12 publications (11 citation statements) · References 29 publications

“…It can also be seen as a variant of Maximum Causal Entropy Inverse Reinforcement Learning (Ziebart et al., 2010): while inverse reinforcement learning (IRL) requires demonstrations, or at least state sequences without actions (Yu et al., 2018), we learn a reward function from a single state, albeit with the simplifying assumption of known dynamics. This can also be seen as an instance of IRL from summary data (Kangasrääsiö and Kaski, 2018).…”
Section: Related Work
confidence: 99%
“…expand this work to the situation where the agent's stochastic policy mapping is also unknown and needs to be estimated. Kangasrääsiö and Kaski (2018) present a model-free IRL method for settings in which only partial expert trajectories are available, and propose three methods for estimating the rewards under this assumption.…”
Section: Model-free IRL
confidence: 99%
“…This framework not only outperforms the state of the art by a large margin but also reveals valuable facts about learning face representations. The literature [9][10][11] proposes reinforcement learning models that fit behavioral data to explain observed strategic behavior. The results show that this method is more effective than similar methods.…”
Section: Related Work
confidence: 99%