2020
DOI: 10.1609/aaai.v34i03.5631

Explainable Reinforcement Learning through a Causal Lens

Abstract: Prominent theories in cognitive science propose that humans understand and represent the knowledge of the world through causal relationships. In making sense of the world, we build causal models in our mind to encode cause-effect relations of events and use these to explain why new events happen by referring to counterfactuals — things that did not happen. In this paper, we use causal models to derive causal explanations of the behaviour of model-free reinforcement learning agents. We present an approach that …
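The abstract's central device — a causal model over state variables, with actions exerting influence along its edges, used to answer "why" questions — can be illustrated with a minimal sketch. This is a hypothetical toy, not the paper's implementation: the class name `ActionInfluenceGraph`, its methods, and the StarCraft-flavoured variable names are all invented here for illustration.

```python
from collections import deque

class ActionInfluenceGraph:
    """Toy action influence graph (hypothetical API, not the paper's code).
    Each edge is a (cause_var, effect_var) pair tagged with the action
    that exerts the influence."""

    def __init__(self):
        self.edges = []  # list of (action, cause_var, effect_var)

    def add_influence(self, action, cause, effect):
        self.edges.append((action, cause, effect))

    def why(self, action, goal):
        """Answer "why <action>?" by tracing a causal chain from the
        variables the action directly influences to the goal variable.
        Assumes downstream influence edges hold regardless of which
        action introduced them. Returns the chain or None."""
        starts = {e for (a, c, e) in self.edges if a == action}
        adj = {}
        for (_, c, e) in self.edges:
            adj.setdefault(c, []).append(e)
        for s in starts:
            path = self._bfs(s, goal, adj)
            if path:
                return [action] + path
        return None

    def _bfs(self, start, goal, adj):
        # breadth-first search for a path of influence edges
        queue, seen = deque([[start]]), {start}
        while queue:
            path = queue.popleft()
            if path[-1] == goal:
                return path
            for nxt in adj.get(path[-1], []):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None

# Toy StarCraft-like usage: why build barracks, if the goal is
# destroying enemy units?
g = ActionInfluenceGraph()
g.add_influence("build_barracks", "supply", "barracks_count")
g.add_influence("train_marines", "barracks_count", "army_size")
g.add_influence("attack", "army_size", "enemy_destroyed")
chain = g.why("build_barracks", "enemy_destroyed")
print(" -> ".join(chain))
# prints "build_barracks -> barracks_count -> army_size -> enemy_destroyed"
```

The printed chain is the skeleton of a causal explanation: the action is justified by the path of influence it opens toward the goal variable, and a "why not" (counterfactual) answer would contrast it with the chain of the foil action.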

Cited by 201 publications (171 citation statements)
References 10 publications
“…Recent work by Madumal et al. (2020) implemented explanations in an RL agent playing StarCraft II, under the premise that humans would prefer causal models of explanation. The agent was able to answer “counterfactual” levels of explanations, i.e., “why” questions.…”
Section: Discussion
Confidence: 99%
“…These steps were integrated with the code annotations previously described for this system. The work by Madumal et al. (2020) featured a query-based RL agent playing StarCraft II. The agent focused on handling why?…”
Section: Query-based Explanations
Confidence: 99%
“…The approach in [74] also leverages temporal abstractions in the form of intermediate subgoals to illustrate why possible foils fail. Use of abstractions is, of course, not confined to explanations of unsolvability: recent work [50] used abstract models defined over simpler user-defined features to generate explanations for reinforcement learning problems in terms of action influence. The method discussed in [39] also allows for causal link explanations for abstract tasks, such as in HTN planning [63].…”
Section: Inference Reconciliation
Confidence: 99%
“…Then an action influence graph is used to generate causal explanations from the fitted decision tree nodes. The action influence graph is constructed by using the influence of actions on variables (features of the state) [36].…”
Section: D: Causal Decision Trees (CDT)
Confidence: 99%
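The causal-decision-tree idea quoted above — fitting a structural equation for a state variable and reading an explanation off the learned split — can be sketched with a stdlib-only stand-in. Assumptions: a depth-1 regression stump replaces the cited work's decision trees, and `fit_stump`, `explain`, and the toy data are invented here for illustration.

```python
def fit_stump(xs, ys):
    """Fit a one-split regression stump as a stand-in structural
    equation: pick the threshold on xs minimising the squared error
    of the per-side means. Returns (threshold, mean_left, mean_right)."""
    best = None
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        if not left or not right:
            continue
        ml, mr = sum(left) / len(left), sum(right) / len(right)
        err = (sum((y - ml) ** 2 for y in left)
               + sum((y - mr) ** 2 for y in right))
        if best is None or err < best[0]:
            best = (err, t, ml, mr)
    return best[1:]

def explain(feature, threshold, ml, mr, value):
    """Phrase the node of the fitted stump as a causal-style
    explanation for the current feature value."""
    side, pred = ("<=", ml) if value <= threshold else (">", mr)
    return f"{feature} {side} {threshold} -> predicted effect {pred:.1f}"

# Toy data: army_size as a (noisy, jumpy) function of barracks_count.
xs = [0, 0, 1, 1, 2, 2]
ys = [0, 0, 1, 1, 10, 10]
t, ml, mr = fit_stump(xs, ys)
print(explain("barracks_count", t, ml, mr, 2))
# prints "barracks_count > 1 -> predicted effect 10.0"
```

The printed line is the kind of node-level statement an action influence graph could then chain together into a full causal explanation, per the citation above.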