2020
DOI: 10.48550/arxiv.2002.01080
Preprint

Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations

Abstract: As more and more complex AI systems are introduced into our day-to-day lives, it becomes important that everyday users can work and interact with such systems with relative ease. Orchestrating such interactions requires the system to be capable of providing explanations and rationale for its decisions and be able to field queries about alternative decisions. A significant hurdle to allowing for such explanatory dialogue could be the mismatch between the complex representations that the systems use to reason abo…

Cited by 4 publications (6 citation statements)
References 3 publications
“…Our results show, on a Gridworld task, PRIOR is better able to recover rewards that are negative everywhere and decrease further from the goal than the previous state of the art, PEBBLE [1]. Expanding upon our promising results, we found similar performance benefits of PRIOR over PEBBLE on a variant of Montezuma's Revenge Level 1 [4], and our future work includes extensive evaluation of PRIOR on more complex domains and preferences.…”
Section: Discussion (supporting, confidence: 57%)
“…Finally, we compute both priors over an abstract state representation interpretable to the human teacher to further boost the agent's reward recovery performance. Humans use symbolic structures when selecting preferences [4], [5], and using the vocabulary of such a symbolic space improves reward attribution to states. The priors use the given symbolic vocabulary and are incorporated into reward-function learning as soft constraints [8] on the reward-learning objective, as detailed in the next sections.…”
Section: Methods (mentioning, confidence: 99%)
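The Methods excerpt above describes folding priors over a symbolic abstract-state vocabulary into preference-based reward learning as soft constraints on the learning objective. The sketch below is a rough, self-contained illustration of that idea, not code from the cited papers: the tabular abstract states, the Bradley-Terry preference loss, the `prior_penalty` term, and the weight `lam` are all illustrative assumptions.

```python
# Minimal sketch: preference-based reward learning with a symbolic prior
# added as a soft constraint. All names and the tabular setup are
# hypothetical; dependency-free on purpose (numpy only).
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 5                       # abstract (symbolic) states, hypothetical
theta = rng.normal(size=N_STATES)  # one learned reward per abstract state

def traj_return(theta, traj):
    """Sum of learned rewards over a trajectory of abstract-state indices."""
    return theta[traj].sum()

def preference_loss(theta, prefs):
    """Bradley-Terry style loss over (preferred, dispreferred) pairs."""
    loss = 0.0
    for better, worse in prefs:
        diff = traj_return(theta, better) - traj_return(theta, worse)
        loss += np.log1p(np.exp(-diff))  # -log sigmoid(diff)
    return loss / len(prefs)

def prior_penalty(theta):
    """Soft constraint: encourage rewards to be non-positive everywhere
    (one plausible symbolic prior; squared hinge on positive values)."""
    return np.square(np.maximum(theta, 0.0)).sum()

def total_objective(theta, prefs, lam=1.0):
    # Prior enters as a weighted penalty, not a hard constraint.
    return preference_loss(theta, prefs) + lam * prior_penalty(theta)

# Toy preferences: trajectories near state 4 (the "goal") are preferred.
prefs = [(np.array([3, 4, 4]), np.array([0, 1, 0])),
         (np.array([2, 3, 4]), np.array([1, 0, 1]))]

def grad(f, x, eps=1e-5):
    """Finite-difference gradient keeps the sketch dependency-free."""
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

for step in range(500):
    theta -= 0.1 * grad(lambda t: total_objective(t, prefs), theta)

print("learned rewards per abstract state:", np.round(theta, 3))
```

With this setup the penalty nudges the learned rewards toward the "negative everywhere, decreasing away from the goal" shape the Discussion excerpt describes, without hard-coding it: the preference loss still decides the relative ordering of states, while the soft constraint biases the solution toward the symbolic prior.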
“…Such methods have been employed both within the context of sequential decision-making problems (cf. Sreedharan et al. 2020) and single-shot decision-making problems (Kim et al. 2018).…”
Section: Explanation Generation (mentioning, confidence: 99%)
“…The former corresponds to cases where the behavior (implicit update) or explanation (explicit update) is personalized to the human model and preferences. Examples include providing faithful and customized explanations of a black-box classifier that account for fidelity to the original model as well as user interest [30], explanations expressed in the human's vocabulary [50], or choosing explanations based on the human's preference for the type of information [62].…”
Section: Use of M_H^R (mentioning, confidence: 99%)