Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, 2020
DOI: 10.18653/v1/2020.acl-main.491
Evaluating Explainable AI: Which Algorithmic Explanations Help Users Predict Model Behavior?

Abstract: Algorithmic approaches to interpreting machine learning models have proliferated in recent years. We carry out human subject tests that are the first of their kind to isolate the effect of algorithmic explanations on a key aspect of model interpretability, simulatability, while avoiding important confounding experimental factors. A model is simulatable when a person can predict its behavior on new inputs. Through two kinds of simulation tests involving text and tabular data, we evaluate five explanation methods…
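To make the simulatability notion in the abstract concrete, here is a minimal sketch in Python, with hypothetical function and variable names rather than the authors' code or experimental protocol: simulatability is treated as the accuracy with which an observer predicts the model's outputs, and an explanation method's effect is the change in that accuracy between a pre-explanation and a post-explanation phase.

```python
# Minimal sketch (hypothetical names, not the paper's code) of a simulatability test:
# the observer guesses the model's predictions before and after seeing explanations,
# and the explanation method's effect is the change in guessing accuracy.

from typing import Sequence


def simulation_accuracy(
    observer_guesses: Sequence[int],
    model_predictions: Sequence[int],
) -> float:
    """Fraction of inputs on which the observer correctly predicts the model's output."""
    assert len(observer_guesses) == len(model_predictions)
    correct = sum(int(g == m) for g, m in zip(observer_guesses, model_predictions))
    return correct / len(model_predictions)


def explanation_effect(
    pre_phase_guesses: Sequence[int],
    post_phase_guesses: Sequence[int],
    model_predictions: Sequence[int],
) -> float:
    """Change in simulation accuracy after the observer has studied explanations.

    A positive value suggests the explanations improved simulatability;
    a value near zero suggests no measurable effect.
    """
    pre = simulation_accuracy(pre_phase_guesses, model_predictions)
    post = simulation_accuracy(post_phase_guesses, model_predictions)
    return post - pre


if __name__ == "__main__":
    # Toy example: binary model predictions on five held-out inputs.
    model_preds = [1, 0, 1, 1, 0]
    pre_guesses = [1, 1, 0, 1, 0]   # 3/5 correct before explanations
    post_guesses = [1, 0, 1, 1, 0]  # 5/5 correct after explanations
    print(explanation_effect(pre_guesses, post_guesses, model_preds))  # 0.4
```

Measuring the change rather than the raw post-explanation accuracy controls for how predictable the model already is without any explanation.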

Cited by 149 publications (132 citation statements). References 19 publications.
“…The first steps have been made in this direction in other research fields. 72,73 Nonetheless, further development of XAI applications in chemistry would greatly benefit from meaningful benchmarking, which will require close collaboration between medicinal chemists and computer scientists.…”
Section: Results. Citation type: mentioning (confidence: 99%).
“…Model Interpretability: PROVER follows a significant body of previous work on developing interpretable neural models for NLP tasks to foster explainability. Several approaches have focused on formalizing the notion of interpretability (Rudin, 2019; Doshi-Velez and Kim, 2017; Hase and Bansal, 2020), tweaking features for local model interpretability (Ribeiro et al., 2016, 2018) and exploring interpretability in latent spaces (Joshi et al., 2018; Samangouei et al., 2018). Our work can be seen as generating explanations in the form of proofs for an NLP task.…”
Section: Related Work. Citation type: mentioning (confidence: 99%).
“…LAS scores combine two key mechanisms: they measure simulatability, which reflects how well an observer can use model explanations to predict the model's output, while controlling for explanation leakage, which occurs when explanations directly leak the output. This metric is inspired by prior work on model interpretability (Doshi-Velez and Kim, 2017; Hase and Bansal, 2020), but to date no simulatability analysis has been carried out for NL explanations. We automate our evaluation by using a pretrained language model as the observer, serving as a proxy for a human.…”
Section: (Details in Section 3). Citation type: mentioning (confidence: 99%).
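The leakage-controlled simulatability mechanism described in the statement above can be illustrated with a rough sketch. This assumes, as a simplification rather than the cited paper's exact formula, that the score macro-averages the observer's accuracy gain from seeing explanations over the leaking and non-leaking subsets, so that label-leaking explanations cannot inflate the result; all names below are hypothetical.

```python
# Rough sketch (hypothetical names, not the cited paper's code) of leakage-controlled
# simulatability: an automated observer predicts the model's label with and without
# the explanation, and accuracy gains are averaged separately over explanations that
# do and do not leak the label before combining, so leakage alone cannot raise the score.

from statistics import mean
from typing import Sequence


def leakage_adjusted_score(
    correct_with_expl: Sequence[bool],     # observer correct given input + explanation
    correct_without_expl: Sequence[bool],  # observer correct given input only
    leaks_label: Sequence[bool],           # explanation alone reveals the label
) -> float:
    """Macro-average, over leaking and non-leaking subsets, of the accuracy gain
    the observer obtains from seeing the explanation."""
    gains = [
        float(w) - float(wo)
        for w, wo in zip(correct_with_expl, correct_without_expl)
    ]
    leaking = [g for g, leak in zip(gains, leaks_label) if leak]
    non_leaking = [g for g, leak in zip(gains, leaks_label) if not leak]
    subset_means = [mean(subset) for subset in (leaking, non_leaking) if subset]
    return mean(subset_means)


if __name__ == "__main__":
    # Toy example over six explained predictions.
    with_e = [True, True, True, False, True, True]
    without_e = [True, False, True, False, False, True]
    leaks = [True, True, False, False, False, True]
    print(round(leakage_adjusted_score(with_e, without_e, leaks), 3))  # 0.333
```

Grouping by leakage is what separates genuinely informative explanations from ones that simply restate the model's answer.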