2021
DOI: 10.48550/arxiv.2112.09669
Preprint

Explain, Edit, and Understand: Rethinking User Study Design for Evaluating Model Explanations

Abstract: In attempts to "explain" predictions of machine learning models, researchers have proposed hundreds of techniques for attributing predictions to features that are deemed important. While these attributions are often claimed to hold the potential to improve human "understanding" of the models, surprisingly little work explicitly evaluates progress towards this aspiration. In this paper, we conduct a crowdsourcing study, where participants interact with deception detection models that have been trained to distin…

Cited by 6 publications (10 citation statements)
References 20 publications
“…The granularity provided in the scoring function may vary greatly, from a binary measure (important or not important) to a complete saliency map, depending on the tokenization granularity, the method and visualization. Most commonly, the explanation is given as a colorized saliency map over word tokens [e.g., 2,4,5,47,51]. Note that this work is not concerned with a particular feature-attribution method, but rather how feature-attribution explanations generally communicate information to human explainees, and what the explainees comprehend from them.…”

Section: Attribution Methods
Mentioning, confidence: 99%
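
The colorized saliency map mentioned in this statement can be illustrated with a small rendering sketch. The snippet below is a hypothetical, minimal example rather than code from the cited work: it assumes per-token attribution scores have already been computed by some method, and the saliency_html helper plus the example tokens and scores are illustrative only.

# Minimal sketch: shading word tokens by attribution score to produce a
# colorized saliency map. Tokens and scores are illustrative placeholders,
# not outputs of any particular attribution method.
import html


def saliency_html(tokens, scores):
    """Return an HTML string that shades each token by its attribution score."""
    max_abs = max(abs(s) for s in scores) or 1.0  # guard against all-zero scores
    spans = []
    for token, score in zip(tokens, scores):
        alpha = abs(score) / max_abs  # normalize magnitude to [0, 1]
        spans.append(
            f'<span style="background-color: rgba(255, 0, 0, {alpha:.2f})">'
            f"{html.escape(token)}</span>"
        )
    return " ".join(spans)


if __name__ == "__main__":
    tokens = ["the", "room", "was", "absolutely", "perfect"]
    scores = [0.02, 0.10, 0.05, 0.65, 0.80]  # hypothetical attribution scores
    print(saliency_html(tokens, scores))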
“…Feature-attribution explanations aim to convey which parts of the input to a model decision are "important", "responsible" or "influential" to the decision [3,7,36,42,60]. This class of explanation methods is a prevalent mode of describing NLP processes [10,31,36,47], due to two main strengths: (1) it is flexible and convenient, with many different measures developed which communicate some aspect of feature importance; and (2) it is intuitive, with (seemingly, as we discover) straightforward interfaces of relaying this information. Here we cover background on feature-attribution explanations on two fronts in alignment with these strengths: the underlying technologies (Section 2.1) and the information which they communicate to humans (Section 2.2).…”

Section: Feature-attribution Explanations
Mentioning, confidence: 99%
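
To make concrete what it means for parts of the input to be scored as "important" to a decision, here is a minimal sketch of one common feature-attribution scheme, leave-one-out occlusion, which scores each token by the drop in model confidence when that token is removed. The predict_proba callable and the toy model below are hypothetical stand-ins, not the specific methods cited in the statement above.

# Minimal sketch of leave-one-out (occlusion) feature attribution: the
# importance of each token is the drop in predicted probability when the
# token is removed. predict_proba is a hypothetical callable standing in
# for whatever classifier is being explained.
from typing import Callable, List


def occlusion_attributions(
    tokens: List[str],
    predict_proba: Callable[[List[str]], float],
) -> List[float]:
    """Score each token by how much removing it lowers the model's confidence."""
    base = predict_proba(tokens)
    return [
        base - predict_proba(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    # Toy stand-in model: confidence grows with the count of "suspicious" words.
    SUSPICIOUS = {"absolutely", "perfect", "amazing"}

    def toy_model(tokens: List[str]) -> float:
        hits = sum(token in SUSPICIOUS for token in tokens)
        return min(1.0, 0.2 + 0.3 * hits)

    tokens = ["the", "room", "was", "absolutely", "perfect"]
    print(occlusion_attributions(tokens, toy_model))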