model decisions by specifying the parts of the input which are most salient in the model's decision process [6,18,48]. In natural language processing (NLP), this refers to which words, phrases or sentences in the input contributed most to the model prediction [10,36]. While much research exists on developing and verifying such explanations [1,4,32,35,50,51], less is known about the information that human explainees actually understand from them [2,12,19,39]. In the explainable NLP literature, it is generally (implicitly) assumed that the explainee interprets the information "correctly", as it is communicated [4,17,20]: e.g., when one word is explained to be influential in the model's decision process, or more influential than another word, it is assumed that the explainee understands this relationship [28].

We question this assumption: research in the social sciences describes ways in which the human explainee may be biased, via some cognitive habit, in their interpretation of processes [15,37,39,52], and additional research shows that this effect manifests in practice in AI settings [11,14,22,25,40]. This means, for example, that the explainee may underestimate the influence of a punctuation token even if the explanation reports that this token is highly significant (Figure 1), because the explainee attempts to understand how the model reasons by analogy to their own mind, an instance of anthropomorphic bias [8,29,61] and belief bias [16,22]. We identify three such biases which may influence the explainee's interpretation: (i) anthropomorphic bias and belief bias: influence by the explainee's self-projection onto the model; (ii) visual perception bias: influence by the explainee's visual affordances for comprehending information; (iii) learning effects: observable temporal changes in the explainee's interpretation as a result of interacting with the explanation over multiple instances.

We thus address the following question in this paper: When a human explainee observes feature-attribution explanations, does the information they comprehend differ from what the explanation "objectively" attempts to communicate? If so, how?

We propose a methodology to investigate whether explainees exhibit biases when interpreting feature-attribution explanations in NLP, biases which effectively distort the objective attribution into a subjective interpretation of it (Section 4). We conduct user studies in which we show an input sentence and a feature-attribution explanation (i.e., a saliency map) to explainees, ask them to report their subjective interpretation, and analyze their responses for statistical significance across multiple factors, such as word length, total input length, or dependency relation, using generalized additive mixed models (GAMMs; Section 5).

We find that word length, sentence length, the position of the sentence in the temporal course of the experiment, the saliency rank, capitalization, dependency relation, word position, word frequency, as well as sentiment, can significantly affect user perception. In addition to whether a factor has...