2022
DOI: 10.1162/tacl_a_00465
Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students?

Abstract: While many methods purport to explain predictions by highlighting salient features, what aims these explanations serve and how they ought to be evaluated often go unstated. In this work, we introduce a framework to quantify the value of explanations via the accuracy gains that they confer on a student model trained to simulate a teacher model. Crucially, the explanations are available to the student during training, but are not available at test time. Compared with prior proposals, our approach is less easily …
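The following is a minimal, hypothetical sketch of the student-teacher protocol the abstract describes, not the authors' code. The synthetic data, the variable names, and the feature-reweighting trick (standing in for the paper's attention-regularization and multi-task training methods) are all illustrative assumptions; only the protocol itself — explanations used during student training, withheld at test time, value measured as the simulation-accuracy gain — comes from the abstract.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic task: 20 features, but the teacher only relies on the first 4.
X = rng.normal(size=(2000, 20))
teacher_w = np.zeros(20)
teacher_w[:4] = [2.0, -1.5, 1.0, 0.5]
y_teacher = (X @ teacher_w > 0).astype(int)          # teacher's hard predictions
# Highlight explanation: a per-example mask marking the features the teacher uses.
E = np.tile((teacher_w != 0).astype(float), (len(X), 1))

X_tr, X_te, y_tr, y_te, E_tr, _ = train_test_split(
    X, y_teacher, E, test_size=0.5, random_state=0)

n_train = 100  # low-data regime, where explanations should matter most

# Baseline student: trained to simulate the teacher from inputs alone.
baseline = LogisticRegression().fit(X_tr[:n_train], y_tr[:n_train])

# Explanation-aware student: highlighted features are upweighted at training time.
# This reweighting is a crude stand-in for the paper's attention-regularization /
# multi-task methods, and it is derived from training-set explanations only --
# no explanation is required for an individual test example.
w = 1.0 + 4.0 * E_tr[:n_train].mean(axis=0)
student = LogisticRegression().fit(X_tr[:n_train] * w, y_tr[:n_train])

acc_no_expl = baseline.score(X_te, y_te)
acc_expl = student.score(X_te * w, y_te)
print(f"teacher-simulation accuracy, no explanations:  {acc_no_expl:.3f}")
print(f"teacher-simulation accuracy, with explanations: {acc_expl:.3f}")
print(f"explanation value (accuracy gain):               {acc_expl - acc_no_expl:.3f}")

The printed gain is the explanation's value under this toy setup. In the paper itself, per the citation statements below, the students trained on highlight explanations use multi-task and attention-regularization methods rather than this feature reweighting.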

Cited by 27 publications (29 citation statements). References 18 publications.
“…Comparative studies We found almost no works that empirically compare approaches for learning from explanations across integration methods or explanation types. Pruthi et al (2022) compare MTL and REGULARIZATION methods for learning from HIGHLIGHT explanations. They find that the former method requires more training examples and slightly underperforms regularization.…”
Section: Integrating Explanation Information
confidence: 99%
“…Selecting informative explanations Based on experiments with an artificial dataset, Hase and Bansal (2021) conclude that a model can be improved based on explanations if it can infer relevant latent information better from input instance and explanation combined, than from the input instance alone. This property could be quantified according to the metric suggested by Pruthi et al (2022), who quantify explanation quality as the performance difference between a model trained on input instances and trained with additional explanation annotations. Carton et al (2021) find that models can profit from those highlight explanations which lead to accurate model predictions if presented to the model in isolation.…”
Section: Information Content
confidence: 99%
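The quantity described in the quote above — explanation quality as the performance difference between a student trained on input instances alone and one trained with additional explanation annotations — can be written out as follows; the notation ($\Delta$, $S$, $T$) is ours, not the cited papers':

\Delta_{\text{expl}} \;=\; \mathrm{Acc}\big(S_{\text{expl}},\, T\big) \;-\; \mathrm{Acc}\big(S_{\varnothing},\, T\big)

where $T$ is the teacher, $S_{\text{expl}}$ and $S_{\varnothing}$ are students trained with and without the explanations, and $\mathrm{Acc}(S, T)$ is the rate at which the student reproduces the teacher's predictions on held-out examples for which no explanations are provided.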
“…In contrast, recent works have focused on more quantitative criteria: correlation between explainability methods for measuring consistency [Wallace, 2019, Serrano and Smith, 2019], sufficiency and comprehensiveness [DeYoung et al, 2020], and simulability: whether a human or machine consumer of explanations understands the model behavior well enough to predict its output on unseen examples [Doshi-Velez and Kim, 2017]. Simulability, in particular, has a number of desirable properties, such as being intuitively aligned with the goal of communicating the underlying model behavior to humans and being measurable in manual and automated experiments [Treviso and Martins, 2020, Hase and Bansal, 2020, Pruthi et al, 2020]. Figure 1: Illustration of our SMaT framework.…”
Section: Introduction
confidence: 99%
“…For instance, Pruthi et al [2020] proposed a framework for automatic evaluation of simulability that, given a teacher model and explanations of this model's predictions, trains a student model to match the teacher's predictions. The explanations are then evaluated with respect to how well they help a student learn to simulate the teacher (§2).…”
Section: Introduction
confidence: 99%