Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), 2021
DOI: 10.18653/v1/2021.acl-long.40

Improving the Faithfulness of Attention-based Explanations with Task-specific Information for Text Classification

Abstract: Neural network architectures in natural language processing often use attention mechanisms to produce probability distributions over input token representations. Attention has empirically been demonstrated to improve performance in various tasks, while its weights have been extensively used as explanations for model predictions. Recent studies (Jain and Wallace, 2019; Serrano and Smith, 2019; Wiegreffe and Pinter, 2019) have shown that attention cannot generally be considered a faithful explanation (Jacovi and Goldberg, 2020)…
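As a rough illustration of the setup the abstract describes, here is a minimal sketch (not the paper's method) of attention pooling over token representations for text classification, where the softmax weights form the probability distribution that is often read as an explanation; all module names and dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

class AttentionClassifier(nn.Module):
    """Toy attention-pooling classifier; purely illustrative."""
    def __init__(self, hidden_dim: int, num_classes: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_dim, 1)        # one scalar score per token
        self.classifier = nn.Linear(hidden_dim, num_classes)

    def forward(self, token_reprs: torch.Tensor):
        # token_reprs: (batch, seq_len, hidden_dim), e.g. encoder outputs
        scores = self.scorer(token_reprs).squeeze(-1)         # (batch, seq_len)
        alphas = torch.softmax(scores, dim=-1)                # probability distribution over tokens
        pooled = (alphas.unsqueeze(-1) * token_reprs).sum(1)  # attention-weighted sum
        return self.classifier(pooled), alphas                # alphas are what gets read as "explanations"

# Usage with random inputs:
logits, attn = AttentionClassifier(128, 2)(torch.randn(4, 20, 128))
```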

Cited by 20 publications (12 citation statements) · References 38 publications
“…As in the Weibo dataset, we also remove all users who both spread and debunk misinformation, since we focus on the binary setting of Giachanou et al. [13] and found that fewer than 10% of all users fall into this category (i.e., both spread and debunk misinformation) in the two datasets. This process yielded 15,696 posters and 17,293 active citizens respectively, approximately 100 and 15 times larger than the datasets used in prior work [13,47].…”
Section: Twitter Data
confidence: 99%
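A hedged sketch of the filtering step this statement describes; the dataframe schema ('user_id', 'label') and the label values are assumptions, not the authors' actual data format:

```python
import pandas as pd

# Toy per-post data; in the cited work each user's posts are labeled as
# spreading or debunking misinformation.
posts = pd.DataFrame({
    "user_id": [1, 1, 2, 3, 3],
    "label":   ["spread", "debunk", "spread", "debunk", "debunk"],
})

# Users appearing with both labels are removed to keep the setting binary.
labels_per_user = posts.groupby("user_id")["label"].nunique()
mixed_users = labels_per_user[labels_per_user > 1].index
filtered = posts[~posts["user_id"].isin(mixed_users)]
print(filtered)  # user 1 (both spread and debunk) is dropped
```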
“…For both datasets, we analyze the most important input tokens contributing to the model prediction (i.e., HierERNIE LSTM in Weibo and HierLongformer LSTM in Twitter) by employing a widely used gradient-based explainability method, InputXGrad with L2 Norm Aggregation [25], which has been found to provide faithful explanations for transformer-based models in NLP tasks [9,10]. InputXGrad ($x\nabla x$) ranks the input tokens by computing the derivative of the model's predicted class with respect to the input and multiplying it by the input itself, where $\nabla x_i = \frac{\partial \hat{y}}{\partial x_i}$.…”
Section: Model Explainability
confidence: 99%
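A minimal sketch of InputXGrad with L2 norm aggregation as described in the statement above; the `model` interface (taking token embeddings directly and returning logits) is an assumption for illustration:

```python
import torch

def input_x_grad(model, embeddings: torch.Tensor, target_class: int) -> torch.Tensor:
    """Per-token importance via x * (d y_hat / d x), L2-aggregated over dims."""
    # embeddings: (seq_len, hidden_dim) input token embeddings
    embeddings = embeddings.clone().detach().requires_grad_(True)
    logits = model(embeddings.unsqueeze(0))      # (1, num_classes)
    logits[0, target_class].backward()           # populates embeddings.grad with d y_hat / d x
    attribution = embeddings * embeddings.grad   # elementwise x * gradient
    return attribution.norm(p=2, dim=-1)         # L2 norm over hidden dim -> one score per token
```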
“…Pruthi et al (2020) show similar outcomes by manipulating attention to attend to uninformative tokens. Pascual et al (2021) and Brunner et al (2019) argue that this might be due to significant information mixing in higher layers of the model, with recent studies showing improvements in the faithfulness of attention-based explanations by addressing this (Chrysostomou and Aletras, 2021;Tutek and Snajder, 2020). Atanasova et al (2020) evaluate faithfulness of explanations (Jacovi and Goldberg, 2020) by removing important tokens and observing differences in prediction, showing that generally gradient-based approaches for transformers produce more faithful explanations compared to sparse meta-models (Ribeiro et al, 2016).…”
Section: Related Workmentioning
confidence: 99%
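A short sketch of the removal-based faithfulness check this statement attributes to Atanasova et al. (2020): mask the top-k tokens ranked by an explanation and measure the drop in the predicted-class probability. The `predict_proba` callable and the masking scheme are illustrative assumptions:

```python
import torch

@torch.no_grad()
def faithfulness_drop(predict_proba, input_ids, scores, mask_id, k=5):
    """Drop in predicted-class probability after masking the top-k tokens."""
    # input_ids, scores: (seq_len,) token ids and per-token importance scores
    full = predict_proba(input_ids.unsqueeze(0))    # (1, num_classes) probabilities
    pred = full.argmax(dim=-1).item()               # class predicted on the full input
    masked_ids = input_ids.clone()
    masked_ids[scores.topk(k).indices] = mask_id    # remove the k most important tokens
    masked = predict_proba(masked_ids.unsqueeze(0))
    # A larger drop suggests the explanation found tokens the model relied on.
    return (full[0, pred] - masked[0, pred]).item()
```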