Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics 2019
DOI: 10.18653/v1/p19-1282

Is Attention Interpretable?

Abstract: Attention mechanisms have recently boosted performance on a range of NLP tasks. Because attention layers explicitly weight input components' representations, it is also often assumed that attention can be used to identify information that models found important (e.g., specific contextualized word tokens). We test whether that assumption holds by manipulating attention weights in already-trained text classification models and analyzing the resulting differences in their predictions. While we observe some ways i…
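The intervention described in the abstract can be pictured with a short, hypothetical sketch (not the authors' code): build a toy attention-pooled classifier, zero out the highest attention weight, renormalize the remaining weights, and check whether the prediction changes. The representations `H`, attention weights `alpha`, and classification head `W` below are illustrative stand-ins for components of an already-trained model.

```python
# Minimal sketch of the attention-ablation idea, assuming a toy
# attention-pooled classifier; H, alpha, and W are hypothetical stand-ins
# for components of an already-trained model.
import numpy as np

rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))                     # contextualized token representations (seq_len x dim)
alpha = np.array([0.05, 0.1, 0.6, 0.2, 0.05])   # attention weights produced by the trained model
W = rng.normal(size=(8, 2))                     # toy classification head (dim x num_classes)

def predict(weights):
    """Attention-weighted pooling followed by a linear decision."""
    pooled = weights @ H
    logits = pooled @ W
    return logits.argmax()

original = predict(alpha)

# Erase the highest attention weight, renormalize the rest,
# then check whether the model's decision changes.
ablated = alpha.copy()
ablated[alpha.argmax()] = 0.0
ablated /= ablated.sum()

print("decision flipped after removing the top attention weight:",
      predict(ablated) != original)
```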

Cited by 481 publications (402 citation statements)
References 36 publications
“…Whether attention can be considered a means of explaining neural networks is currently an open debate. Some recent studies [10], [11] suggest that attention cannot be considered a reliable means of explaining or even interpreting neural networks. Nonetheless, other works [6]–[9] advocate the use of attention weights as an analytic tool.…”
Section: A. Attention for Deep Network Investigation
confidence: 99%
“…It then becomes hard, if not impossible, to pinpoint the reasons behind the wrong output of a neural architecture. Interestingly, attention could provide a key to partially interpreting and explaining neural network behavior [5]–[9], even if it cannot be considered a reliable means of explanation [10], [11]. For instance, the weights computed by attention could point us to relevant information discarded by the neural network, or to irrelevant elements of the input source that were factored in and could explain a surprising output of the neural network.…”
Section: Introduction
confidence: 99%
“…Depending on the task and model architecture, attention may have more or less explanatory power for model predictions [35, 51, 57, 71, 79]. Visualization techniques have been used to convey the structure and properties of attention in Transformers [31, 40, 80, 82].…”
Section: Interpreting Models in NLP
confidence: 99%
“…This indicates evidence of super-human classification on imbalanced corpora. As deep learning models have often been criticized for their black-box nature, we suggest technical enhancements that focus on model interpretability as future work, such as the use of rationales [45], influence functions [46], or sequence tagging approaches [47] that can offer deeper insights into the models and the reasons for their predictions. This is an area of active research.…”
Section: Results
confidence: 99%