Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2017
DOI: 10.18653/v1/p17-1106

Visualizing and Understanding Neural Machine Translation

Abstract: While neural machine translation (NMT) has made remarkable progress in recent years, it is hard to interpret its internal workings due to the continuous representations and non-linearity of neural networks. In this work, we propose to use layer-wise relevance propagation (LRP) to compute the contribution of each contextual word to arbitrary hidden states in the attention-based encoder-decoder framework. We show that visualization with LRP helps to interpret the internal workings of NMT and analyze translation errors.
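To make the idea concrete, here is a minimal NumPy sketch of how LRP-style relevance can be redistributed from an attention context vector back onto the source annotations. It applies the generic epsilon-rule to the weighted sum computed by attention; the shapes, weights, and choice of output relevance are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def lrp_epsilon_linear(x, W, relevance_out, eps=1e-6):
    """Redistribute relevance through a linear map z = W @ x (epsilon-rule)."""
    z = W @ x                                        # pre-activations of the layer
    denom = z + eps * np.where(z >= 0, 1.0, -1.0)    # stabilized denominator
    contrib = (W * x[None, :]) / denom[:, None]      # share of input i in output j
    return contrib.T @ relevance_out                 # relevance pushed onto the inputs

# Toy attention step: context = sum_i alpha_i * h_i over three source annotations.
rng = np.random.default_rng(0)
H = rng.normal(size=(3, 4))                          # source annotations h_i (assumed)
alpha = np.array([0.1, 0.7, 0.2])                    # attention weights (assumed)
context = alpha @ H                                  # context vector, shape (4,)

# View the weighted sum as a linear layer acting on the flattened annotations:
# context = W @ H.flatten(), with W = [alpha_0*I, alpha_1*I, alpha_2*I].
W = np.kron(alpha, np.eye(4))                        # shape (4, 12)
R_context = np.abs(context)                          # placeholder output relevance
R_inputs = lrp_epsilon_linear(H.flatten(), W, R_context)
R_per_word = R_inputs.reshape(3, 4).sum(axis=1)      # relevance per source word
print(R_per_word)
```

In this toy setup, most of the relevance lands on the second source word, mirroring its larger attention weight; the point of LRP is that the same decomposition applies to hidden states deeper in the network, not only to the attention weights themselves.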

Cited by 143 publications (128 citation statements); references 15 publications.
“…Some approaches have only been evaluated using visual inspection (Ding et al., 2017; Li et al., 2016a). Goyal et al. (2016) identified important words for a visual question answering system and informally evaluated their approach by analyzing the distribution among PoS tags (e.g., assuming that nouns are important).…”
Section: Related Work
confidence: 99%
“…To the best of our knowledge, Li et al. (2016) presented the only work that directly employs saliency methods to interpret NLP models. Most similar to our work in spirit, Ding et al. (2017) used Layer-wise Relevance Propagation (LRP; Bach et al., 2015), an interpretation method resembling saliency, to interpret the internal working mechanisms of RNN-based neural machine translation systems. Although conceptually LRP is also a good fit for word alignment interpretation, we have some concerns with the mathematical soundness of LRP when applied to attention models.…”
Section: Related Work
confidence: 99%
“…A general method to determine input space relevances based on a backward decomposition of the neural network prediction function is layer-wise relevance propagation (LRP) (Bach et al., 2015). It was originally proposed to explain feed-forward neural networks such as convolutional neural networks (Bach et al., 2015; Lapuschkin et al., 2016), and was recently extended to recurrent neural networks (Arras et al., 2017b; Ding et al., 2017; Arjona-Medina et al., 2018).…”
Section: Layer-wise Relevance Propagation
confidence: 99%
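As a concrete illustration of this backward decomposition, the following hedged sketch runs the epsilon-rule of LRP through a small feed-forward network. The architecture, weights, and starting relevance are made-up assumptions rather than the setups used in the cited papers, but the pass shows how output relevance is redistributed layer by layer onto the input and, with zero biases, approximately conserved.

```python
import numpy as np

def forward(x, layers):
    """Forward pass through (W, b) layers with ReLU, caching all activations."""
    activations = [x]
    for W, b in layers:
        x = np.maximum(W @ x + b, 0.0)
        activations.append(x)
    return activations

def lrp_backward(activations, layers, eps=1e-6):
    """Epsilon-rule LRP: decompose the output backwards onto the input."""
    relevance = activations[-1].copy()               # start from the output scores
    for (W, b), a in zip(reversed(layers), reversed(activations[:-1])):
        z = W @ a + b
        denom = z + eps * np.where(z >= 0, 1.0, -1.0)
        contrib = (W * a[None, :]) / denom[:, None]
        relevance = contrib.T @ relevance            # redistribute onto layer inputs
    return relevance

rng = np.random.default_rng(1)
layers = [(rng.normal(size=(5, 8)), np.zeros(5)),    # made-up two-layer network
          (rng.normal(size=(3, 5)), np.zeros(3))]
x = rng.normal(size=8)
acts = forward(x, layers)
R_input = lrp_backward(acts, layers)
# With zero biases, total relevance is approximately conserved end to end.
print(acts[-1].sum(), R_input.sum())
```

The recurrent extensions cited above add propagation rules for multiplicative gates and memory cells, but the core idea is the same layer-by-layer redistribution shown here.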
“…Thus, methods that use additional information, such as training data statistics, sampling, or are optimization-based (Ribeiro et al., 2016; Lundberg and Lee, 2017; Chen et al., 2018) are out of our scope. Among the methods we consider, we note that the method of Murdoch et al. (2018) was not yet compared against Arras et al. (2017b) and Ding et al. (2017), and that the method of Ding et al. (2017) was validated only visually. Moreover, to the best of our knowledge, no recurrent neural network explanation method was tested so far on a toy problem where the ground truth relevance value is known.…”
Section: Introduction
confidence: 99%
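For illustration, a toy problem of the kind referred to above can be set up so that the ground-truth relevance of every input is known by construction. The sketch below is an assumption about what such a setup might look like, not taken from any of the cited works: the target is the sum of the inputs, so each input's true relevance is its own value, and any explanation method can be scored against it.

```python
import numpy as np

rng = np.random.default_rng(2)

def make_toy_batch(batch_size=32, seq_len=10):
    """Inputs are random numbers, the target is their sum, so each input's
    ground-truth relevance is simply its own value."""
    x = rng.uniform(-1.0, 1.0, size=(batch_size, seq_len))
    y = x.sum(axis=1)
    true_relevance = x.copy()
    return x, y, true_relevance

def score_explanation(pred_relevance, true_relevance):
    """Mean per-sample correlation between predicted and true relevance."""
    scores = [np.corrcoef(p, t)[0, 1]
              for p, t in zip(pred_relevance, true_relevance)]
    return float(np.mean(scores))

x, y, true_rel = make_toy_batch()
# Relevance scores produced by any explanation method for a model trained on
# (x, y) can then be evaluated with score_explanation(method_output, true_rel).
```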