Proceedings of the Fourth Conference on Machine Translation (Volume 1: Research Papers) 2019
DOI: 10.18653/v1/w19-5201

Saliency-driven Word Alignment Interpretation for Neural Machine Translation

Abstract: Despite their original goal to jointly learn to align and translate, Neural Machine Translation (NMT) models, especially Transformer, are often perceived as not learning interpretable word alignments. In this paper, we show that NMT models do learn interpretable word alignments, which could only be revealed with proper interpretation methods. We propose a series of such methods that are model-agnostic, are able to be applied either offline or online, and do not require parameter update or architectural change.…


Cited by 56 publications (73 citation statements)
References 23 publications
“…Recently, there has been a debate on whether attention can be used to explain model decisions (Serrano and Smith, 2019; Jain and Wallace, 2019; Wiegreffe and Pinter, 2019); we thus present additional analysis of our proposed method based on saliency maps (Ding et al., 2019). Saliency maps have been shown to better capture word alignment than attention probabilities in neural machine translation.…”
Section: Discussion
confidence: 99%
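
The saliency approach referenced in this excerpt scores each source word, for a given target word, by the gradient of the target word's predicted probability with respect to that source word's embedding. Below is a minimal sketch of that idea in PyTorch; the one-layer "model" and every name in it are toy stand-ins for illustration, not the architecture or exact saliency variant of Ding et al. (2019), who also study smoothed-gradient variants.

```python
# Toy sketch: gradient-based saliency as a word-alignment score.
# Everything model-related here is a stand-in, not a real NMT system.
import torch
import torch.nn as nn

torch.manual_seed(0)

src_vocab, tgt_vocab, dim = 100, 100, 16
embed = nn.Embedding(src_vocab, dim)
decoder = nn.Linear(dim, tgt_vocab)     # stand-in for a real decoder

src = torch.tensor([[5, 17, 42, 8]])    # one source sentence, 4 tokens
emb = embed(src)                        # (1, 4, dim)
emb.retain_grad()                       # keep gradients on this non-leaf tensor

logits = decoder(emb.mean(dim=1))       # (1, tgt_vocab)
tgt_word = 23                           # the target word being generated
log_prob = torch.log_softmax(logits, dim=-1)[0, tgt_word]
log_prob.backward()

# Saliency of source token j = L2 norm of d(log p) / d(embedding of j);
# the argmax gives a hard alignment for this target word.
saliency = emb.grad.norm(dim=-1).squeeze(0)   # shape (4,)
print(saliency, saliency.argmax().item())
```

Because the score is a gradient of the model's own output, the method needs no parameter update or architectural change, matching the abstract's description of the interpretation methods as model-agnostic.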
“…Garg et al. (2019) show that attention weights from the penultimate layer, i.e., $l = L-1$, can induce the best alignments. Although simple to implement, this method fails to obtain satisfactory word alignments (Ding et al., 2019; Garg et al., 2019). First of all, instead of the relevance between $y_i$ and $x_j$, $W^l_{i,j}$ measures the relevance between decoder hidden state $z^l_i$ and encoder output $h_j$.…”
Section: Alignment By Attention
confidence: 99%
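
For contrast, the attention-based extraction this excerpt criticizes is easy to state: take the encoder-decoder attention matrix $W^l$ from layer $l = L-1$ and align each target position $i$ to $\arg\max_j W^l_{i,j}$. A hedged sketch follows, with a random matrix standing in for attention weights from a real model:

```python
# Sketch of alignment-by-attention: align each target word to the source
# word it attends to most. W is random stand-in data, not real attention.
import torch

torch.manual_seed(0)
tgt_len, src_len = 5, 4

# W[i, j]: attention from decoder position i to source position j,
# e.g. averaged over heads at layer l = L - 1 (Garg et al., 2019).
W = torch.softmax(torch.randn(tgt_len, src_len), dim=-1)

alignment = W.argmax(dim=-1)            # shape (tgt_len,)
print(list(enumerate(alignment.tolist())))
```

The layer choice and head averaging are the main free parameters; per the excerpt, Garg et al. (2019) find the penultimate layer induces the best alignments, yet the result is still unsatisfactory because $W^l_{i,j}$ relates hidden states rather than the words $y_i$ and $x_j$ themselves.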
“…Recently, there has been a resurgence of interest in the community in studying word alignments for the Transformer (Ding et al., 2019; Li et al., 2019). One simple solution is NAIVE-ATT, which induces word alignments from the attention weights between the encoder and decoder.…”
Section: Introduction
confidence: 99%