Findings of the Association for Computational Linguistics: EMNLP 2021
DOI: 10.18653/v1/2021.findings-emnlp.39

Attention Weights in Transformer NMT Fail Aligning Words Between Sequences but Largely Explain Model Predictions

Abstract: This work proposes an extensive analysis of the Transformer architecture in the Neural Machine Translation (NMT) setting. Focusing on the encoder-decoder attention mechanism, we prove that attention weights systematically make alignment errors by relying mainly on uninformative tokens from the source sequence. However, we observe that NMT models assign attention to these tokens to regulate the contribution in the prediction of the two contexts, the source and the prefix of the target sequence. We provide evide…

Cited by 9 publications (12 citation statements) · References 17 publications

Citation statements (ordered by relevance)

“…To compute the relative contribution of source and target tokens to each prediction made by the system we used an embedding perturbation method [41]. Given a source sentence x and its translation y, the absolute source contribution C_S(y_j) when producing the probability of the j-th token y_j is defined as the variance of y_j's output probability across N random perturbations of the word embeddings of x.…”
Section: Relative Source and Target Contributions (mentioning)
confidence: 99%
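The contribution measure described in this excerpt can be sketched in a few lines of Python. The snippet below is a minimal illustration under stated assumptions: the predict_prob callable (returning the probability of the next target token y_j given perturbed source embeddings and the target prefix), the Gaussian noise model, and the noise scale sigma are assumptions of the sketch, not taken from the cited work; only the definition of C_S(y_j) as a variance over N perturbations follows the quote.

import torch

def source_contribution(predict_prob, src_embeds, tgt_prefix, n_perturb=50, sigma=0.01):
    """Absolute source contribution C_S(y_j): variance of the output probability
    of the next target token over random perturbations of the source embeddings.
    predict_prob is a hypothetical model wrapper, assumed for this sketch."""
    probs = []
    for _ in range(n_perturb):
        # Perturb the word embeddings of the source sentence x.
        noisy_src = src_embeds + sigma * torch.randn_like(src_embeds)
        with torch.no_grad():
            # P(y_j | perturbed x, y_<j), collected for each perturbation.
            probs.append(torch.as_tensor(predict_prob(noisy_src, tgt_prefix)))
    return torch.stack(probs).var(dim=0)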
“…Note that this approach involves replacing some words in target-language sentences with other words randomly chosen from 19. We set N = 50 and λ = 0.01 [41]. For high-resource conditions, mean and standard deviation of the source influence obtained when translating in-domain test sets with the baseline system, four other DA reference systems, and MaTiLDA using different transformations and combinations of them.…”
Section: Relative Source and Target Contributions (mentioning)
confidence: 99%
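With the hyperparameters quoted here, a call to the sketch above might look like the following. The toy probability function, the tensor shapes, and the reading of λ as the perturbation scale sigma are assumptions for illustration only; source_contribution is reused from the earlier sketch.

import torch

# Toy stand-in for a translation model's next-token probability (assumed).
def toy_predict_prob(src_embeds, tgt_prefix):
    return torch.sigmoid(src_embeds.mean() + 0.1 * len(tgt_prefix))

src_embeds = torch.randn(7, 512)  # embeddings of a 7-token source sentence (assumed shape)
c_s = source_contribution(toy_predict_prob, src_embeds, tgt_prefix=["<s>", "la"],
                          n_perturb=50, sigma=0.01)  # N = 50, λ = 0.01 as quoted
print(float(c_s))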
“…former computes gradients with respect to the input token embeddings to measure how much a change in the input changes the output. However, there is a tension between finding a faithful explanation and observing human-like alignments, since one does not imply the other (Ferrando and Costa-jussà, 2021).…”
Section: Introduction (mentioning)
confidence: 99%
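The gradient-based approach mentioned in this excerpt, measuring how much a change in the input token embeddings changes the output, can be illustrated with a gradient times input saliency sketch. The toy embedding layer, linear head, and token ids below are assumptions; no claim is made that this matches any particular cited system.

import torch
import torch.nn as nn

torch.manual_seed(0)
embed = nn.Embedding(1000, 32)  # toy vocabulary and embedding size (assumed)
head = nn.Linear(32, 1000)      # toy next-token scorer (assumed)

token_ids = torch.tensor([5, 42, 7])
embeds = embed(token_ids).detach().requires_grad_(True)
logits = head(embeds.mean(dim=0))  # score every candidate next token
logits[42].backward()              # gradient of one output w.r.t. the input embeddings

# Gradient x input, summed over the embedding dimension: one saliency score per
# input token, i.e. how sensitive the chosen output is to that token's embedding.
saliency = (embeds.grad * embeds).sum(dim=-1)
print(saliency)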
“…Nonetheless, they apply their method on average over a dataset, not for getting input attributions of a single prediction. Gradient-based methods have also been extended to the target prefix (Ferrando and Costa-jussà, 2021), although they do not quantify the relative contribution of source and target inputs.…”
Section: Introduction (mentioning)
confidence: 99%
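For the relative source/target split that this excerpt says is missing from purely gradient-based extensions, one simple possibility is to perturb the source and the target-prefix embeddings separately and normalise the two variances. Both the normalisation C_S / (C_S + C_T) and the predict_prob interface are assumptions of the sketch, not a description of the cited methods.

import torch

def relative_source_contribution(predict_prob, src_embeds, tgt_embeds,
                                 n_perturb=50, sigma=0.01):
    """Share of output variance attributable to the source: C_S / (C_S + C_T),
    each term estimated by perturbing only one side (illustrative definition)."""
    def variance(perturb_source):
        probs = []
        for _ in range(n_perturb):
            s, t = src_embeds, tgt_embeds
            if perturb_source:
                s = s + sigma * torch.randn_like(s)   # perturb only the source
            else:
                t = t + sigma * torch.randn_like(t)   # perturb only the target prefix
            with torch.no_grad():
                probs.append(torch.as_tensor(predict_prob(s, t)))
        return torch.stack(probs).var(dim=0)

    c_src, c_tgt = variance(True), variance(False)
    return c_src / (c_src + c_tgt)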
“…In ARE-based methods, which rely on an underlying classifier to predict if a post is toxic or not, the classifier is trained on the training part of the fold (which contains only toxic posts, ignoring the toxic span annotations) and a randomly selected but not in toxicity detection. See also Wiegreffe and Pinter (2019), Kobayashi et al. (2020), Ferrando and Costa-jussà (2021) for a broader discussion of attention as an explainability mechanism.…”
mentioning
confidence: 99%