Findings of the Association for Computational Linguistics: EMNLP 2020
DOI: 10.18653/v1/2020.findings-emnlp.341

Reevaluating Adversarial Examples in Natural Language

Abstract: State-of-the-art attacks on NLP models lack a shared definition of what constitutes a successful attack. These differences make the attacks difficult to compare and hinder the use of adversarial examples to understand and improve NLP models. We distill ideas from past work into a unified framework: a successful natural language adversarial example is a perturbation that fools the model and follows four proposed linguistic constraints. We categorize previous attacks based on these constraints. For each constr…


Cited by 60 publications (80 citation statements)
References 28 publications
“…A study of algorithms that generate adversarial examples found that they may not fully preserve semantics and may introduce grammatical errors in up to 38% of cases (Morris et al., 2020a). To mitigate this, its authors compare the attacks against human performance and suggest adjusted settings for several of TextFooler's hyperparameters.…”
Section: Methods
confidence: 99%
“…With a looser definition, the search space includes more candidate adversarial examples. The more candidates there are, the more likely the search is to find an example that fools the victim model, thereby achieving a higher attack success rate (Morris et al., 2020b).…”
Section: Elements of a Search Process
confidence: 99%
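The tradeoff this citation describes (a looser constraint admits more candidates, so the search is more likely to find one that fools the victim) can be sketched with a toy example. The "victim model", the candidate sentences, and the word-edit-budget constraint below are all hypothetical stand-ins, not the method of any cited paper:

```python
def passes_constraints(candidate, original, max_edits):
    """Toy constraint: allow at most `max_edits` changed word positions."""
    diffs = sum(a != b for a, b in zip(original.split(), candidate.split()))
    return diffs <= max_edits

def toy_model(text):
    """Hypothetical sentiment victim: flips to negative on the word 'terrible'."""
    return "negative" if "terrible" in text else "positive"

def attack(original, candidates, max_edits):
    """Return the first candidate that passes the constraint and flips the label."""
    orig_label = toy_model(original)
    for cand in candidates:
        if passes_constraints(cand, original, max_edits) and toy_model(cand) != orig_label:
            return cand
    return None

original = "the movie was good overall"
candidates = [
    "the movie was fine overall",      # 1 edit, does not fool the model
    "a movie was terrible overall",    # 2 edits, fools the model
]

print(attack(original, candidates, max_edits=1))  # None
print(attack(original, candidates, max_edits=2))  # a movie was terrible overall
```

Under the strict budget no candidate both satisfies the constraint and flips the label; loosening the budget admits a candidate that does, so the measured attack success rate rises even though the attack itself is unchanged.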
“…Constraints: Morris et al. (2020b) proposed a set of linguistic constraints enforcing that x and the perturbed x′ be similar in both meaning and fluency for x′ to be a valid potential adversarial example. This indicates that the search space should ensure x and x′ are close in semantic embedding space.…”
Section: Defining Search Spaces
confidence: 99%
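A minimal sketch of the semantic-closeness check this citation refers to, assuming sentence embeddings are already available: the toy vectors and the 0.9 threshold below are invented for illustration, standing in for a real sentence encoder and a tuned similarity threshold:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def satisfies_semantic_constraint(emb_x, emb_x_prime, threshold=0.9):
    """Accept x' only if its embedding stays close to x's embedding."""
    return cosine(emb_x, emb_x_prime) >= threshold

emb_x = [0.9, 0.1, 0.3]        # toy embedding of the original sentence x
emb_near = [0.88, 0.12, 0.31]  # paraphrase-like perturbation, nearly parallel
emb_far = [0.1, 0.9, -0.2]     # meaning-changing perturbation, nearly orthogonal

print(satisfies_semantic_constraint(emb_x, emb_near))  # True
print(satisfies_semantic_constraint(emb_x, emb_far))   # False
```

Filtering candidates this way shrinks the search space to perturbations that plausibly preserve meaning, which is exactly the tension with the looser-definition observation above: tighter thresholds yield more valid examples but lower raw attack success.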
“…However, recent work has shown that their performance can be easily undermined with adversarial examples that would pose no confusion for humans. As an increasing number of successful adversarial attackers have been developed for NLP tasks, the quality of the adversarial examples they generate has been questioned (Morris et al., 2020).…”
Section: Introduction
confidence: 99%