2021
DOI: 10.48550/arxiv.2110.15317
Preprint
Bridge the Gap Between CV and NLP! An Optimization-based Textual Adversarial Attack Framework

Abstract: Despite great success on many machine learning tasks, deep neural networks are still vulnerable to adversarial samples. While gradient-based adversarial attack methods are well-explored in the field of computer vision, it is impractical to apply them directly in natural language processing due to the discrete nature of text. To bridge this gap, we propose a general framework for adapting existing gradient-based methods to craft textual adversarial samples. In this framework, gradient-based continuous perturbations are…
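The core idea the abstract describes — apply a continuous, gradient-based perturbation (e.g., an FGSM-style step) in embedding space, then map the result back to a discrete token — can be sketched minimally as follows. This is a hypothetical illustration, not the paper's actual method: the vocabulary, gradient, and `epsilon` value are made up for the example.

```python
import numpy as np

# Toy vocabulary of 4 token embeddings (3-dim); values are illustrative only.
vocab = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
    [0.7, 0.7, 0.0],
])

def fgsm_step(embedding, gradient, epsilon=0.5):
    """One FGSM-style continuous perturbation: x' = x + eps * sign(grad)."""
    return embedding + epsilon * np.sign(gradient)

def project_to_vocab(perturbed):
    """Map the perturbed continuous point to the nearest discrete token."""
    dists = np.linalg.norm(vocab - perturbed, axis=1)
    return int(np.argmin(dists))

x = vocab[0]                        # embedding of the current token
grad = np.array([-1.0, 1.0, 0.0])   # hypothetical loss gradient w.r.t. x
x_adv = fgsm_step(x, grad)          # continuous adversarial point
token_id = project_to_vocab(x_adv)  # back to a legitimate token id
```

The projection step is what distinguishes the textual setting from vision: the continuous point `x_adv` is not itself a valid input, so it must be snapped back onto the discrete vocabulary.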

Cited by 6 publications (7 citation statements)
References 21 publications
“…Text-based model robustness. Recent works find that ML models are generally not robust in different NLP tasks [34,36,46,63,68,70,72]. In response, a plethora of studies have endeavored to enhance their robustness, such as adversarial training [18,29,52,75] and certified robustness [16,20,23,45,54,64,67].…”
Section: Related Work
confidence: 99%
“…By manipulating the model gradients to induce misclassification behavior, these examples are generated, and they have been shown to transfer across many different models [20]. Because AEs directly modify the input, they cannot be directly generated with gradient manipulation on texts [33,39,12]. This is because the non-differentiable input text lookup in NLP models impedes the model gradient manipulation from reaching a legitimate word/words that could be used to generate AEs.…”
Section: Background and Related Work
confidence: 99%
“…Nevertheless, as noted earlier, gradients are still useful when they are applied using domain-specific constraints. For example, one can find local (word-level) perturbations that lead to a certain adversarial outcome if the perturbations are restricted to well-defined semantic categories (e.g., "blue" can be perturbed to any other color name) (Sha, 2020; Guo et al., 2021; Yuan et al., 2021).…”
Section: GPT Continuous Prompt D-Proj
confidence: 99%
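The constraint the statement above describes — allowing a word to be perturbed only within its own semantic category — can be sketched as a simple candidate filter. The category table and function name here are hypothetical, invented purely to illustrate the restriction:

```python
# Hypothetical semantic categories; a real system would use a lexicon
# or embedding clusters rather than a hand-written table.
SEMANTIC_CATEGORIES = {
    "color": {"blue", "red", "green", "yellow"},
}

def constrained_candidates(word):
    """Return allowed substitutions: same-category words, minus the original."""
    for members in SEMANTIC_CATEGORIES.values():
        if word in members:
            return sorted(members - {word})
    return []  # word has no known category: no perturbation allowed

cands = constrained_candidates("blue")  # other color names only
```

Restricting candidates this way keeps gradient-guided search inside a space of substitutions that preserve grammaticality, which is the point of the domain-specific constraint.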