Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)
DOI: 10.18653/v1/2020.emnlp-main.495

T3: Tree-Autoencoder Constrained Adversarial Text Generation for Targeted Attack

Abstract: Adversarial attacks against natural language processing systems, which perform seemingly innocuous modifications to inputs, can induce the target models to make arbitrary mistakes. Though they raise great concerns, such adversarial attacks can also be leveraged to estimate the robustness of NLP models. Compared with adversarial example generation in continuous data domains (e.g., images), generating adversarial text that preserves the original meaning is challenging since the text space is discrete and non-differentiable.…
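The discreteness the abstract highlights is the core obstacle: a gradient step perturbs an embedding continuously, but the result must be mapped back to an actual token. Below is a minimal sketch of that projection step, using a toy vocabulary and random embeddings (all names and values here are illustrative assumptions, not the paper's T3 method):

```python
# Illustrative sketch (not the paper's T3 method): why text attacks need a
# projection step. A gradient perturbation moves an embedding off the discrete
# vocabulary manifold, so it must be snapped back to a real token.
import numpy as np

rng = np.random.default_rng(0)
vocab = ["good", "great", "fine", "bad", "awful"]          # toy vocabulary
emb = rng.normal(size=(len(vocab), 8))                     # toy embedding table

def nearest_token(vec: np.ndarray) -> str:
    """Project a continuous vector back to the closest discrete token."""
    dists = np.linalg.norm(emb - vec, axis=1)
    return vocab[int(np.argmin(dists))]

x = emb[vocab.index("good")]                               # current token embedding
grad = rng.normal(size=8)                                  # stand-in for dLoss/dx
x_adv = x + 0.5 * grad                                     # continuous "image-style" step
print(nearest_token(x_adv))                                # re-discretized adversarial token
```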

Cited by 50 publications (50 citation statements)
References 32 publications

“…Character-based models (Ebrahimi et al., 2018; Gao et al., 2018, inter alia) use misspellings to attack the victim systems; however, these attacks can often be defended by a spell checker (Pruthi et al., 2019; Zhou et al., 2019b; Jones et al., 2020). Many sentence-level models (Iyyer et al., 2018; Wang et al., 2020; Zou et al., 2020, inter alia) have been developed to introduce more sophisticated token/phrase perturbations. These, however, generally have difficulty maintaining semantic similarity with the original inputs (Zhang et al., 2020a).…”
Section: Adversarial Training
confidence: 99%
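A hedged sketch of the character-level perturbations this statement describes (in the spirit of the cited misspelling attacks, not any paper's exact algorithm); note how each edit yields a likely misspelling, which is why a spell checker is an effective defense:

```python
# Sketch of a character-level misspelling perturbation: one random edit
# (swap, delete, or substitute) applied inside a word. Illustrative only.
import random

def char_perturb(word: str, rng: random.Random) -> str:
    """Apply one random character edit: swap, delete, or substitute."""
    if len(word) < 3:
        return word
    i = rng.randrange(1, len(word) - 1)          # keep first/last chars intact
    op = rng.choice(["swap", "delete", "sub"])
    if op == "swap":
        return word[:i] + word[i + 1] + word[i] + word[i + 2:]
    if op == "delete":
        return word[:i] + word[i + 1:]
    return word[:i] + rng.choice("abcdefghijklmnopqrstuvwxyz") + word[i + 1:]

rng = random.Random(42)
print(char_perturb("terrible", rng))             # e.g. a swapped-letter misspelling
```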
“…Most existing attacks are word-level (Alzantot et al., 2018; Ren et al., 2019; Li et al., 2020; Jin et al., 2020; Zang et al., 2020a,b) or character-level (Hosseini et al., 2017; Ebrahimi et al., 2018; Belinkov and Bisk, 2018; Gao et al., 2018; Eger et al., 2019). Some studies present sentence-level attacks based on appending extra sentences (Jia and Liang, 2017; Wang et al., 2020a), perturbing sentence vectors, or controlled text generation (Wang et al., 2020b). Iyyer et al. (2018) propose altering the syntax of the original samples to generate adversarial examples, which is the work most similar to the style transfer-based adversarial attack in this paper (although syntax and text style are distinct).…”
Section: Adversarial Attacks on Text
confidence: 99%
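The word-level attacks cited in this statement mostly share one greedy pattern: try candidate substitutions at each position and keep the one that most lowers the victim's confidence. A minimal sketch under that assumption, where victim_prob and get_synonyms are hypothetical stand-ins for a real model API and synonym source:

```python
# Generic greedy word-substitution loop shared by many word-level attacks.
# `victim_prob` and `get_synonyms` are hypothetical placeholders.
from typing import Callable, List

def greedy_word_attack(
    tokens: List[str],
    victim_prob: Callable[[List[str]], float],   # P(correct label | text)
    get_synonyms: Callable[[str], List[str]],
) -> List[str]:
    """Replace one word at a time with the synonym that hurts the model most."""
    best = list(tokens)
    for i in range(len(best)):
        candidates = get_synonyms(best[i])
        if not candidates:
            continue
        scored = [(victim_prob(best[:i] + [c] + best[i + 1:]), c) for c in candidates]
        prob, word = min(scored)                  # lowest confidence = strongest attack
        if prob < victim_prob(best):
            best[i] = word
    return best
```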
“…In the context of NLP, the initial research [22, 23] started with the Stanford Question Answering Dataset (SQuAD), and further works extend to other NLP tasks, including classification [4, 7-11, 24-27], text entailment [4, 8, 11], and machine translation [5, 6, 28]. Some of these works [10, 24, 29] adapt gradient-based methods from CV that need full access to the target model.…”
Section: Related Work
confidence: 99%
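The gradient-based methods mentioned here require white-box access because they differentiate the loss with respect to the input embeddings. A small PyTorch sketch of that idea with a toy pooling classifier (the architecture is an assumption for illustration, not taken from the cited works):

```python
# White-box token-importance ranking: gradient norm of the loss w.r.t. each
# token's embedding. The toy model is an illustrative assumption.
import torch

torch.manual_seed(0)
vocab_size, dim = 100, 16
embedding = torch.nn.Embedding(vocab_size, dim)
classifier = torch.nn.Linear(dim, 2)

token_ids = torch.tensor([5, 17, 42, 8])          # a toy 4-token input
label = torch.tensor(1)

embs = embedding(token_ids)                       # (4, dim), differentiable
embs.retain_grad()                                # keep grads on this non-leaf tensor
logits = classifier(embs.mean(dim=0))             # mean-pool then classify
loss = torch.nn.functional.cross_entropy(logits.unsqueeze(0), label.unsqueeze(0))
loss.backward()

importance = embs.grad.norm(dim=1)                # gradient norm per token
print(importance.argsort(descending=True))        # tokens to perturb first
```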
“…TextBugger [9] follows such a pattern but explores a word-level perturbation strategy using the nearest synonyms in GloVe [30]. Later synonym-based studies [4, 8, 25, 27, 31] focus on choosing proper synonyms for substitution that do not cause misunderstandings for humans. Although these methods exhibit excellent performance on certain metrics (a high success rate with limited perturbations), their efficiency is rarely discussed.…”
Section: Related Work
confidence: 99%
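A sketch of the nearest-synonym lookup this statement attributes to TextBugger: rank candidate replacements by cosine similarity in GloVe space. The tiny hand-made vectors are placeholders; a real attack would load pretrained GloVe embeddings:

```python
# Nearest-synonym lookup by cosine similarity in an embedding space.
# The hand-made vectors below stand in for real pretrained GloVe vectors.
import numpy as np

glove = {                                          # toy stand-in for GloVe
    "good":  np.array([0.9, 0.1, 0.0]),
    "great": np.array([0.8, 0.2, 0.1]),
    "fine":  np.array([0.7, 0.3, 0.0]),
    "bad":   np.array([-0.9, 0.1, 0.0]),
}

def nearest_synonyms(word: str, k: int = 2) -> list:
    """Rank other vocabulary words by cosine similarity to `word`."""
    v = glove[word]
    def cos(u: np.ndarray) -> float:
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
    ranked = sorted((w for w in glove if w != word),
                    key=lambda w: cos(glove[w]), reverse=True)
    return ranked[:k]

print(nearest_synonyms("good"))                    # e.g. ['great', 'fine']
```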