Reevaluating Adversarial Examples in Natural Language

Morris, John X.; Lifland, Eli; Lanchantin, Jack; Ji, Yangfeng; Qi, Yanjun

doi:10.18653/v1/2020.findings-emnlp.341

Cited by 60 publications

(80 citation statements)

References 28 publications

“…A study on algorithms to generate adversarial examples found that they may not fully preserve semantics and may introduce up to 38% of grammatical errors (Morris et al, 2020a). To mitigate this, through comparison with human performance, its authors suggest a number of TextFooler's hyperparameters.…”

Section: Methods (mentioning)

confidence: 99%

Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning

Branco, Branco, Rodrigues et al. 2021

Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Commonsense is a quintessential human capacity that has been a core challenge to Artificial Intelligence since its inception. Impressive results in Natural Language Processing tasks, including in commonsense reasoning, have consistently been achieved with Transformer neural language models, even matching or surpassing human performance in some benchmarks. Recently, some of these advances have been called into question: so-called data artifacts in the training data have been made evident as spurious correlations and shallow shortcuts that in some cases are leveraging these outstanding results. In this paper we seek to further pursue this analysis into the realm of commonsense-related language processing tasks. We undertake a study on different prominent benchmarks that involve commonsense reasoning, along a number of key stress experiments, thus seeking to gain insight on whether the models are learning transferable generalizations intrinsic to the problem at stake or just taking advantage of incidental shortcuts in the data items. The results obtained indicate that most datasets experimented with are problematic, with models resorting to non-robust features and appearing not to be learning and generalizing towards the overall tasks intended to be conveyed or exemplified by the datasets.

“…With a looser definition, the search space includes more candidate adversarial examples. The more candidates there are, the more likely the search is to find an example that fools the victim model, thereby achieving a higher attack success rate (Morris et al, 2020b).…”

Section: Elements Of a Search Process (mentioning)

confidence: 99%

“…Constraints: Morris et al (2020b) proposed a set of linguistic constraints to enforce that x and perturbed x′ should be similar in both meaning and fluency to make x′ a valid potential adversarial example. This indicates that the search space should ensure x and x′ are close in semantic embedding space.…”

Section: Defining Search Spaces (mentioning)

confidence: 99%

Searching for a Search Method: Benchmarking Search Algorithms for Generating NLP Adversarial Examples

Yoo, Morris, Lifland et al. 2020

Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP

Self Cite

We study the behavior of several black-box search algorithms used for generating adversarial examples for natural language processing (NLP) tasks. We perform a fine-grained analysis of three elements relevant to search: search algorithm, search space, and search budget. When new search algorithms are proposed in past work, the attack search space is often modified alongside the search algorithm. Without ablation studies benchmarking the search algorithm change with the search space held constant, one cannot tell if an increase in attack success rate is a result of an improved search algorithm or a less restrictive search space. Additionally, many previous studies fail to properly consider the search algorithms' run-time cost, which is essential for downstream tasks like adversarial training. Our experiments provide a reproducible benchmark of search algorithms across a variety of search spaces and query budgets to guide future research in adversarial NLP. Based on our experiments, we recommend greedy attacks with word importance ranking when under a time constraint or attacking long inputs, and either beam search or particle swarm optimization otherwise.

“…However, recent work has shown that their performance can be easily undermined with adversarial examples that would pose no confusion for humans. As an increasing number of successful adversarial attackers have been developed for NLP tasks, the quality of the adversarial examples they generate has been questioned (Morris et al, 2020).…”

Section: Introduction (mentioning)

confidence: 99%

A Closer Look into the Robustness of Neural Dependency Parsers Using Better Adversarial Examples

Wang, Che, Titov et al. 2021

Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

Previous work on adversarial attacks on dependency parsers has mostly focused on attack methods, as opposed to the quality of adversarial examples, which in previous work has been relatively low. To address this gap, we propose a method to generate high-quality adversarial examples with a higher number of candidate generators and stricter filters, and then verify their quality using automatic and human evaluations. We perform analysis with different parsing models and observe that: (i) injecting words not used in the training stage is an effective attack strategy; (ii) adversarial examples generated against a parser strongly depend on the parser model, the token embeddings, and even the specific instantiation of the model (i.e., a random seed). We use these insights to improve the robustness of English parsing models, relying on adversarial training and model ensembling.
