Certified Robustness Against Natural Language Attacks by Causal Intervention

Zhao, Haiteng; Ma*, Chang; Dong, Xinshuai; Luu, Anh Tuan; Deng, Zhihong; Zhang, Hanwang

doi:10.48550/arxiv.2205.12331

Cited by 3 publications

(4 citation statements)

References 0 publications

Supporting

Mentioning

Contrasting

Order By: Relevance

“…Some include video moment retrieval [3], visual semantic segmentation [4], and visual dialog [5]. For NLP, causal inference has attracted myriad attention as a method to interpret adversarial attacks [6] and eradicate spurious confounding factors in SGD optimizer [7].…”

Section: A Causal Intervention For Deep Learningmentioning

confidence: 99%

Topic-Aware Causal Intervention for Counterfactual Detection

Nguyen,

Wu,

Nguyen

et al. 2023

Preprint

View full text Add to dashboard Cite

Counterfactual statements, which describe events that did not or cannot take place, are beneficial to numerous NLP applications. Hence, we consider the problem of counterfactual detection (CFD) and seek to enhance the CFD models. Previous models are reliant on clue phrases to predict counterfactuality, so they suffer from significant performance drop when clue phrase hints do not exist during testing. Moreover, these models’ prediction also biases towards non-counterfactual over the counterfactual class. To address these issues, we propose to integrate neural topic model into the CFD model to capture the global semantics of the input statement. We continue to causally intervene the hidden representations of the CFD model to balance the effect of the class labels. Extensive experiments show that our approach outperforms previous state-of-the-art CFD and bias-resolving methods in both the CFD and other bias-sensitive tasks.

show abstract

Section: A Causal Intervention For Deep Learningmentioning

confidence: 99%

Topic-Aware Causal Intervention for Counterfactual Detection

Nguyen,

Wu,

Nguyen

et al. 2023

Preprint

View full text Add to dashboard Cite

show abstract

“…Zhao et al [307] take a causal perspective on the natural language attack problem and frame the source of adversarial vulnerability as the spurious association induced by confounders. Fig.…”

Section: Certified Robustness Against Natural Language Attacksmentioning

confidence: 99%

“…To defend models against such attacks, Zhao et al [307] show that a Gaussianbased randomized classifier models the interventional distribution p(y | do(x)) and is therefore robust against l 2 -bounded attacks. However, textual input spaces are not continuous and text substitutions do not follow Gaussian distributions.…”

Section: Counterfactual Explanationsmentioning

confidence: 99%

Causal Machine Learning: A Survey and Open Problems

Jean¹,

Lynch²,

Liu³

et al. 2022

Preprint

View full text Add to dashboard Cite

Causal Machine Learning (CausalML) is an umbrella term for machine learning methods that formalize the data-generation process as a structural causal model (SCM). This allows one to reason about the effects of changes to this process (i.e., interventions) and what would have happened in hindsight (i.e., counterfactuals). We categorize work in CausalML into five groups according to the problems they tackle: (1) causal supervised learning, (2) causal generative modeling, (3) causal explanations, (4) causal fairness, (5) causal reinforcement learning. For each category, we systematically compare its methods and point out open problems. Further, we review modality-specific applications in computer vision, natural language processing, and graph representation learning. Finally, we provide an overview of causal benchmarks and a critical discussion of the state of this nascent field, including recommendations for future work.

show abstract

“…First-order approximation [2], l 2 -ball [3], [53] [50], and axis-aligned bound [4] and [52] are popular ways to model perturbations. Axisaligned bound can also be combined with randomized smoothing techniques [74]. Our method is different from previous methods as we use convex hulls to model the attack space and capture the geometry of word substitutions more precisely.…”

Section: Related Workmentioning

confidence: 99%

Adversarial attacks and defenses in natural language processing

Dong¹

Self Cite

View full text Add to dashboard Cite

Deep neural networks (DNNs) are becoming increasingly successful in many fields. However, DNNs are shown to be strikingly susceptible to adversarial examples. For instance, models pre-trained on very large corpora can still be easily fooled by word substitution attacks using only synonyms. This phenomenon has raised grand security challenges to modern machine learning systems, such as self-driving, spam filtering, and speech recognition, where DNNs are widely deployed.In this thesis, we first give a brief introduction of adversarial attacks and defenses.We focus on Natural Language Processing (NLP) and review some recent advances in attack algorithms and defense methods in Chapter 2. We also give a formalized definition of the research objective in this thesis, i.e., how to improve the adversarial robustness of NLP models. To this end, we propose novel and effective solutions to enhance NLP models towards robustness in the following chapters.In Chapter 3, for the classical NLP models like Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN), we present a novel adversarial training method, Adversarial Sparse Convex Combination (ASCC) defense, for adversarial robustness against word substitution attacks. To be specific, we model the substitution attack space as a convex hull and employ a regularizer to encourage the modeled perturbation towards an actual substitution. Therefore, we are able to align the modeling better with the discrete textual space. We empirically validate ASCC-defense in our experiments and it surpasses all compared state-of-the-arts on prevailing NLP tasks like sentiment analysis and natural language inference consistently under multiple attacks.To date, pre-trained language models, e.g., Bidirectional Transformers (BERT), are getting increasingly popular and fine-tuning a pre-trained language model for downstream tasks is becoming the new NLP paradigm. As such, how to fine-tune pre-trained Next, I would like to thank my co-supervisor and past supervisors, Prof. Hanwang

show abstract

Certified Robustness Against Natural Language Attacks by Causal Intervention

Cited by 3 publications

References 0 publications

Topic-Aware Causal Intervention for Counterfactual Detection

Topic-Aware Causal Intervention for Counterfactual Detection

Causal Machine Learning: A Survey and Open Problems

Adversarial attacks and defenses in natural language processing

Contact Info

Product

Resources

About