2020
DOI: 10.48550/arxiv.2007.00711
Preprint

ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks

Miguel Villarreal-Vasquez, Bharat Bhargava

Abstract: Deep Neural Networks (DNNs) have been applied successfully in computer vision. However, their wide adoption in image-related applications is threatened by their vulnerability to trojan attacks. These attacks insert some misbehavior at training using samples with a mark or trigger, which is exploited at inference or testing time. In this work, we analyze the composition of the features learned by DNNs at training. We identify that they, including those related to the inserted triggers, contain both content (sem…
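The title and abstract suggest the core idea: the features a DNN learns mix content (semantic) information with other, style-like information, and the defense retrains the model to rely on content. Below is a minimal, hedged sketch of what such content-focused fine-tuning could look like in PyTorch. The `style_randomize` transform and the `confoc_like_finetune` helper are illustrative assumptions (strong photometric jitter as a simple stand-in for proper style variation), not the authors' ConFoc implementation.

```python
# Illustrative sketch only: fine-tune a (possibly trojaned) classifier on
# style-varied versions of clean images so it relies on image content rather
# than style-like cues. NOT the authors' exact ConFoc procedure; the
# photometric jitter below is an assumed stand-in for real style variation.
import torch
import torch.nn as nn
from torchvision import transforms

# Stand-in "style randomization": keeps object content, perturbs color/texture.
style_randomize = transforms.Compose([
    transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1),
    transforms.GaussianBlur(kernel_size=3),
])

def confoc_like_finetune(model, clean_loader, epochs=5, lr=1e-4, device="cpu"):
    """Fine-tune `model` on clean images whose style has been randomized."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in clean_loader:
            images, labels = images.to(device), labels.to(device)
            images = style_randomize(images)  # content kept, style perturbed
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```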

Cited by 17 publications (24 citation statements)
References: 31 publications
“…For the test-stage defense, a line of existing works use clean validation data to retrain a victim model, forcing it to forget the malicious statistics (Liu et al, 2018). Kolouri et al (2020); Huang et al (2020); Villarreal-Vasquez & Bhargava (2020) craft training data respectively with the clean pattern and the malicious pattern so that the model can better detect the boundary of clean data and poisoned data. Another line of methods rely on post-hoc tools to detect potential triggers.…”
Section: Defenses Against Backdoor Attacks (mentioning)
confidence: 99%
“…To our surprise, SRA is naturally resistant to a considerable part of these defenses (Neural Cleanse (NC) [68] as an example). Besides those inspection-based defenses, we also consider preprocessing-based defenses [18,32,39,47,66,67], which are somewhat more compatible with the spirit of deployment-stage defenses. However, we find that the additional overheads and clean accuracy loss that may be induced by these methods could be intolerable.…”
Section: Defensive Analysis (mentioning)
confidence: 99%
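For context on the preprocessing-based defenses referenced in the statement above, here is a minimal, hedged sketch of a test-time input-transformation defense: inputs are smoothed and resampled before classification, in the hope of disrupting small trigger patterns. The transform choices, the assumed 32x32 input size, and the `defended_predict` helper are illustrative assumptions rather than any specific cited method; the clean-accuracy cost mentioned in the quote applies directly to such transforms.

```python
# Illustrative sketch of a test-time preprocessing defense: inputs are mildly
# transformed before being fed to the (possibly trojaned) model. Parameters
# are assumptions (32x32 inputs, e.g. CIFAR-like), not from any cited work.
import torch
from torchvision import transforms

preprocess_defense = transforms.Compose([
    transforms.GaussianBlur(kernel_size=5, sigma=1.0),  # smooth out small triggers
    transforms.Resize(28),   # down-sample ...
    transforms.Resize(32),   # ... then up-sample as a second low-pass filter
])

@torch.no_grad()
def defended_predict(model, images):
    """Classify a (B, C, H, W) image batch after defensive preprocessing."""
    model.eval()
    return model(preprocess_defense(images)).argmax(dim=1)
```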
“…First, the defense can only be executed at the server side, where only local gradients are available. This invalidates many backdoor defense methods developed in centralized machine learning, for example, denoising (preprocessing) methods [33], [34], [35], [36], [37], backdoor sample/trigger detection methods [38], [39], [40], [41], [42], [43], robust data augmentations [44], and finetuning methods [44]. Second, the defense method has to be robust to both data poisoning and model poisoning attacks (e.g., Byzantine, backdoor and Sybil attacks).…”
Section: Secure FL (mentioning)
confidence: 99%