2020
DOI: 10.48550/arxiv.2007.00711
Preprint

ConFoc: Content-Focus Protection Against Trojan Attacks on Neural Networks

Miguel Villarreal-Vasquez, Bharat Bhargava

Abstract: Deep Neural Networks (DNNs) have been applied successfully in computer vision. However, their wide adoption in image-related applications is threatened by their vulnerability to trojan attacks. These attacks insert some misbehavior at training using samples with a mark or trigger, which is exploited at inference or testing time. In this work, we analyze the composition of the features learned by DNNs at training. We identify that they, including those related to the inserted triggers, contain both content (sem…
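The title and abstract suggest the core idea: the features a DNN learns mix content (semantic) information with other, style-like information, and the defense retrains the model to rely on content. Below is a minimal, hedged sketch of what such content-focused fine-tuning could look like in PyTorch. The `style_randomize` transform and the `confoc_like_finetune` helper are illustrative assumptions (strong photometric jitter as a simple stand-in for proper style variation), not the authors' ConFoc implementation.

```python
# Illustrative sketch only: fine-tune a (possibly trojaned) classifier on
# style-varied versions of clean images so it relies on image content rather
# than style-like cues. NOT the authors' exact ConFoc procedure; the
# photometric jitter below is an assumed stand-in for real style variation.
import torch
import torch.nn as nn
from torchvision import transforms

# Stand-in "style randomization": keeps object content, perturbs color/texture.
style_randomize = transforms.Compose([
    transforms.ColorJitter(brightness=0.5, contrast=0.5, saturation=0.5, hue=0.1),
    transforms.GaussianBlur(kernel_size=3),
])

def confoc_like_finetune(model, clean_loader, epochs=5, lr=1e-4, device="cpu"):
    """Fine-tune `model` on clean images whose style has been randomized."""
    model.to(device).train()
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    criterion = nn.CrossEntropyLoss()
    for _ in range(epochs):
        for images, labels in clean_loader:
            images, labels = images.to(device), labels.to(device)
            images = style_randomize(images)  # content kept, style perturbed
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
    return model
```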

Cited by 17 publications (24 citation statements)
References: 31 publications
“…For the test-stage defense, a line of existing works use clean validation data to retrain a victim model, forcing it to forget the malicious statistics (Liu et al, 2018). Kolouri et al (2020); Huang et al (2020); Villarreal-Vasquez & Bhargava (2020) craft training data respectively with the clean pattern and the malicious pattern so that the model can better detect the boundary of clean data and poisoned data. Another line of methods rely on post-hoc tools to detect potential triggers.…”
Section: Defenses Against Backdoor Attacks (mentioning)
confidence: 99%
“…To our surprise, SRA is naturally resistant to a considerable part of these defenses (Neural Cleanse (NC) [68] as an example). Besides those inspection-based defenses, we also consider preprocessing-based defenses [18,32,39,47,66,67], which are somewhat more compatible with the spirit of deployment-stage defenses. However, we find that the additional overheads and clean accuracy loss that may be induced by these methods could be intolerable.…”
Section: Defensive Analysis (mentioning)
confidence: 99%
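For context on the preprocessing-based defenses referenced in the statement above, here is a minimal, hedged sketch of a test-time input-transformation defense: inputs are smoothed and resampled before classification, in the hope of disrupting small trigger patterns. The transform choices, the assumed 32x32 input size, and the `defended_predict` helper are illustrative assumptions rather than any specific cited method; the clean-accuracy cost mentioned in the quote applies directly to such transforms.

```python
# Illustrative sketch of a test-time preprocessing defense: inputs are mildly
# transformed before being fed to the (possibly trojaned) model. Parameters
# are assumptions (32x32 inputs, e.g. CIFAR-like), not from any cited work.
import torch
from torchvision import transforms

preprocess_defense = transforms.Compose([
    transforms.GaussianBlur(kernel_size=5, sigma=1.0),  # smooth out small triggers
    transforms.Resize(28),   # down-sample ...
    transforms.Resize(32),   # ... then up-sample as a second low-pass filter
])

@torch.no_grad()
def defended_predict(model, images):
    """Classify a (B, C, H, W) image batch after defensive preprocessing."""
    model.eval()
    return model(preprocess_defense(images)).argmax(dim=1)
```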
“…First, the defense can only be executed at the server side, where only local gradients are available. This invalidates many backdoor defense methods developed in centralized machine learning, for example, denoising (preprocessing) methods [33], [34], [35], [36], [37], backdoor sample/trigger detection methods [38], [39], [40], [41], [42], [43], robust data augmentations [44], and finetuning methods [44]. Second, the defense method has to be robust to both data poisoning and model poisoning attacks (e.g., Byzantine, backdoor and Sybil attacks).…”
Section: Secure FL (mentioning)
confidence: 99%