2019 IEEE Symposium on Security and Privacy (SP)
DOI: 10.1109/sp.2019.00031

Neural Cleanse: Identifying and Mitigating Backdoor Attacks in Neural Networks

Abstract: Lack of transparency in deep neural networks (DNNs) makes them susceptible to backdoor attacks, where hidden associations or triggers override normal classification to produce unexpected results. For example, a model with a backdoor always identifies a face as Bill Gates if a specific symbol is present in the input. Backdoors can stay hidden indefinitely until activated by an input, and present a serious security risk to many security- or safety-related applications, e.g., biometric authentication systems or self-driving cars.
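
To make the attack model concrete: a backdoored model behaves normally on clean inputs, but any input stamped with the attacker's trigger is classified as the attacker's chosen target label. Below is a minimal illustrative sketch of trigger stamping; the shapes, patch location, and random stand-in image are hypothetical choices, not taken from the paper.

```python
import numpy as np

def stamp_trigger(x, pattern, mask):
    """Blend a trigger pattern into an input image.

    x, pattern: float arrays of shape (H, W, C) with values in [0, 1]
    mask: float array of shape (H, W, 1) in [0, 1]; 1 where the trigger
          replaces the original pixels, 0 elsewhere.
    """
    return (1.0 - mask) * x + mask * pattern

# Hypothetical example: a 4x4 white square in the corner acts as the trigger.
H, W, C = 32, 32, 3
pattern = np.ones((H, W, C))        # trigger pixel values
mask = np.zeros((H, W, 1))
mask[26:30, 26:30, 0] = 1.0         # small patch in the bottom-right corner

x_clean = np.random.rand(H, W, C)   # stand-in for a real image
x_poisoned = stamp_trigger(x_clean, pattern, mask)
# A backdoored model would classify x_poisoned as the attacker's target
# label (e.g., "Bill Gates"), regardless of what x_clean actually depicts.
```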

Cited by 989 publications (1,273 citation statements)
References 28 publications

“…It should be noted that Activation Clustering [11] requires the full training data (both clean and poisoned) while Neural Cleanse [50] and Fine-Pruning [29] require a subset of the clean training data.…”
Section: Backdoor Attacks on DNN
confidence: 99%
“…Digit. This application is commonly used in studying DNN vulnerabilities including normal backdoors [19,50]. Both Teacher and Student tasks are to recognize hand-written digits…”
Section: Experiments Setup
confidence: 99%
“…Therefore, our countermeasure is performed at run-time when the (backdoored or benign) model is already actively deployed in the field and in a black-box setting. 3) Our method is insensitive to the trigger size employed by an attacker, a particular advantage over the methods from Stanford [11] and IEEE S&P 2019 [17]. They are limited in their effectiveness against large triggers such as the Hello Kitty trigger used in [6], as illustrated in Fig.…”
Section: A. Our Contributions and Results
confidence: 99%
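
For context on that trigger-size limitation: Neural Cleanse detects backdoors by reverse-engineering, for each candidate target label, the smallest trigger (a mask plus a pattern) that flips clean inputs to that label, using an L1 penalty on the mask to favor small triggers. A label whose reverse-engineered trigger is anomalously small is flagged as a likely backdoor target, which is exactly why triggers that legitimately cover a large area are harder to expose. The sketch below shows this style of per-label optimization; the classifier `model`, the `clean_loader` of clean images, the 32x32x3 input shape, the sigmoid parameterization, and all hyperparameters are illustrative assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn.functional as F

def reverse_engineer_trigger(model, clean_loader, target_label,
                             steps=1000, lam=1e-2, lr=0.1, device="cpu"):
    """Search for a small (mask, pattern) pair that flips clean inputs
    to target_label: misclassification loss + L1 penalty on the mask."""
    # Unconstrained parameters, squashed into [0, 1] by sigmoid.
    mask_param = torch.zeros(1, 1, 32, 32, device=device, requires_grad=True)
    pattern_param = torch.zeros(1, 3, 32, 32, device=device, requires_grad=True)
    opt = torch.optim.Adam([mask_param, pattern_param], lr=lr)

    model.eval()
    data_iter = iter(clean_loader)
    for _ in range(steps):
        try:
            x, _ = next(data_iter)
        except StopIteration:
            data_iter = iter(clean_loader)
            x, _ = next(data_iter)
        x = x.to(device)

        mask = torch.sigmoid(mask_param)         # soft mask in [0, 1]
        pattern = torch.sigmoid(pattern_param)   # trigger pixels in [0, 1]
        x_adv = (1 - mask) * x + mask * pattern  # stamp the candidate trigger

        target = torch.full((x.size(0),), target_label,
                            dtype=torch.long, device=device)
        loss = F.cross_entropy(model(x_adv), target) + lam * mask.sum()

        opt.zero_grad()
        loss.backward()
        opt.step()

    # L1 norm of the final mask: small values suggest this label is the
    # backdoor target; compare across all labels with an outlier test.
    return torch.sigmoid(mask_param).detach(), torch.sigmoid(pattern_param).detach()
```

Running the search once per output label yields one mask norm per label; the backdoor's target label is expected to stand out with an unusually small norm.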
“…For all 25 DNNs being attacked, the maximum reciprocals are much larger than for the 25 clean DNNs. This detector achieves outstanding detection performance, much better than an earlier detector [90]. All 25 attacks are successfully detected, among which both source and target classes used for devising the attack are correctly inferred for 23 out of 25 attack instances; for the other two attack instances, only the target class is correctly inferred.…”
Section: Backdoor Detection Without the Training Set
confidence: 96%
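
For reference, the outlier test used by Neural Cleanse itself, the kind of detector these citing papers benchmark against, scores each label's reverse-engineered trigger norm with a median-absolute-deviation (MAD) anomaly index and flags labels scoring above roughly 2. A minimal sketch, with hypothetical norm values:

```python
import numpy as np

def anomaly_indices(l1_norms):
    """MAD-based anomaly index over per-label reverse-engineered trigger norms."""
    norms = np.asarray(l1_norms, dtype=float)
    median = np.median(norms)
    mad = 1.4826 * np.median(np.abs(norms - median))  # consistency constant
    return np.abs(norms - median) / mad

# Hypothetical norms for a 5-class model; label 2 needs a suspiciously
# small trigger, the signature of a backdoor target label.
norms = [48.0, 52.0, 6.0, 50.0, 47.0]
scores = anomaly_indices(norms)
flagged = [i for i, s in enumerate(scores)
           if s > 2.0 and norms[i] < np.median(norms)]  # small-norm outliers only
print(scores)   # label 2's index is far above the ~2.0 threshold
print(flagged)  # -> [2]
```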