Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, 2020
DOI: 10.1145/3394486.3403044
Interpretability is a Kind of Safety: An Interpreter-based Ensemble for Adversary Defense

Abstract: While having achieved great success in rich real-life applications, deep neural network (DNN) models have long been criticized for their vulnerability to adversarial attacks. Tremendous research efforts have been dedicated to mitigating the threats of adversarial attacks, but the essential trait of adversarial examples is not yet clear, and most existing methods are yet vulnerable to hybrid attacks and suffer from counterattacks. In light of this, in this paper, we first reveal a gradient-based correlation bet…
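Because the abstract is truncated, the sketch below is only a generic illustration of the kind of gradient-based interpretation that explanation-based adversarial detection builds on; it is not the paper's actual ensemble or detection rule. The ResNet-18 placeholder, the random input, the perturbation size, and the `gradient_saliency` helper are all assumptions introduced for illustration.

```python
# Minimal sketch (not the paper's algorithm): a gradient-based "interpreter"
# of a classifier's decision, plus the generic premise that interpretations of
# clean and perturbed inputs tend to differ.
import torch
import torchvision.models as models

model = models.resnet18()  # placeholder classifier, untrained
model.eval()

def gradient_saliency(model, x):
    """Return |d(top-class logit) / d(input)|, a basic gradient interpretation."""
    x = x.detach().clone().requires_grad_(True)
    logits = model(x)
    top_class = logits.argmax(dim=1)
    # Back-propagate the top-class logit to the input pixels.
    logits.gather(1, top_class.unsqueeze(1)).sum().backward()
    return x.grad.detach().abs()

x = torch.randn(1, 3, 224, 224)                 # dummy input standing in for an image
x_perturbed = x + 0.03 * torch.randn_like(x)    # stand-in for an adversarial perturbation

# Detection premise: a large saliency discrepancy between an input and its
# perturbed counterpart can flag a potential adversarial example.
diff = (gradient_saliency(model, x) - gradient_saliency(model, x_perturbed)).abs().mean()
print(f"mean saliency discrepancy: {diff.item():.4f}")
```

A real defense along these lines would pair such an interpreter with the classifier and calibrate a discrepancy threshold on held-out clean and adversarial data; those specifics are omitted here.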

Cited by 10 publications (1 citation statement)
References 30 publications (68 reference statements)
“…In this paper, we further explore the premise of using explanations, but to detect adversarial examples and based on a fundamentally different approach. Our work was done concurrently with a similar approach presented by Wang et al [48]. In contrast, we have the following differentiating aspects.…”
Section: Existing Work on Adversarial Detection (mentioning, confidence: 99%)