2022 International Joint Conference on Neural Networks (IJCNN)
DOI: 10.1109/ijcnn55064.2022.9892788

Resilience of Bayesian Layer-Wise Explanations under Adversarial Attacks

Cited by 5 publications (5 citation statements)
References 19 publications

“…Aside from the problem of gradient masking, the distinction between category I and II attacks is also useful because certain types of neural networks can be shown to be immune to gradient-based attacks. Carbone et al. [52], for example, proved that Bayesian neural networks are immune to gradient-based adversarial attacks in the infinite-data limit, because the expected gradient of the loss with respect to any sample from the data distribution is zero. This result also appears to hold approximately for Bayesian neural networks trained on finite data.…”
Section: Randomized Gradient-free Attack
confidence: 99%
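The vanishing-gradient argument in the statement above can be illustrated with a short sketch: an FGSM-style attack built from a Monte Carlo estimate of the expected input gradient over posterior weight samples. This is a hypothetical illustration, not the cited authors' code; `sample_posterior_model`, `loss_fn`, and the hyperparameters are placeholders.

```python
import torch

def expected_gradient_fgsm(x, y, sample_posterior_model, loss_fn,
                           n_samples=50, eps=0.03):
    """FGSM-style step using E_w[grad_x L(x, y; w)], estimated by averaging
    input gradients over posterior weight samples of a Bayesian network."""
    x = x.clone().detach().requires_grad_(True)
    grad_sum = torch.zeros_like(x)
    for _ in range(n_samples):
        model = sample_posterior_model()            # placeholder: draws one w ~ p(w | D)
        loss = loss_fn(model(x), y)
        grad_sum += torch.autograd.grad(loss, x)[0]
    avg_grad = grad_sum / n_samples                 # Monte Carlo estimate of the expected gradient
    return (x + eps * avg_grad.sign()).detach()     # step is ineffective if the expectation vanishes
```

In the infinite-data regime described in the quote, `avg_grad` concentrates around zero, so the resulting perturbation carries essentially no adversarial signal.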
“…Classification performance has not significantly improved despite the widespread use of de-noiser models to reduce adversarial noise [159]. For example, Carbone et al. demonstrate the stability of saliency-based explanations of neural network predictions under adversarial attacks in a classification task [161]. The authors implement a gradient-based XAI method using Bayesian neural networks, whose explanations are considerably more stable under adversarial perturbations of the inputs and even under direct attacks on the explanations themselves.…”
Section: Gradient-based Adversarial Explanation
confidence: 99%
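As a rough sketch of the kind of gradient-based explanation discussed above, a saliency map can be averaged over posterior samples of a Bayesian network. The helper `sample_posterior_model` and all other names are assumptions for illustration, not the method from [161].

```python
import torch

def bayesian_saliency(x, target_class, sample_posterior_model, n_samples=50):
    """Posterior-averaged gradient saliency: |E_w[d f_c(x; w) / dx]|."""
    x = x.clone().detach().requires_grad_(True)
    grad_sum = torch.zeros_like(x)
    for _ in range(n_samples):
        model = sample_posterior_model()             # placeholder: one posterior weight sample
        score = model(x)[:, target_class].sum()      # logit of the class being explained
        grad_sum += torch.autograd.grad(score, x)[0]
    return (grad_sum / n_samples).abs()              # averaging tends to smooth out fragile directions
```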
“…The goal of adversarial training is to incorporate the adversarial search within the training process and thus achieve robustness against adversarial examples at test time. In particular, Bayesian adversarial learning has recently been investigated and adopted in the computer vision domain to improve the robustness of models against adversarial examples (Liu and Wang 2016; Ye and Zhu 2018; Liu et al. 2019; Wicker et al. 2021; Carbone et al. 2020; Doan et al. 2022).…”
Section: Background and Related Work
confidence: 99%
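For context, here is a minimal sketch of the adversarial-training loop described above, with a PGD inner search; the model, data loader, and hyperparameters are illustrative placeholders rather than any cited implementation.

```python
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, eps=0.03, alpha=0.01, steps=10):
    """Gradient-ascent steps on the loss, projected back into the eps-ball around x."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = (x_adv + alpha * grad.sign()).detach()
        x_adv = x + (x_adv - x).clamp(-eps, eps)     # keep the perturbation inside the eps-ball
    return x_adv.detach()

def adversarial_training_epoch(model, loader, optimizer):
    """One epoch of adversarial training: the attack runs inside every training step."""
    for x, y in loader:
        x_adv = pgd_attack(model, x, y)              # inner adversarial search
        optimizer.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()  # minimize loss on worst-case inputs
        optimizer.step()
```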
“…To improve robustness against feature-space adversarial malware examples, and ultimately problem-space malware, we propose a Bayesian formulation for adversarially training a neural network that: i) captures the distribution of models to improve robustness (Liu and Wang 2016; Liu et al. 2019; Ye and Zhu 2018; Wicker et al. 2021; Carbone et al. 2020; Doan et al. 2022); and ii) provably bounds the difference between the adversarial risk and the conventional empirical risk, theoretically explaining the improved robustness of diversified Bayesian neural networks hardened with adversarial training.…”
Section: Introduction
confidence: 99%
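A crude way to picture the "distribution of models" idea in this formulation is an ensemble stand-in for posterior samples: adversarial examples are crafted against the averaged prediction and every member is hardened on them. This is a hypothetical sketch under those assumptions, not the authors' diversified Bayesian training procedure.

```python
import torch
import torch.nn.functional as F

def ensemble_logits(models, x):
    # average predictions over sampled models, a stand-in for the posterior predictive
    return torch.stack([m(x) for m in models]).mean(dim=0)

def bayesian_adversarial_step(models, optimizers, x, y, eps=0.03):
    """One training step: attack the averaged prediction, then train each member on the result."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(ensemble_logits(models, x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x_adv + eps * grad.sign()).detach()     # one-step attack on the ensemble
    for model, opt in zip(models, optimizers):       # harden every sampled model
        opt.zero_grad()
        F.cross_entropy(model(x_adv), y).backward()
        opt.step()
```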