2021
DOI: 10.48550/arxiv.2110.05626
Preprint
Parameterizing Activation Functions for Adversarial Robustness

Abstract: Deep neural networks are known to be vulnerable to adversarially perturbed inputs. A commonly used defense is adversarial training, whose performance is influenced by model capacity. While previous works have studied the impact of varying model width and depth on robustness, the impact of increasing capacity by using learnable parametric activation functions (PAFs) has not been studied. We study how using learnable PAFs can improve robustness in conjunction with adversarial training. We first ask the question:…
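The PAFs described in the abstract replace a fixed nonlinearity with one whose shape is trained alongside the network weights. As a hedged illustration only (the paper's exact parameterizations are not shown in this excerpt), a SiLU-style activation with a single learnable shape parameter α might look like this; α = 1 recovers the standard SiLU:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def parametric_silu(x, alpha=1.0):
    """SiLU-style PAF: x * sigmoid(alpha * x).

    alpha is a learnable shape parameter (an illustrative choice,
    not necessarily the paper's parameterization); alpha = 1 gives
    the standard SiLU, and large alpha approaches ReLU.
    """
    return x * sigmoid(alpha * x)

def dsilu_dalpha(x, alpha=1.0):
    """Gradient of the activation w.r.t. alpha, so alpha can be
    updated by backprop jointly with the weights during training."""
    s = sigmoid(alpha * x)
    return x * x * s * (1.0 - s)
```

During adversarial training, α would simply be another parameter receiving gradients from the robust loss, which is how a PAF adds capacity without widening or deepening the network.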

Cited by 3 publications (7 citation statements). References 15 publications.
“…Rebuffi et al [32] focus on data augmentation and study the performance of using generative models. There are also insightful works that focus on model architectures [18,41,3,26,30], batch normalization [43], and activation functions [44,11]. Distilling from adversarially trained models has also been widely studied [49,35,2].…”
Section: Adversarial Training (AT)
confidence: 99%
“…Recently, Dai et al [6] studied learnable parametric activation functions, and their proposed PSSiLU can significantly increase robustness when extra training samples are available. Shao et al [29] showed that vanilla ViTs [7] can learn more generalizable features and thus have superior robustness against adversarial perturbations.…”
Section: B. Robust Network Architectures
confidence: 99%
“…We evaluate AAA along with 8 defense baselines, including random noise defense (RND [33]), adversarial training (AT [36,59,49]), dynamic inference (DENT [19]), training randomness (PNI [51]), and ensemble (TRS [60]). Results of AT with extra data [22] are in Appendix A. AAA uses hyper-parameters α = 1, t = 6, β = 5, and an Adam optimizer [61] with learning rate 0.1, β₁ = 0.9, β₂ = 0.999, run for 100 optimization iterations.…”
Section: Setup
confidence: 99%
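The setup above fully specifies the optimizer: Adam with learning rate 0.1, β₁ = 0.9, β₂ = 0.999, run for 100 iterations. A minimal sketch of that loop is below; the toy quadratic objective is a stand-in I am assuming for illustration, since AAA's actual objective is not given in this excerpt:

```python
import numpy as np

def adam_minimize(grad_fn, theta0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, iters=100):
    """Plain Adam loop with the hyperparameters quoted in the setup.

    grad_fn returns the gradient of the objective at theta; any
    differentiable surrogate for the defense objective can be
    plugged in here.
    """
    theta = np.asarray(theta0, dtype=float)
    m = np.zeros_like(theta)  # first-moment (mean) estimate
    v = np.zeros_like(theta)  # second-moment (uncentered variance) estimate
    for t in range(1, iters + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Toy stand-in objective: minimize ||theta - target||^2.
target = np.array([1.0, -2.0])
theta = adam_minimize(lambda th: 2.0 * (th - target), np.zeros(2))
```

With these settings the per-step update is roughly lr in magnitude early on, so 100 iterations comfortably cover the toy problem's scale; on the real defense objective the same loop shape applies with the appropriate gradient.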
“…3 (left). AAA, indicated by the orange lines, changes the scores only slightly without hurting accuracy (dotted lines) compared to RND [33] and AT [36]. Through such small modifications, AAA not only improves calibration but also prevents SQAs by misleading them toward incorrect directions.…”
Section: Visual Illustration of AAA
confidence: 99%