2021
DOI: 10.48550/arxiv.2110.05626
Preprint
Parameterizing Activation Functions for Adversarial Robustness

Abstract: Deep neural networks are known to be vulnerable to adversarially perturbed inputs. A commonly used defense is adversarial training, whose performance is influenced by model capacity. While previous works have studied the impact of varying model width and depth on robustness, the impact of increasing capacity by using learnable parametric activation functions (PAFs) has not been studied. We study how using learnable PAFs can improve robustness in conjunction with adversarial training. We first ask the question:…
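The PAFs described in the abstract replace a fixed nonlinearity with one whose shape is trained alongside the network weights. As a hedged illustration only (the paper's exact parameterizations are not shown in this excerpt), a SiLU-style activation with a single learnable shape parameter α might look like this; α = 1 recovers the standard SiLU:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def parametric_silu(x, alpha=1.0):
    """SiLU-style PAF: x * sigmoid(alpha * x).

    alpha is a learnable shape parameter (an illustrative choice,
    not necessarily the paper's parameterization); alpha = 1 gives
    the standard SiLU, and large alpha approaches ReLU.
    """
    return x * sigmoid(alpha * x)

def dsilu_dalpha(x, alpha=1.0):
    """Gradient of the activation w.r.t. alpha, so alpha can be
    updated by backprop jointly with the weights during training."""
    s = sigmoid(alpha * x)
    return x * x * s * (1.0 - s)
```

During adversarial training, α would simply be another parameter receiving gradients from the robust loss, which is how a PAF adds capacity without widening or deepening the network.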

Cited by 3 publications (7 citation statements). References 15 publications.
“…Rebuffi et al [32] focus on data augmentation and study the performance of using generative models. There are also insightful works that focus on model architectures [18,41,3,26,30], batch normalization [43], and activation functions [44,11]. Distilling from adversarially trained models has also been widely studied [49,35,2].…”
Section: Adversarial Training (AT)
confidence: 99%
“…Recently, Dai et al [6] studied learnable parametric activation functions, and their proposed PSSiLU can significantly increase robustness when extra training samples are available. Shao et al [29] showed that vanilla ViTs [7] can learn more generalizable features and thus have superior robustness against adversarial perturbations.…”
Section: B. Robust Network Architectures
confidence: 99%
“…We evaluate AAA along with 8 defense baselines, including random noise defense (RND [33]), adversarial training (AT [36,59,49]), dynamic inference (DENT [19]), training randomness (PNI [51]), and ensemble (TRS [60]). Results of AT with extra data [22] are in Appendix A. AAA uses hyper-parameters α = 1, t = 6, β = 5, and an Adam optimizer [61] with learning rate 0.1, β₁ = 0.9, β₂ = 0.999, run for 100 optimization iterations.…”
Section: Setup
confidence: 99%
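The setup above fully specifies the optimizer: Adam with learning rate 0.1, β₁ = 0.9, β₂ = 0.999, run for 100 iterations. A minimal sketch of that loop is below; the toy quadratic objective is a stand-in I am assuming for illustration, since AAA's actual objective is not given in this excerpt:

```python
import numpy as np

def adam_minimize(grad_fn, theta0, lr=0.1, beta1=0.9, beta2=0.999,
                  eps=1e-8, iters=100):
    """Plain Adam loop with the hyperparameters quoted in the setup.

    grad_fn returns the gradient of the objective at theta; any
    differentiable surrogate for the defense objective can be
    plugged in here.
    """
    theta = np.asarray(theta0, dtype=float)
    m = np.zeros_like(theta)  # first-moment (mean) estimate
    v = np.zeros_like(theta)  # second-moment (uncentered variance) estimate
    for t in range(1, iters + 1):
        g = grad_fn(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)  # bias-corrected moments
        v_hat = v / (1 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta

# Toy stand-in objective: minimize ||theta - target||^2.
target = np.array([1.0, -2.0])
theta = adam_minimize(lambda th: 2.0 * (th - target), np.zeros(2))
```

With these settings the per-step update is roughly lr in magnitude early on, so 100 iterations comfortably cover the toy problem's scale; on the real defense objective the same loop shape applies with the appropriate gradient.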
“…3 (left). AAA, indicated by the orange lines, changes the scores only slightly without hurting accuracy (dotted lines) compared to RND [33] and AT [36]. Through such small modifications, AAA not only improves calibration but also prevents SQAs by misleading them toward incorrect directions.…”
Section: Visual Illustration of AAA
confidence: 99%