2021
DOI: 10.1609/aaai.v35i10.17030

Improving Adversarial Robustness via Probabilistically Compact Loss with Logit Constraints

Abstract: Convolutional neural networks (CNNs) have achieved state-of-the-art performance on various tasks in computer vision. However, recent studies demonstrate that these models are vulnerable to carefully crafted adversarial samples and suffer from a significant performance drop when predicting them. Many methods have been proposed to improve adversarial robustness (e.g., adversarial training and new loss functions to learn adversarially robust feature representations). Here we offer a unique insight into the predic…

Cited by 9 publications (5 citation statements)
References 33 publications
“…Defensive Detectors. There are a variety of ways to obtain a defense model, including input transformation (Guo et al., 2017), adversarial training (Pang et al., 2020), and improved loss functions (Li et al., 2021). Note that since we want to compare different attack methods on a defense model, using adversarial training is not a reasonable choice.…”
Section: Additional Study
confidence: 99%
“…Then, we adopt two methods that are more commonly used in adversarial defense. We trained two robust detectors using Faster R-CNN: one with JPEG input manipulation (Guo et al., 2017) and one with an improved loss function, the Probabilistically Compact Loss (PC Loss) (Li et al., 2021). JPEG input compression neutralizes the influence of adversarial noise, while using PC Loss instead of the cross-entropy loss mainly enlarges the gaps between classification probabilities and thereby strengthens robustness.…”
Section: Additional Study
confidence: 99%
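To make the mechanism in the excerpt above concrete, here is a minimal sketch of a PC-style loss, assuming the hinge-with-margin form over predicted probabilities described in Li et al. (2021); the function name pc_loss and the margin value xi are illustrative, not taken from this report.

```python
import torch
import torch.nn.functional as F

def pc_loss(logits: torch.Tensor, targets: torch.Tensor, xi: float = 0.1) -> torch.Tensor:
    """Sketch of a Probabilistically Compact (PC) style loss.

    For each sample, every wrong class whose predicted probability comes
    within a margin `xi` of the true-class probability incurs a hinge
    penalty, pushing the probability gaps apart.
    """
    probs = F.softmax(logits, dim=1)                        # (N, C) class probabilities
    true_prob = probs.gather(1, targets.unsqueeze(1))       # (N, 1) true-class probability
    margins = torch.clamp(probs + xi - true_prob, min=0.0)  # hinge on each probability gap
    # Zero out the true-class column so it is not penalized against itself.
    mask = F.one_hot(targets, num_classes=probs.size(1)).bool()
    margins = margins.masked_fill(mask, 0.0)
    return margins.sum(dim=1).mean()
```

Compared with cross-entropy, which only pushes the true-class probability up, this hinge form explicitly penalizes any runner-up class that comes within xi of the true class, which is the “enlarge the gaps” effect the excerpt refers to.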
“…Given that the angles between the feature vector and the weight vectors contain abundant discriminative information [10,16,17] and that adversarial attacks perturb these angles, we propose a regularization term that directly encourages weight-feature compactness; more specifically, it minimizes the angle between the adversarial feature vector and the weight vector corresponding to the ground-truth label y. In addition, prior work [18] has argued for strong connections between adversarial robustness and inter-class separability. We therefore propose an additional angular regularization term that improves inter-class separability.…”
Section: Proposed Methods
confidence: 99%
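A minimal sketch of the first regularizer described in the excerpt above, assuming it is implemented as one minus the cosine similarity between each adversarial feature and the classifier weight row of its ground-truth class; the function name and tensor shapes are illustrative, since the citing paper's exact formulation is not shown in this report.

```python
import torch
import torch.nn.functional as F

def weight_feature_angle_reg(features: torch.Tensor,
                             classifier_weight: torch.Tensor,
                             targets: torch.Tensor) -> torch.Tensor:
    """Angular regularizer sketch: shrink the angle between each
    (adversarial) feature vector and the weight vector of its
    ground-truth class.

    features:          (N, D) penultimate-layer feature vectors
    classifier_weight: (C, D) weight matrix of the final linear layer
    targets:           (N,)   ground-truth class indices
    """
    w_y = classifier_weight[targets]                 # (N, D) ground-truth weight rows
    cos = F.cosine_similarity(features, w_y, dim=1)  # (N,) cosine of the angle
    return (1.0 - cos).mean()                        # 0 when perfectly aligned
```

Minimizing 1 − cos θ is a standard smooth surrogate for minimizing the angle itself; the inter-class separability term mentioned in the excerpt would analogously push apart the angles between different classes' weight vectors.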
“…Ioffe and Szegedy (2015) propose batch normalization (BN) to reduce the internal covariate shift caused by SGD. For image classification, data-augmentation types of regularization have also been developed (DeVries and Taylor 2017; Gastaldi 2017; Li et al. 2021). Different from those approaches, our proposed ITRA is motivated by the perspective of an exact gradient update for each mini-batch in SGD training, and achieves regularization by encouraging the alignment of feature representations across different mini-batches.…”
Section: Related Work
confidence: 99%