2022
DOI: 10.1007/978-3-031-20065-6_18

Scaling Adversarial Training to Large Perturbation Bounds

Abstract: The vulnerability of Deep Neural Networks to Adversarial Attacks has fuelled research towards building robust models. While most Adversarial Training algorithms aim at defending attacks constrained within low magnitude Lp norm bounds, real-world adversaries are not limited by such constraints. In this work, we aim to achieve adversarial robustness within larger bounds, against perturbations that may be perceptible, but do not change human (or Oracle) prediction. The presence of images that flip Oracle predicti…
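The abstract refers to attacks constrained within an Lp norm bound. A minimal sketch of such an attack is PGD (projected gradient descent) inside an L-infinity ε-ball. This is not the paper's implementation; the logistic model below is a hypothetical stand-in chosen so the input gradient has a closed form:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_linf(x, y, w, eps=0.3, alpha=0.05, steps=20):
    """PGD attack on a toy logistic model p(y=+1|x) = sigmoid(w.x),
    with y in {-1, +1}, projected onto the L-inf eps-ball around x."""
    x_adv = x.copy()
    for _ in range(steps):
        margin = y * np.dot(w, x_adv)
        # Gradient of the logistic loss log(1 + exp(-margin)) w.r.t. x
        grad = -y * w * sigmoid(-margin)
        # Signed ascent step, then projection back into the eps-ball
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

The projection step is what enforces the low-magnitude constraint the abstract mentions; training within larger ε bounds, as the paper proposes, amounts to widening that ball while keeping the perturbation Oracle-invariant.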


Cited by 11 publications (7 citation statements)
References 19 publications
“…in which x̃ is the adversarial data generated via PGD within the ε-ball centered at x, and d(·, ·) : V × V → ℝ is a distance function, such as the Kullback-Leibler (KL) divergence, the Jensen-Shannon (JS) divergence (Addepalli et al., 2022), and the optimal transport (OT) distance (Zhang and Wang, 2019). We denote the RD on the unlabeled set X as L_RD(X; θ) = Σ_{x_i ∈ X} RD(x_i; θ).…”
Section: Representational Divergence (RD)
Mentioning confidence: 99%
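The distance functions named in the quoted statement can be illustrated with a small sketch. The helper names and the use of probability vectors as representations are assumptions for illustration, not the cited papers' code:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """KL(p || q) for probability vectors; eps-clipped for stability."""
    p, q = np.clip(p, eps, 1.0), np.clip(q, eps, 1.0)
    return float(np.sum(p * np.log(p / q)))

def js_div(p, q):
    """Jensen-Shannon divergence: symmetric, bounded above by log 2."""
    m = 0.5 * (p + q)
    return 0.5 * kl_div(p, m) + 0.5 * kl_div(q, m)

def rd_on_set(reprs_clean, reprs_adv, d=js_div):
    """L_RD(X; theta) = sum over x_i of d(repr(x_i), repr(x_i_adv))."""
    return sum(d(p, q) for p, q in zip(reprs_clean, reprs_adv))
```

The choice of d matters: KL is asymmetric and unbounded, while JS is symmetric and bounded, which is one reason different works in the quoted list pick different divergences for the representational term.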
“…This section provides the results of ACL with RCS using different distance functions, including the KL divergence, the JS divergence (Addepalli et al., 2022), and the OT distance (Zhang and Wang, 2019), for calculating the RD L_RD(·). All other training settings are kept exactly the same as in Section 4.1.…”
Section: B5 Efficient ACL via RCS with Various Distance Functions φ
Mentioning confidence: 99%
“…For the term ‖g(w_c^{(k)}, t = t̄)‖^{2−q} to become o(1) at the time of convergence, ((1 + o(1))/(2n)) t̄σ² − (q − 2)σ_0^{2−q}σ^{2−q} should be constant. Equating the L.H.S.…”
Section: Convergence Time of Noisy Patches
Mentioning confidence: 99%
“…Using this, we obtain a strong benchmark for ID generalization as shown in Table-1. However, as shown in prior works [1], the impact of augmentations in training is limited by the capacity of the network in being able to generalize well to the diverse augmented data distribution. Therefore, increasing the diversity of training data demands the use of larger model capacities to achieve optimal performance.…”
Section: Introduction
Mentioning confidence: 99%
“…Adversarial training serves as the foundation for various defense methods, including those employing strong data augmentation [31], auxiliary data for primary task robustness [32], and class-fairness considerations [33]. Despite the success of these adversarial defense methods, they mainly focus on the single-mode setting while ignoring the fact that real-world datasets usually have large intra-variations or multiple modes depending on data labeling.…”
Section: Introduction
Mentioning confidence: 99%