Machine learning (ML) and deep learning (DL) are now ubiquitous in our society, and techniques that enable their responsible use are fundamental to safeguard people from being negatively affected. One notable example of DL's success is image classification. However, DL techniques function as black-box models whose knowledge representation is difficult to comprehend, and understanding the conditions under which they behave correctly is hard. Another example of a DL application is data generation, where Generative Adversarial Networks (GANs), mainly used for data augmentation, have achieved remarkable success. In a GAN, two networks, the generator and the discriminator, are trained simultaneously: the generator learns to produce realistic data by trying to fool the discriminator, which is trained to distinguish between real and fake samples.

This dissertation proposes a GAN-based approach for synthesizing new data to help understand DL image classifiers. We aim to generate examples that are hard for a given classifier and that we can, ultimately, systematically analyze to learn about the cases where the model's performance degrades. To that end, we generate data that the classifier labels with low confidence. Our approach, dubbed GASTeN, modifies the loss function of the generator to include a new objective, the confusion distance, which reflects how far the generated images are from the desired output of the target classifier, i.e., the one we wish to evaluate. It introduces two hyperparameters: a weight for the new loss term and the duration of the GAN's unmodified pre-training.

We empirically evaluate our proposal by instantiating it with a DCGAN architecture and a confusion distance suitable for binary classification. In our experiments, we target classifiers of binary subsets of the MNIST and Fashion MNIST datasets.
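The modified generator objective can be sketched as follows. This is a minimal illustrative sketch, not the dissertation's exact formulation: the confusion distance shown (mean distance of binary-classifier outputs from maximal uncertainty at 0.5) and the names `alpha` and `gasten_generator_loss` are assumptions made for illustration.

```python
def confusion_distance(probs):
    """Illustrative confusion distance for a binary classifier.

    An image classified with probability 0.5 is maximally ambiguous;
    the further the classifier's outputs are from 0.5, the more
    confidently it decides. A batch of maximally confusing images
    therefore has a confusion distance of 0.
    """
    return sum(abs(p - 0.5) for p in probs) / len(probs)


def gasten_generator_loss(adversarial_loss, classifier_probs, alpha):
    """Generator objective sketch: the standard GAN adversarial loss
    plus the confusion-distance term weighted by the hyperparameter
    alpha (one of the two hyperparameters introduced by the approach)."""
    return adversarial_loss + alpha * confusion_distance(classifier_probs)


# A batch the classifier labels confidently is penalized...
high_confidence = gasten_generator_loss(1.0, [0.01, 0.99], alpha=2.0)
# ...while a maximally ambiguous batch adds no extra loss.
low_confidence = gasten_generator_loss(1.0, [0.5, 0.5], alpha=2.0)
```

Setting `alpha = 0` recovers the unmodified GAN objective, which is how the other hyperparameter, the pre-training duration, fits in: the GAN is first trained without the new term before it is switched on.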
We explore several hyperparameter configurations and target classifiers with different performances, analyzing the algorithm's behavior by collecting quantitative metrics for the two optimization objectives: FID for image realism and the average confusion distance for the goal of confusing the classifier. Our experiments show scenarios in which we obtain a generator with the desired properties, producing highly realistic data that the target classifier mostly labels with low confidence, along with scenarios where this goal is not attained. We conclude that, while optimizing for both objectives simultaneously is challenging, it is possible to generate images with the desired properties, albeit at the cost of careful hyperparameter tuning.