Adversarial and Clean Data Are Not Twins

Gong, Zhongying; Wang, Wenlu; Ku, Wei‐Shinn

doi:10.48550/arxiv.1704.04960

Cited by 76 publications

(123 citation statements)

References 0 publications

Supporting

Mentioning

121

Contrasting

“…The generator produces the adversarial examples to deceive both the discriminator and classifier; the discriminator and classifier attempt to differentiate the adversaries from clean data and produce the correct labels respectively. Some adversary detector networks are proposed to detect the adversarial examples which can be well aligned with our method (Gong et al., 2017; Grosse et al., 2017). In these works, a pretrained network is augmented with a binary detector network.…”

Section: Related Work (mentioning)

confidence: 99%

Improving Model Robustness with Latent Distribution Locally and Globally

Zhuang, Zhang, Huang et al. 2021

Preprint

In this work, we consider model robustness of deep neural networks against adversarial attacks from a global manifold perspective. Leveraging both the local and global latent information, we propose a novel adversarial training method through robust optimization, and a tractable way to generate Latent Manifold Adversarial Examples (LMAEs) via an adversarial game between a discriminator and a classifier. The proposed adversarial training with latent distribution (ATLD) method defends against adversarial attacks by crafting LMAEs with the latent manifold in an unsupervised manner. ATLD preserves the local and global information of latent manifold and promises improved robustness against adversarial attacks. To verify the effectiveness of our proposed method, we conduct extensive experiments over different datasets (e.g., CIFAR-10, CIFAR-100, SVHN) with different adversarial attacks (e.g., PGD, CW), and show that our method substantially outperforms the state-of-the-art (e.g., Feature Scattering) in adversarial robustness by a large accuracy margin. The source codes are available at https://github.com/LitterQ/ATLD-pytorch.

“…A straightforward way towards adversarial example detection is to build a simple binary classifier separating the adversarial examples apart from the clean data (Gong et al, 2017). The advantage is that it serves as a preprocessing step without imposing any assumptions on the model it protects.…”

Section: Detection (mentioning)

confidence: 99%

A Review of Adversarial Attack and Defense for Classification Methods

Li, Cheng, Hsieh et al. 2021

Preprint

Despite the efficiency and scalability of machine learning systems, recent studies have demonstrated that many classification methods, especially deep neural networks (DNNs), are vulnerable to adversarial examples; i.e., examples that are carefully crafted to fool a well-trained classification model while being indistinguishable from natural data to human. This makes it potentially unsafe to apply DNNs or related methods in security-critical areas. Since this issue was first identified by Biggio et al. (2013) and Szegedy et al. (2014), much work has been done in this field, including the development of attack methods to generate adversarial examples and the construction of defense techniques to guard against such examples. This paper aims to introduce this topic and its latest developments to the statistical community, primarily focusing on the generation and guarding of adversarial examples. Computing codes (in python and R) used in the numerical experiments are publicly available for readers to explore the surveyed methods. It is the hope of the authors that this paper will encourage more statisticians to work on this important and exciting field of generating and defending against adversarial examples.
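
The binary-detector idea in the statement quoted above (a separate classifier that tells adversarial inputs from clean ones) can be illustrated with a toy sketch. Everything in it is an assumption made for illustration and is not taken from Gong et al. (2017) or from the citing papers: the synthetic Gaussian data, the logistic-regression stand-in for the protected network, the FGSM-style attack, the quadratic detector features, and all names and parameter values (`make_data`, `fgsm`, `eps`, and so on) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def sigmoid(z):
    # Clip logits so exp() cannot overflow and outputs never hit exactly 0 or 1.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30.0, 30.0)))


def train_logreg(X, y, lr=0.1, steps=2000):
    """Full-batch gradient descent for logistic regression; returns (w, b)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = sigmoid(X @ w + b)
        w -= lr * (X.T @ (p - y)) / len(y)
        b -= lr * np.mean(p - y)
    return w, b


def accuracy(X, y, w, b):
    return float(np.mean((sigmoid(X @ w + b) > 0.5) == (y == 1)))


def make_data(n, d=20, mu=0.3, sigma=0.1):
    """Toy two-class data (assumption): class means at -mu and +mu per coordinate."""
    y = rng.integers(0, 2, size=n).astype(float)
    X = (2.0 * y - 1.0)[:, None] * mu + sigma * rng.normal(size=(n, d))
    return X, y


def fgsm(X, y, w, b, eps):
    """One signed-gradient step on the logistic loss with respect to the input."""
    p = sigmoid(X @ w + b)
    grad = (p - y)[:, None] * w[None, :]  # d(loss)/dx for logistic regression
    return X + eps * np.sign(grad)


def quad_features(X):
    # Raw and squared coordinates, so a linear detector can use magnitudes.
    return np.hstack([X, X ** 2])


# 1. Train the protected classifier on clean data.
X_train, y_train = make_data(2000)
X_test, y_test = make_data(1000)
w, b = train_logreg(X_train, y_train)

# 2. Craft adversarial versions of the train and test sets.
eps = 0.5  # deliberately large for this toy problem
X_train_adv = fgsm(X_train, y_train, w, b, eps)
X_test_adv = fgsm(X_test, y_test, w, b, eps)
clean_acc = accuracy(X_test, y_test, w, b)
adv_acc = accuracy(X_test_adv, y_test, w, b)

# 3. Train a separate binary detector: label 0 = clean, 1 = adversarial.
F_train = quad_features(np.vstack([X_train, X_train_adv]))
d_train = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_train_adv))])
mean, std = F_train.mean(axis=0), F_train.std(axis=0)
wd, bd = train_logreg((F_train - mean) / std, d_train)

F_test = quad_features(np.vstack([X_test, X_test_adv]))
d_test = np.concatenate([np.zeros(len(X_test)), np.ones(len(X_test_adv))])
det_acc = accuracy((F_test - mean) / std, d_test, wd, bd)

print(f"classifier accuracy on clean inputs:       {clean_acc:.3f}")
print(f"classifier accuracy on adversarial inputs: {adv_acc:.3f}")
print(f"detector accuracy (clean vs adversarial):  {det_acc:.3f}")
```

The detector never touches the classifier's weights at inference time, which matches the "preprocessing step" framing in the quoted statement. In this toy setup it only works because perturbed points land at a different distance from the origin than clean ones; with `eps` equal to `2 * mu` the two distributions would coincide and a detector of this kind could not beat chance.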

“…Separate Classifier Or Statistical Tests. The earlier approaches to detect adversarial examples used a separately-trained classifier [14,17,30] or statistical properties [12,17,19]. However, many of these approaches were subsequently shown to be weak [1,5].…”

Section: Related Work (mentioning)

confidence: 99%

Two Souls in an Adversarial Image: Towards Universal Adversarial Example Detection using Multi-view Inconsistency

Kiani, Awan, Lan et al. 2021

Annual Computer Security Applications Conference

In the evasion attacks against deep neural networks (DNN), the attacker generates adversarial instances that are visually indistinguishable from benign samples and sends them to the target DNN to trigger misclassifications. In this paper, we propose a novel multiview adversarial image detector, namely Argos, based on a novel observation. That is, there exist two "souls" in an adversarial instance, i.e., the visually unchanged content, which corresponds to the true label, and the added invisible perturbation, which corresponds to the misclassified label. Such inconsistencies could be further amplified through an autoregressive generative approach that generates images with seed pixels selected from the original image, a selected label, and pixel distributions learned from the training data. The generated images (i.e., the "views") will deviate significantly from the original one if the label is adversarial, demonstrating inconsistencies that Argos expects to detect. To this end, Argos first amplifies the discrepancies between the visual content of an image and its misclassified label induced by the attack using a set of regeneration mechanisms and then identifies an image as adversarial if the reproduced views deviate to a preset degree. Our experimental results show that Argos significantly outperforms two representative adversarial detectors in both detection accuracy and robustness against six well-known adversarial attacks. Code is available at: https://github.com/sohaib730/Argos-Adversarial_Detection

CCS Concepts: Security and privacy; Computing methodologies → Artificial intelligence; Machine learning.
