2019
DOI: 10.1609/aaai.v33i01.3301742
AutoZOOM: Autoencoder-Based Zeroth Order Optimization Method for Attacking Black-Box Neural Networks

Abstract: Recent studies have shown that adversarial examples in state-of-the-art image classifiers trained by deep neural networks (DNN) can be easily generated when the target model is transparent to an attacker, known as the white-box setting. However, when attacking a deployed machine learning service, one can only acquire the input-output correspondences of the target model; this is the so-called black-box attack setting. The major drawback of existing black-box attacks is the need for excessive model queries, which…
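To make the black-box setting concrete, the sketch below wraps a classifier so that an attacker only sees output scores for queried inputs and every query is counted; the `predict_fn` service and the toy linear model are hypothetical stand-ins, not anything from the paper.

```python
import numpy as np

class BlackBoxModel:
    """Expose a classifier as input -> scores only, with a query counter.

    `predict_fn` is a hypothetical stand-in for a deployed service; the
    attacker gets no gradients or weights, so the number of queries is
    the main cost, as the abstract emphasizes.
    """

    def __init__(self, predict_fn):
        self.predict_fn = predict_fn
        self.query_count = 0

    def query(self, x):
        self.query_count += 1
        return self.predict_fn(x)  # class scores only


# Toy usage with a random linear "model" standing in for the remote service.
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 784))            # 10 classes, flattened 28x28 input
model = BlackBoxModel(lambda x: W @ x)

x = rng.normal(size=784)
scores = model.query(x)
print(scores.argmax(), model.query_count)  # predicted class, queries used so far
```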

Cited by 327 publications (280 citation statements). References 13 publications.
“…In addition to the aforementioned works, there are also other black-box attacks [20,21,22,23] under different practical settings, which have been explored very recently. Among those, the notable boundary method [20] implements a decision-based attack, which starts from a very large adversarial perturbation (thus causing an immediate misclassification) and tries to reduce the perturbation (i.e., minimize the distortion) through a random walk while remaining adversarial by staying on the boundary between the misclassified class and the true class.…”
Section: Other Black-box Attacks
confidence: 99%
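A minimal sketch of the random-walk idea behind the boundary method quoted above follows; it is a simplification, not the cited authors' exact procedure, and `query_label` is a hypothetical oracle returning only the predicted class.

```python
import numpy as np

def boundary_walk(query_label, x_orig, x_adv_init, true_label,
                  steps=1000, step_size=0.01, shrink=0.01, rng=None):
    """Decision-based random walk: start from a heavily perturbed,
    already-misclassified point and repeatedly propose small random moves
    plus a contraction toward the original image, keeping only proposals
    that remain misclassified (i.e., stay on the adversarial side of the
    decision boundary)."""
    rng = rng or np.random.default_rng(0)
    x_adv = x_adv_init.copy()
    for _ in range(steps):
        diff = x_orig - x_adv
        noise = rng.normal(size=x_adv.shape)
        # Scale the random step relative to the current distortion.
        noise *= step_size * np.linalg.norm(diff) / (np.linalg.norm(noise) + 1e-12)
        candidate = x_adv + noise + shrink * diff
        if query_label(candidate) != true_label:  # still adversarial -> accept
            x_adv = candidate
    return x_adv
```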
“…It should be noted that recent research on black-box attacks has largely focused on classifiers that provide confidence scores, which is an easier setting. Nevertheless, many of these methods also use random sampling [6,21,10], and the biases we propose could also benefit their approaches. As an aside, Ilyas et al [10] propose a variation of their attack that manages to apply gradient estimation to discrete labels.…”
Section: Sampling-based
confidence: 99%
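The "random sampling" mentioned in the quoted passage typically means estimating a gradient from scores by probing random directions. Below is a generic NES-style sketch with antithetic sampling; `loss_fn` is a hypothetical scalar loss built from the model's confidence scores, and this is not tied to any one cited paper's exact scheme.

```python
import numpy as np

def sampling_gradient_estimate(loss_fn, x, sigma=0.01, n_samples=50, rng=None):
    """Estimate the gradient of loss_fn at x from queries only.

    Each Gaussian direction u is probed at x + sigma*u and x - sigma*u,
    so the estimate costs 2 * n_samples queries.
    """
    rng = rng or np.random.default_rng(0)
    grad = np.zeros_like(x, dtype=float)
    for _ in range(n_samples):
        u = rng.normal(size=x.shape)
        grad += (loss_fn(x + sigma * u) - loss_fn(x - sigma * u)) * u
    return grad / (2.0 * sigma * n_samples)
```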
“…Finite-difference methods (FDM), which are also known as zeroth-order optimization methods, directly estimate gradients Ĝ_dnn'(x) for a target API' by making repeated queries around x [7,21,22,41] and recording minute differences in the returned values. The baseline assumption is that API' returns maximum information (API'_I).…”
Section: Adversarial Example Definitions
confidence: 99%
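For concreteness, here is a coordinate-wise finite-difference gradient estimate of the kind the quoted passage describes; `score_fn` is a hypothetical scalar returned by the target API in the maximum-information setting, and the sketch illustrates why full-gradient estimation over high-dimensional images is so query-hungry (two queries per coordinate).

```python
import numpy as np

def finite_difference_gradient(score_fn, x, h=1e-4, coords=None):
    """Symmetric finite-difference (zeroth-order) gradient estimate.

    Estimates d score_fn / d x_i as (f(x + h*e_i) - f(x - h*e_i)) / (2h)
    for each requested coordinate i, using only queries to score_fn.
    """
    flat_x = x.reshape(-1)
    coords = range(flat_x.size) if coords is None else coords
    grad = np.zeros_like(flat_x, dtype=float)
    for i in coords:
        e = np.zeros_like(flat_x)
        e[i] = h
        grad[i] = (score_fn((flat_x + e).reshape(x.shape))
                   - score_fn((flat_x - e).reshape(x.shape))) / (2 * h)
    return grad.reshape(x.shape)
```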