2019
DOI: 10.48550/arxiv.1904.02405
Preprint

White-to-Black: Efficient Distillation of Black-Box Adversarial Attacks

Cited by 6 publications
(6 citation statements)
References 9 publications
“…Researchers have also studied adversarial attacks targeting Perspective API [8, 14–16]. However, as the API has iterated through versions, Perspective API has developed defensive strategies to thwart these machine-generated attacks.…”
Section: Toxic Speech Detection
confidence: 99%
“…Adversarial attack There have been a lot of works showing the vulnerability of NLP models under adversarial examples (Li et al, 2020c; Garg and Ramakrishnan, 2020; Zang et al, 2020), which are understandable by humans yet lead to significant model prediction drops. There are usually two types of attacks: (1) semantic equivalent replacement, which can be synthesized by replacing words based on vector similarity (Jin et al, 2020), WordNet synonyms (Zang et al, 2020), masked prediction from pretrained models (Li et al, 2020c; Garg and Ramakrishnan, 2020; Li et al, 2020d), etc. (2) noise injection, which can be synthesized by adding/deleting/swapping words (Li et al, 2019a; Gil et al, 2019), replacing words with phonetically or visually similar ones (Eger et al, 2019; Eger and Benz, 2020). For logographic languages like Chinese, the noise can be much more complex as it can be injected on both the glyph characters or romanized pinyins (Nuo et al, 2020).…”
Section: Related Work
confidence: 99%
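The noise-injection attacks described in the statement above (adding, deleting, or swapping words or characters) can be sketched as a minimal perturbation routine. This is an illustrative sketch only; the function names `swap_adjacent` and `noise_inject` are assumptions for this example and do not come from any of the cited papers:

```python
import random


def swap_adjacent(word, rng):
    """Swap two adjacent characters in a word (a typo-style perturbation)."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]


def noise_inject(sentence, seed=0):
    """Randomly keep, character-swap, or delete each word of a sentence."""
    rng = random.Random(seed)
    out = []
    for word in sentence.split():
        op = rng.choice(["keep", "swap", "delete"])
        if op == "swap":
            out.append(swap_adjacent(word, rng))
        elif op == "keep":
            out.append(word)
        # "delete" drops the word entirely
    return " ".join(out)
```

Real attacks of this family additionally query the victim model and keep only perturbations that flip or degrade its prediction; the sketch shows only the perturbation side.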
“…Knowledge distillation has also been used for adversarial attacks (Papernot et al, 2016b; Ross & Doshi-Velez, 2017; Gil et al, 2019; Goldblum et al, 2020), data security (Papernot et al, 2016a; Lopes et al, 2017), image processing (Li & Hoiem, 2017; Wang et al, 2017; Chen et al, 2018), natural language processing (Nakashole & Flauger, 2017; Mou et al, 2016; Hu et al, 2018; Freitag et al, 2017), and speech processing (Chebotar & Waters, 2016; Lu et al, 2017; Watanabe et al, 2017; Oord et al, 2018; Shen et al, 2018).…”
Section: A Extended Literature Review
confidence: 99%