“…Many variants of ADV have been developed, differing in, among other things: (1) the choice of adversarial examples, e.g., worst-case examples (I. J. Goodfellow, Shlens, and Szegedy, 2015) or most divergent examples (Hongyang Zhang et al., 2019); (2) the search for adversarial examples, e.g., non-iterative FGSM, Rand FGSM with a random initial point, or PGD with multiple iterative gradient descent steps (Madry et al., 2018; Shafahi et al., 2019); (3) additional regularization, e.g., adding constraints in the latent space (Haichao Zhang and Wang, 2019; Bui et al., 2020); and (4) model architecture, e.g., the activation function (Xie et al., 2020) or ensemble models (Pang, Xu, et al., 2019).…”
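
To make the distinction in item (2) concrete, the following is a minimal sketch of the three search strategies named there, written for a toy logistic-regression model rather than a deep network. Everything in it is an illustrative assumption and not taken from the cited papers: the model, the L-infinity threat model, the function names (`loss_and_grad`, `fgsm`, `rand_fgsm`, `pgd`), the step size `alpha`, and the exact form of the random start.

```python
import math
import random


def loss_and_grad(w, b, x, y):
    """Logistic loss for a label y in {0, 1} and its gradient with respect to the input x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad_x = [(p - y) * wi for wi in w]
    return loss, grad_x


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def project(x_adv, x, eps):
    """Clip x_adv back into the L-infinity ball of radius eps around x."""
    return [min(max(a, xi - eps), xi + eps) for a, xi in zip(x_adv, x)]


def fgsm(w, b, x, y, eps):
    """Non-iterative search: one signed-gradient step of size eps from x."""
    _, g = loss_and_grad(w, b, x, y)
    return [xi + eps * sign(gi) for xi, gi in zip(x, g)]


def rand_fgsm(w, b, x, y, eps, alpha, rng):
    """Random initial point inside the ball, then a single signed-gradient step (assumed form)."""
    x0 = [xi + rng.uniform(-eps, eps) for xi in x]
    _, g = loss_and_grad(w, b, x0, y)
    return project([a + alpha * sign(gi) for a, gi in zip(x0, g)], x, eps)


def pgd(w, b, x, y, eps, alpha, steps, rng):
    """Iterative search: repeated signed-gradient steps, each followed by a projection."""
    x_adv = [xi + rng.uniform(-eps, eps) for xi in x]
    for _ in range(steps):
        _, g = loss_and_grad(w, b, x_adv, y)
        x_adv = project([a + alpha * sign(gi) for a, gi in zip(x_adv, g)], x, eps)
    return x_adv


# Hypothetical toy model and input, chosen only for illustration.
w, b = [1.0, -2.0], 0.1
x, y, eps = [0.5, 0.3], 1, 0.1
rng = random.Random(0)

candidates = {
    "clean": x,
    "FGSM": fgsm(w, b, x, y, eps),
    "Rand FGSM": rand_fgsm(w, b, x, y, eps, 0.05, rng),
    "PGD": pgd(w, b, x, y, eps, 0.05, 10, rng),
}
for name, point in candidates.items():
    print(name, loss_and_grad(w, b, point, y)[0])
```

Because this toy model is linear in the input, the sign of the gradient never changes, so FGSM and PGD arrive at the same corner of the perturbation ball. On a nonlinear network the gradient changes along the path, which is where the extra cost of the iterative steps is expected to pay off.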