Deep neural network image classifiers are known to be susceptible not only to adversarial examples crafted against them but also to those crafted against other networks. This phenomenon, known as transferability, poses a potential security risk for black-box systems that rely on image classifiers. One observation about networks that share adversarial examples is that their architectures tend to be similar: networks with high architectural similarity tend to exhibit high transferability as well. In this study, we therefore address the problem from a novel perspective by investigating the contribution of network architecture to transferability. Specifically, we propose an architecture search framework that employs neuroevolution to evolve network architectures and a gradient misalignment loss to encourage networks to converge to dissimilar functions after training. Our findings indicate that the proposed framework discovers architectures that reduce transferability from four standard networks, including ResNet and VGG, while maintaining good accuracy on unperturbed images. In addition, the evolved networks trained with gradient misalignment exhibit significantly lower transferability than a standard network trained with gradient misalignment, which indicates that network architecture plays an important role in reducing transferability. We demonstrate that designing or exploring suitable network architectures is a promising approach to tackling the transferability issue and training adversarially robust image classifiers.
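The abstract does not state the exact form of the gradient misalignment loss. As a minimal illustration of the general idea, the sketch below assumes the penalty is the cosine similarity between the input gradients of the network being trained and those of a reference network, added to the task loss. The names `cosine_similarity`, `misaligned_objective`, and `weight` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length gradient vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: a zero gradient carries no alignment signal.
        return 0.0
    return dot / (norm_u * norm_v)


def misaligned_objective(task_loss, grad_new, grad_ref, weight=1.0):
    """Hypothetical objective: task loss plus an alignment penalty.

    grad_new and grad_ref stand for the loss gradients with respect to the
    same input image, taken from the network being trained and from a
    reference network. Minimizing the sum pushes the two gradients apart.
    """
    return task_loss + weight * cosine_similarity(grad_new, grad_ref)


# Aligned gradients are penalized; orthogonal gradients are not.
aligned = misaligned_objective(0.5, [1.0, 0.0], [1.0, 0.0], weight=2.0)
orthogonal = misaligned_objective(0.5, [1.0, 0.0], [0.0, 1.0], weight=2.0)
```

Under this assumption, a network whose input gradients point the same way as the reference network's pays the full penalty, which is the intuition behind encouraging the two networks to learn dissimilar functions.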