Despite their excellent classification performance, Convolutional Neural Networks (CNNs) have recently been shown to be readily deceived by small adversarial perturbations. The imperceptibility of these perturbations to human eyes, together with their transferability from one model to another, poses a real threat to the security of CNN-based systems. In this paper, we propose to create multiple independent random binary codes per input class and to train an ensemble of homogeneous CNN classifiers with these codes to improve the adversarial robustness of the networks. The proposed ensemble consists of replicas of the same learning architecture, but each network is trained with different random target outputs. All member networks are trained simultaneously, each with its own unique binary codes, and optimized through a single common objective function in an end-to-end manner. Experimental results demonstrate that assigning different encoded labels to each classifier in the ensemble increases diversity and ultimately improves classification performance under adversarial attacks. We also conduct several performance analyses to understand how different aspects contribute to the robustness of the proposed algorithm. The proposed algorithm provides significantly improved classification accuracy compared to recent relevant studies, as verified with various network architectures, datasets, and adversarial attacks.
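To make the output-encoding idea concrete, the following minimal sketch illustrates how each ensemble member could be assigned its own independent random binary codebook and how predictions could be aggregated by Hamming-distance decoding. The class count, code length, ensemble size, and decoding rule here are illustrative assumptions, not the paper's exact configuration; the trained networks themselves are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_CLASSES = 10   # hypothetical number of classes (e.g. CIFAR-10)
CODE_LEN = 32      # hypothetical binary code length per class
NUM_MEMBERS = 4    # hypothetical ensemble size

# Each ensemble member gets its own independent random binary codebook:
# one length-CODE_LEN code per class, with entries in {0, 1}.
codebooks = rng.integers(0, 2, size=(NUM_MEMBERS, NUM_CLASSES, CODE_LEN))

def decode(outputs):
    """Aggregate the ensemble decision: threshold each member's real-valued
    output, compute its Hamming distance to every class code in that
    member's own codebook, and sum the distances across members."""
    total = np.zeros(NUM_CLASSES)
    for m, out in enumerate(outputs):
        bits = (np.asarray(out) > 0.5).astype(int)    # threshold sigmoid-like outputs
        total += np.abs(codebooks[m] - bits).sum(axis=1)  # per-class Hamming distance
    return int(np.argmin(total))  # class whose codes are jointly closest

# Toy check: if every member outputs exactly its code for class 3,
# the ensemble decodes class 3.
outputs = [codebooks[m, 3].astype(float) for m in range(NUM_MEMBERS)]
predicted = decode(outputs)
```

Because each member decodes against a different codebook, an adversarial perturbation that flips bits for one member is unlikely to push all members toward the same wrong class, which is the intuition behind the diversity gain described above.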
INDEX TERMS Deep learning, convolutional neural network, adversarial attack, image classification, output encoding, ensemble.