Following their success in natural language processing and understanding, transformers are expected to bring revolutionary changes to computer vision. This work provides the first comprehensive study of the robustness of vision transformers (ViTs) against adversarial perturbations. Evaluated under various white-box and transfer attack settings, ViTs are found to possess better adversarial robustness than convolutional neural networks (CNNs). We summarize the main observations contributing to the improved robustness of ViTs: 1) Features learned by ViTs contain less low-level information and are more generalizable, which contributes to superior robustness against adversarial perturbations. 2) Introducing convolutional or tokens-to-token blocks for learning low-level features in ViTs can improve classification accuracy but at the cost of adversarial robustness. 3) Increasing the proportion of transformer blocks in the model structure (when the model consists of both transformer and CNN blocks) leads to better robustness, but for a pure transformer model, simply increasing the size or adding layers does not guarantee a similar effect. 4) Pre-training on larger datasets does not significantly improve adversarial robustness, though it is critical for training ViTs. 5) Adversarial training is also applicable to ViTs for training robust models. Furthermore, feature visualization and frequency analysis are conducted for explanation. The results show that ViTs are less sensitive to high-frequency perturbations than CNNs, and that there is a high correlation between how well a model learns low-level features and its robustness against different frequency-based perturbations.
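To make the white-box evaluation setting concrete, the sketch below shows a standard PGD robustness check in PyTorch that could be used to compare a ViT with a CNN. This is a minimal illustration, not the paper's exact protocol: the timm model names, epsilon, step size, and step count are assumptions, and inputs are assumed to lie in [0, 1] with normalization folded into the model.

```python
# Minimal sketch of a white-box PGD robustness comparison (illustrative settings).
import torch
import torch.nn.functional as F
import timm  # assumed available for pretrained ViT/CNN backbones

def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Projected gradient descent within an L-infinity ball of radius eps."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        # Ascend the loss, then project back into the eps ball and valid pixel range.
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.max(torch.min(x_adv, x + eps), x - eps).clamp(0.0, 1.0)
    return x_adv.detach()

def robust_accuracy(model, loader, device="cuda"):
    """Accuracy of the model on PGD-perturbed inputs."""
    model.eval()
    correct, total = 0, 0
    for x, y in loader:
        x, y = x.to(device), y.to(device)
        x_adv = pgd_attack(model, x, y)
        with torch.no_grad():
            correct += (model(x_adv).argmax(1) == y).sum().item()
        total += y.numel()
    return correct / total

# Hypothetical usage (val_loader is assumed to yield normalized-free [0,1] images):
# vit = timm.create_model("vit_base_patch16_224", pretrained=True).cuda()
# cnn = timm.create_model("resnet50", pretrained=True).cuda()
# print(robust_accuracy(vit, val_loader), robust_accuracy(cnn, val_loader))
```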
CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) is a widely used technology for distinguishing real users from automated agents such as bots. However, advances in AI weaken many CAPTCHA tests and raise security concerns. In this paper, we propose a user-friendly text-based CAPTCHA generation method named Robust Text CAPTCHA (RTC). In the first stage, foregrounds and backgrounds are constructed from randomly sampled fonts and background images and synthesized into identifiable pseudo adversarial CAPTCHAs. In the second stage, we design and apply a highly transferable adversarial attack for text CAPTCHAs to better obstruct CAPTCHA solvers. Our experiments cover a comprehensive set of models, including shallow models such as KNN, SVM, and random forest, as well as various deep neural networks and OCR models. The experiments show that our CAPTCHAs generally have a failure rate lower than one in a million and offer high usability. They are also robust against various defensive techniques that attackers may employ, including adversarial training, data pre-processing, and manual tagging.
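The two-stage idea (synthesize a CAPTCHA, then add a transferable perturbation against a surrogate solver) can be sketched as follows. This is a simplified illustration, not RTC's actual pipeline: the font path and surrogate model are hypothetical, the label is reduced to a single class for brevity, and a one-step FGSM perturbation stands in for the paper's transferable attack.

```python
# Hedged sketch of a two-stage adversarial text CAPTCHA generator.
import random
import torch
import torch.nn.functional as F
from PIL import Image, ImageDraw, ImageFont
from torchvision.transforms.functional import to_tensor, to_pil_image

def render_captcha(text, font_path="arial.ttf", size=(160, 60)):
    """Stage 1: render a pseudo CAPTCHA with a random light background colour."""
    bg = tuple(random.randint(180, 255) for _ in range(3))
    img = Image.new("RGB", size, bg)
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, 36)  # font_path is an assumption
    draw.text((10, 10), text, fill=(0, 0, 0), font=font)
    return img

def perturb(img, surrogate, label, eps=8/255):
    """Stage 2: one FGSM step against a surrogate solver
    (a stand-in for the paper's transferable attack)."""
    x = to_tensor(img).unsqueeze(0).requires_grad_(True)
    loss = F.cross_entropy(surrogate(x), torch.tensor([label]))
    grad, = torch.autograd.grad(loss, x)
    x_adv = (x + eps * grad.sign()).clamp(0, 1).detach()
    return to_pil_image(x_adv.squeeze(0))

# Hypothetical usage with an assumed surrogate CNN and label mapping:
# text = "".join(random.choices("ABCDEFGHJKMNPQRSTUVWXYZ", k=5))
# captcha = perturb(render_captcha(text), surrogate_cnn, label_of(text))
```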
BACKGROUND Artificial neural networks have achieved unprecedented success in a wide variety of domains such as classifying, predicting, and recognizing objects. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to take control of their sensitive information during both training and use. OBJECTIVE To address this problem, we propose a privacy-preserving method for distributed systems. The proposed method, Stochastic Channel-Based Federated Learning (SCBF), enables participants to train a high-performance model cooperatively without sharing their inputs. METHODS Specifically, we design, implement, and evaluate a channel-based update algorithm for the central server in a distributed system. The update algorithm selects the channels corresponding to the most active features in a training loop and uploads them as the information learned from local datasets. A pruning process, which serves as a model accelerator, is applied to the algorithm based on the validation set. RESULTS We construct a distributed system consisting of 5 clients and 1 server. Our trials show that the Stochastic Channel-Based Federated Learning method can achieve an AUC-ROC of 0.9776 and an AUC-PR of 0.9695 with only 10% of channels shared with the server. Compared with the Federated Averaging algorithm, the proposed method achieves a 0.05388 higher AUC-ROC and a 0.09695 higher AUC-PR. In addition, our experiment shows that 57% of the time is saved by the pruning process with only a 0.0047 reduction in AUC-ROC and a 0.0068 reduction in AUC-PR. CONCLUSIONS In the experiment, our model presents better performance and a higher saturating speed than the Federated Averaging method, which reveals all the parameters of local models to the server. We also demonstrate that the saturation rate of performance can be promoted by introducing a pruning process, and further improvement can be achieved by tuning the pruning rate.
Background Artificial neural networks have achieved unprecedented success in the medical domain. This success depends on the availability of massive and representative datasets. However, data collection is often prevented by privacy concerns, and people want to take control of their sensitive information during both training and use. Objective To address security and privacy issues, we propose a privacy-preserving method for the analysis of distributed medical data. The proposed method, termed stochastic channel-based federated learning (SCBFL), enables participants to train a high-performance model cooperatively and in a distributed manner without sharing their inputs. Methods We designed, implemented, and evaluated a channel-based update algorithm for a central server in a distributed system. The update algorithm selects the channels corresponding to the most active features in a training loop and then uploads them as the information learned from local datasets. A pruning process, which serves as a model accelerator, was further applied to the algorithm based on the validation set. Results We constructed a distributed system consisting of 5 clients and 1 server. Our trials showed that the SCBFL method can achieve an area under the receiver operating characteristic curve (AUC-ROC) of 0.9776 and an area under the precision-recall curve (AUC-PR) of 0.9695 with only 10% of channels shared with the server. Compared with the federated averaging algorithm, the proposed SCBFL method achieved a 0.05388 higher AUC-ROC and a 0.09695 higher AUC-PR. In addition, our experiment showed that 57% of the time was saved by the pruning process with only a 0.0047 reduction in AUC-ROC performance and a 0.0068 reduction in AUC-PR performance. Conclusions In this experiment, our model demonstrated better performance and a higher saturating speed than the federated averaging method, which reveals all of the parameters of local models to the server. The saturation rate of performance could be promoted by introducing a pruning process, and further improvement could be achieved by tuning the pruning rate.
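The channel-based update described in both abstracts above can be illustrated with the following sketch: each client ranks channels by activity, uploads only the parameters of the top fraction, and the server averages whatever it receives per channel. The activity measure (mean absolute activation), the 10% share ratio, and the single-convolution-layer scope are assumptions made for illustration, not the authors' exact algorithm.

```python
# Hedged sketch of a channel-based federated update in the spirit of SCBF(L).
import torch

def select_active_channels(activations, share_ratio=0.1):
    """Rank channels by mean |activation| over a batch (NCHW) and keep the top fraction."""
    activity = activations.abs().mean(dim=(0, 2, 3))        # one score per channel
    k = max(1, int(share_ratio * activity.numel()))
    return torch.topk(activity, k).indices

def client_update(state_dict, layer_name, channel_idx):
    """Upload only the selected output channels of one conv layer."""
    weight = state_dict[layer_name]                          # (C_out, C_in, kH, kW)
    return {layer_name: (channel_idx, weight[channel_idx].clone())}

def server_aggregate(global_state, client_payloads, layer_name):
    """Average each shared channel over the clients that uploaded it."""
    weight = global_state[layer_name]
    sums = torch.zeros_like(weight)
    counts = torch.zeros(weight.shape[0], device=weight.device)
    for payload in client_payloads:
        idx, w = payload[layer_name]
        sums[idx] += w
        counts[idx] += 1
    shared = counts > 0
    weight[shared] = sums[shared] / counts[shared].view(-1, 1, 1, 1)
    return global_state
```

In this sketch, channels that no client uploads keep their previous global values, which mirrors the idea that only a small, informative subset of parameters leaves each client.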