Deep learning is becoming increasingly important in daily life, appearing in prediction and classification applications such as self-driving vehicles, product recommendation, advertising and healthcare. If a deep learning model produces false predictions or misclassifications, it can therefore cause serious harm, which makes reliability a crucial issue. In addition, deep learning models are trained on large amounts of data that often contain sensitive information, so when these models are deployed in real-world applications, the privacy of the data they use must be protected. In this article, we present a brief review of threats to, and defenses for, the security of deep learning models and the privacy of the data used in such models, while maintaining their performance and accuracy. Finally, we discuss current challenges and future directions.
Purpose
In the digital age, organizations want to build more powerful machine learning models to serve people's growing needs. However, ensuring privacy and data security remains a key challenge for machine learning, especially in federated learning, where parties want to collaborate on building a better model without revealing their own data. This study aims to introduce threats to, and defenses against, privacy leakage in collaborative learning models.
Design/methodology/approach
In collaborative learning, the attacker can be either the central server or a participant. In this study, the attacker is an “honest but curious” participant. The attack experiments are therefore run on the participant’s side, which performs two tasks: training the collaborative learning model, and building a generative adversarial network (GAN) that uses the parameters received from the central server to infer additional information (a minimal sketch of this setup follows). Three typical attack settings are considered: white box, black box without auxiliary information, and black box with auxiliary information. The experimental environment is implemented with PyTorch on the Google Colab platform, running on a graphics processing unit (GPU), using the Labeled Faces in the Wild (LFW) and CIFAR-10 (Canadian Institute For Advanced Research) data sets.
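As an illustration of this setup, the following is a minimal, hypothetical PyTorch sketch (not the authors' code) of a curious participant that trains a generator against the global model received from the parameter server. The network sizes, target class and hyperparameters are illustrative assumptions only.

```python
# Hypothetical sketch of a GAN-based inference attack by an "honest but curious"
# participant in collaborative learning. The participant keeps training the shared
# classifier normally, while also updating a local generator so that its synthetic
# samples are classified as a class held by another (victim) participant.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM = 100
NUM_CLASSES = 10          # e.g. CIFAR-10
TARGET_CLASS = 3          # assumed class owned by the victim participant

class Generator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(LATENT_DIM, 512), nn.ReLU(),
            nn.Linear(512, 3 * 32 * 32), nn.Tanh(),
        )
    def forward(self, z):
        return self.net(z).view(-1, 3, 32, 32)

class SharedClassifier(nn.Module):
    """Stand-in for the collaboratively trained model (weights come from the server)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(3 * 32 * 32, 256), nn.ReLU(),
            nn.Linear(256, NUM_CLASSES),
        )
    def forward(self, x):
        return self.net(x)

generator = Generator()
global_model = SharedClassifier()   # in practice, loaded from the latest server update
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)

def attack_step(batch_size: int = 64) -> float:
    """One generator update: push synthetic samples toward the target class
    as judged by the latest global model, which acts as the discriminator."""
    z = torch.randn(batch_size, LATENT_DIM)
    fake = generator(z)
    logits = global_model(fake)
    target = torch.full((batch_size,), TARGET_CLASS, dtype=torch.long)
    loss = F.cross_entropy(logits, target)
    g_opt.zero_grad()
    loss.backward()
    g_opt.step()
    return loss.item()
```

In this sketch, the attack strength grows as the global model improves, because each new round of server parameters gives the generator a better "discriminator" for the victim's class.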
Findings
The paper assumes that the privacy leakage attack resides on the participant’s side and that the parameters exchanged through the central server reveal more knowledge than is strictly needed to train the collaborative machine learning model. This study compares how successfully information can be inferred from the model parameters using three GAN variants: conditional GAN, control GAN and Wasserstein GAN (WGAN). Of these three models, WGAN proved to be the most stable.
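For reference, the Wasserstein objective underlying WGAN's stability can be sketched as below. This is a generic illustration of the original weight-clipping formulation of WGAN, not the exact configuration used in the study; the clip value and function signatures are assumptions.

```python
# Generic WGAN objectives (weight-clipping variant). The critic approximates the
# Wasserstein distance between real and generated samples instead of a
# classification probability, which is what makes training more stable.
import torch

def critic_loss(critic, real, fake):
    # Critic maximizes D(real) - D(fake), i.e. minimizes the negated difference.
    return -(critic(real).mean() - critic(fake).mean())

def generator_loss(critic, fake):
    # Generator tries to raise the critic's score on synthetic samples.
    return -critic(fake).mean()

def clip_critic_weights(critic, clip_value: float = 0.01):
    # Weight clipping crudely enforces the 1-Lipschitz constraint on the critic.
    with torch.no_grad():
        for p in critic.parameters():
            p.clamp_(-clip_value, clip_value)
```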
Originality/value
Concerns about the privacy and security of machine learning models are increasingly important, especially for collaborative learning. The paper contributes an experimental study of a privacy attack carried out from the participant’s side in a collaborative learning model.