Shared gradients are widely used to protect the private information of training data in distributed machine learning systems. However, research on Deep Leakage from Gradients (DLG) has shown that private training data can be recovered from shared gradients. The DLG method still suffers from several issues, such as the "exploding gradient" phenomenon, a low attack success rate, and low fidelity of the recovered data. In this study, a Wasserstein DLG method, named WDLG, is proposed. Theoretical analysis shows that, provided the output layer of the model has a bias term, the label of the data can be predicted from whether the gradient of that bias is negative; because this prediction is independent of how well the virtual gradient approximates the shared gradient, the label can be recovered with 100% accuracy. In the proposed method, the Wasserstein distance is used to measure the error between the shared gradient and the virtual gradient, which improves the stability of model training, eliminates the "exploding gradient" phenomenon, and improves the fidelity of the recovered data. Moreover, a large-learning-rate strategy is designed to further accelerate the convergence of model training. Finally, the WDLG method is validated on the MNIST, Fashion MNIST, SVHN, CIFAR-100, and LFW datasets. Experimental results show that the proposed WDLG method provides more stable updates of the virtual data, a higher attack success rate, faster model convergence, higher fidelity of the recovered images, and support for designing large-learning-rate strategies.
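
To make the two core ideas concrete, the following is a minimal PyTorch sketch, not the authors' implementation. The toy network, the learning rate, and the use of the 1-D Wasserstein-1 distance between sorted, flattened gradients are our illustrative assumptions (the abstract does not pin down the exact Wasserstein formulation); the bias-gradient label inference follows the stated analysis for a softmax cross-entropy output layer that has a bias term.

```python
# Minimal sketch of (a) analytic label recovery from the output-layer bias
# gradient and (b) Wasserstein-based gradient matching. Illustrative only;
# names, model, and hyperparameters are assumptions, not the paper's code.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy model whose output layer has a bias term (required by the claim).
net = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 256), nn.Sigmoid(),
                    nn.Linear(256, 10, bias=True))

# "Victim" side: compute the gradient that would be shared.
x_true = torch.rand(1, 1, 28, 28)
y_true = torch.tensor([3])
loss = F.cross_entropy(net(x_true), y_true)
shared_grads = torch.autograd.grad(loss, net.parameters())

def infer_label(shared_grads):
    """For softmax + cross-entropy with a single sample, d(loss)/d(bias_i)
    = p_i - y_i, so the output-layer bias gradient is negative only at the
    true class. This is analytic, independent of the gradient matching."""
    bias_grad = shared_grads[-1]       # assumes last parameter = output bias
    return bias_grad.argmin().view(1)  # the sole negative entry is the minimum

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance between two flattened tensors: mean
    absolute difference of sorted values. A simple stand-in for the
    paper's Wasserstein loss, which the abstract does not specify."""
    return (torch.sort(a.flatten()).values
            - torch.sort(b.flatten()).values).abs().mean()

# Attacker side: recover the label analytically, then optimize virtual data.
y_hat = infer_label(shared_grads)          # exact by construction: tensor([3])
x_dummy = torch.rand_like(x_true, requires_grad=True)
opt = torch.optim.Adam([x_dummy], lr=0.1)  # a comparatively large rate; the
                                           # paper's schedule is not given here

for step in range(300):
    opt.zero_grad()
    dummy_loss = F.cross_entropy(net(x_dummy), y_hat)
    dummy_grads = torch.autograd.grad(dummy_loss, net.parameters(),
                                      create_graph=True)
    # Match the virtual gradient to the shared gradient, layer by layer.
    d = sum(wasserstein_1d(dg, sg) for dg, sg in zip(dummy_grads, shared_grads))
    d.backward()
    opt.step()

print("recovered label:", y_hat.item(), "| final gradient distance:", d.item())
```

Because the label is read off the sign pattern of the shared bias gradient alone, it is exact regardless of how far `x_dummy` has converged, which is the independence property claimed above; only the image recovery depends on the Wasserstein gradient-matching loop.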