2022
DOI: 10.48550/arxiv.2201.12675
Preprint

Decepticons: Corrupted Transformers Breach Privacy in Federated Learning for Language Models

Abstract: A central tenet of Federated learning (FL), which trains models without centralizing user data, is privacy. However, previous work has shown that the gradient updates used in FL can leak user information. While the most industrial uses of FL are for text applications (e.g. keystroke prediction), nearly all attacks on FL privacy have focused on simple image classifiers. We propose a novel attack that reveals private user text by deploying malicious parameter vectors, and which succeeds even with mini-batches, m…
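The abstract's premise, that individual gradient updates can expose user data, can be illustrated with a well-known closed-form case: for a single sample passing through a fully connected layer with a bias, the input is recoverable exactly from the weight and bias gradients. The sketch below is only that background illustration, not the paper's malicious-parameter attack on transformer language models; the layer sizes and variable names are made up for the example.

```python
import torch
import torch.nn as nn

# Background illustration (not the Decepticons attack): for one sample x
# through a linear layer z = Wx + b, the gradients satisfy
#   dL/dW = (dL/dz) x^T   and   dL/db = dL/dz,
# so any row of dL/dW divided by the matching entry of dL/db recovers x.
torch.manual_seed(0)
layer = nn.Linear(8, 3)          # hypothetical toy dimensions
loss_fn = nn.CrossEntropyLoss()

x_private = torch.randn(1, 8)    # the client's private input
y_private = torch.tensor([1])
loss = loss_fn(layer(x_private), y_private)
grad_w, grad_b = torch.autograd.grad(loss, (layer.weight, layer.bias))

# Pick an output unit with a nonzero bias gradient and divide row-wise.
i = torch.argmax(grad_b.abs())
x_recovered = grad_w[i] / grad_b[i]

print(torch.allclose(x_recovered, x_private.squeeze(0), atol=1e-5))  # True
```

A single-sample update through such a layer therefore behaves essentially like plaintext; the paper's contribution, per the abstract, is extending recovery to user text passing through transformer updates, even when gradients are aggregated over mini-batches.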

Cited by 3 publications (3 citation statements) | References 23 publications
“…For example, an adversary that controls part of the training code can use the trained model as a side-channel to exfiltrate training data [3,61]. Or in federated learning, a malicious server can select model architectures that enable reconstructing training samples [9,20]. Alternatively, participants in decentralized learning protocols can boost privacy attacks by sending dynamic malicious updates [44,51,69].…”
Section: Attacks on Training Integrity (mentioning)
confidence: 99%
“…However, this comes at the cost of increased computational or communication costs between the clients and the server, and with increasing concerns about compromised user privacy. Privacy of user data is a growing concern, and standard federated averaging techniques are vulnerable to data leakage by inverting gradients into the data that generated them [1]-[5]. Gradients can be encrypted to preserve privacy, but incur further communication overhead [6].…”
Section: Introduction (mentioning)
confidence: 99%
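The gradient-inversion attacks this statement refers to are typically posed as an optimization problem: the attacker searches for dummy inputs and labels whose induced gradients match the update observed from a client. The following is a minimal sketch of that idea in the style of gradient-matching attacks, using a hypothetical toy linear classifier rather than code from any of the cited works.

```python
import torch
import torch.nn as nn

# Toy stand-in for the shared model; real attacks target much larger nets.
model = nn.Linear(32, 4)
criterion = nn.CrossEntropyLoss()

# --- Client side: one honest gradient update on private data ---
x_private = torch.randn(1, 32)
y_private = torch.tensor([2])
true_grads = torch.autograd.grad(criterion(model(x_private), y_private),
                                 model.parameters())
true_grads = [g.detach() for g in true_grads]

# --- Attacker side: optimize dummy data so its gradients match ---
x_dummy = torch.randn(1, 32, requires_grad=True)
y_dummy = torch.randn(1, 4, requires_grad=True)   # soft (relaxed) labels
optimizer = torch.optim.LBFGS([x_dummy, y_dummy])

def closure():
    optimizer.zero_grad()
    pred = torch.log_softmax(model(x_dummy), dim=-1)
    dummy_loss = torch.mean(
        torch.sum(-torch.softmax(y_dummy, dim=-1) * pred, dim=-1))
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(),
                                      create_graph=True)
    # Distance between the gradients the dummy data produces and the
    # gradients actually observed from the client.
    grad_diff = sum(((dg - tg) ** 2).sum()
                    for dg, tg in zip(dummy_grads, true_grads))
    grad_diff.backward()
    return grad_diff

for _ in range(30):
    optimizer.step(closure)

print("reconstruction error:", torch.norm(x_dummy.detach() - x_private).item())
```

The encrypted-gradient defense mentioned in the quote works precisely by denying the server plaintext access to the per-client update that plays the role of true_grads in this sketch.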
“…This is particularly the case when participants are allowed to deviate from the predefined ML protocol (in a malicious adversary setting). When training a federated learning model, each potentially malicious participant can send false data on purpose [21] to prevent learning of the global model [22][23][24]. Furthermore, in an iterative procedure, any participant could compare the last global model with the previous state.…”
Section: Introduction (mentioning)
confidence: 99%