2022 IEEE Symposium on Security and Privacy (SP) 2022
DOI: 10.1109/sp46214.2022.9833649

Membership Inference Attacks From First Principles

Abstract: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under $20 USD, our attack extracts the entire projection matrix of OpenAI's ada and babbage language models. We thereby confirm, for the first time, that these black-box models have a hidden dimension …

Cited by 208 publications (213 citation statements)
References 38 publications
“…The conclusions are further analytically and empirically generalized to non-linear feature extractors. Then, we empirically validate that models trained on DC-synthesized data are robust to both vanilla loss-based MIA and the state-of-the-art likelihood-based MIA (Carlini et al, 2022). Finally, we study the visual privacy of DC-synthesized data in case of adversary's direct matching attack.…”
Section: Introduction (mentioning)
confidence: 87%
“…For LiRA (Carlini et al, 2022), we repeat the preparation of synthetic dataset $N_m$ times with different random seeds, and obtain $N_m$ shadow $T$, $S$ and $f_S$. We set $N_m = 256$ for DM and $N_m = 64$ for KIP because of its lower training efficiency.…”
Section: Methods (mentioning)
confidence: 99%