2022 · Preprint · DOI: 10.48550/arxiv.2201.04845

Reconstructing Training Data with Informed Adversaries

Abstract: Given access to a machine learning model, can an adversary reconstruct the model's training data? This work studies this question from the lens of a powerful informed adversary who knows all the training data points except one. By instantiating concrete attacks, we show it is feasible to reconstruct the remaining data point in this stringent threat model. For convex models (e.g. logistic regression), reconstruction attacks are simple and can be derived in closed-form. For more general models (e.g. neural networks), …
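
To make the closed-form claim concrete, here is a minimal sketch of how an informed adversary could recover the one unknown training point from an L2-regularized logistic regression model trained to (near-)optimality. This is an illustrative reconstruction-from-stationarity attack, not the paper's exact implementation; the toy data, regularization strength, and training loop below are assumptions.

```python
# Minimal sketch of an informed-adversary reconstruction attack on a convex model
# (L2-regularized logistic regression). Assumes the model is trained to near-optimality;
# data and hyperparameters are toy values, not the paper's setup.
import numpy as np
from scipy.special import expit  # numerically stable sigmoid

rng = np.random.default_rng(0)

def train_logreg(X, y, lam, steps=200_000, lr=0.1):
    """Gradient descent on (1/n) * sum log-loss + (lam/2) * ||w||^2, no intercept."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        grad = X.T @ (expit(X @ w) - y) / n + lam * w
        w -= lr * grad
    return w

# Toy setup: the adversary knows X[:-1], y[:-1] and the trained weights w;
# the target of the attack is the held-out point (X[-1], y[-1]).
n, d, lam = 50, 5, 0.1
X = rng.normal(size=(n, d))
y = (rng.random(n) < 0.5).astype(float)
w = train_logreg(X, y, lam)

# First-order optimality: (1/n) * sum_i (sigmoid(w.x_i) - y_i) x_i + lam * w = 0,
# so the unknown point's gradient contribution is minus everything the adversary knows.
known = X[:-1].T @ (expit(X[:-1] @ w) - y[:-1])
r = -(n * lam * w + known)            # r = (sigmoid(w.x_n) - y_n) * x_n

# r is the target point scaled by the unknown scalar c = sigmoid(w.x_n) - y_n.
# Since x_n = r / c, c must satisfy c = sigmoid(w @ r / c) - y_n; solve it per label guess.
for y_n, (lo, hi) in [(0.0, (1e-6, 1 - 1e-6)), (1.0, (-1 + 1e-6, -1e-6))]:
    cs = np.linspace(lo, hi, 200_001)
    resid = np.abs(expit(w @ r / cs) - y_n - cs)
    c = cs[np.argmin(resid)]
    x_rec = r / c
    print(f"label guess {y_n:.0f}: reconstruction error {np.linalg.norm(x_rec - X[-1]):.2e}")
```

For the correct label hypothesis the fixed-point residual vanishes and the recovered point matches the held-out example up to numerical error; comparing the two hypotheses lets the adversary pick the self-consistent reconstruction.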

Cited by 3 publications (4 citation statements) · References 45 publications

“…Overall, we expect future work can provide Bayes security-style guarantees for complex ML training pipelines. For example, by exploiting our results on the Gaussian mechanism (Section VI), it may be possible to study the security of DP-SGD against common attacks such as membership inference [31], attribute inference [20], and reconstruction [4], [7]. This will enable bypassing bounds relating ε and the advantage [22], [40], by computing the advantage (or Bayes security) directly.…”
Section: Discussion
confidence: 99%
“…One of the most widely used vulnerability metrics is Bayes vulnerability [32], defined as $V(\pi) = \max_s \pi_s = 1 - G(\pi)$; it expresses the adversary's probability of guessing the secret correctly in one try. For the posterior version, it holds that $V(\pi, C) = 1 - R^*(\pi, C)$. The multiplicative risk leakage follows the same core idea: $G(\pi)$ can be thought of as a prior version of $R^*$: indeed, it holds that $R^*(\pi, C) = \sum_o p(o)\, G(\delta_o)$, where $\delta_o$ are the posteriors of the channel.…”
Section: E. Leakage Notions From Quantitative Information Flow
confidence: 99%
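
As a quick numerical illustration of these definitions (prior Bayes vulnerability, posterior vulnerability via a channel, and the complementary error $R^*$), here is a small sketch; the prior and channel matrix are made-up toy values.

```python
# Toy illustration of Bayes vulnerability, assuming a discrete secret with prior pi
# and a channel C where C[s, o] = P(observation o | secret s). Values are made up.
import numpy as np

pi = np.array([0.5, 0.3, 0.2])               # prior over 3 secrets
C = np.array([[0.7, 0.2, 0.1],               # rows: secrets, columns: observations
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6]])

# Prior Bayes vulnerability: probability of guessing the secret in one try, no observation.
V_prior = pi.max()                           # V(pi) = max_s pi_s = 1 - G(pi)
G_prior = 1.0 - V_prior

# Joint distribution over (secret, observation) and posterior Bayes vulnerability:
# V(pi, C) = sum_o max_s pi_s * C[s, o].
joint = pi[:, None] * C
V_post = joint.max(axis=0).sum()
R_star = 1.0 - V_post                        # V(pi, C) = 1 - R*(pi, C)

# Equivalently, R*(pi, C) = sum_o p(o) * G(delta_o), with delta_o the posterior given o.
p_o = joint.sum(axis=0)
posteriors = joint / p_o                     # column o is delta_o
R_star_alt = (p_o * (1.0 - posteriors.max(axis=0))).sum()

print(V_prior, V_post, R_star, R_star_alt)   # R_star and R_star_alt agree
```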
“…These works also recover "representative" images from different classes, rather than specific training examples. Recent work on reconstructing training images has used feature similarity (Haim et al., 2022) and pixel similarity (Balle et al., 2022). In each of these papers, "fuzzy" reconstructions are allowed by the evaluation metrics and, indeed, are common in their reconstructions.…”
Section: Discussion
confidence: 99%
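
To make the evaluation-metric point concrete, a typical check is to compare a candidate reconstruction against the training set under a pixel-level distance and report the nearest match. The sketch below uses plain nearest-neighbour mean squared error on toy arrays; it is an illustrative assumption, not the exact metric used in any of the cited papers.

```python
# Illustrative pixel-similarity evaluation: score a candidate reconstruction against a
# (known) training set by nearest-neighbour mean squared error. Toy data only.
import numpy as np

rng = np.random.default_rng(1)
train_images = rng.random((100, 32, 32, 3))                               # pretend training set
reconstruction = train_images[17] + 0.05 * rng.normal(size=(32, 32, 3))   # a "fuzzy" copy

# Per-image MSE between the reconstruction and every training image.
mse = ((train_images - reconstruction) ** 2).mean(axis=(1, 2, 3))
best = int(np.argmin(mse))
print(f"closest training image: index {best}, MSE {mse[best]:.4f}")
```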
“…For example, membership inference attacks (MIA) attempt to distinguish whether a sample was present in the training set given only the trained model (Shokri et al., 2017; Sablayrolles et al., 2019). Other attacks consider the more difficult problem of reconstructing entire training samples from a trained model, often using (batch) gradient information (Jeon et al., 2021; Balle et al., 2022). Since model updates $u_i$ are essentially just aggregated gradients, it is natural that FL updates may leak private information as well.…”
Section: Attacks and Empirical Privacy
confidence: 99%
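
As a concrete example of why gradient-based updates can leak training data, here is a minimal gradient-inversion sketch for a single example passing through a linear softmax classifier with a bias: the per-class bias gradient exposes the scaling factors, so the input can be read off the weight gradient in closed form. The model, data, and shapes are toy assumptions, not the setup of any cited work.

```python
# Minimal gradient-inversion sketch: recovering the input of a single training example
# from the gradient of a linear softmax classifier with bias. Toy setup; real FL updates
# aggregate many examples and use deeper models, which makes inversion harder.
import numpy as np

rng = np.random.default_rng(2)
num_classes, d = 10, 64
W = rng.normal(size=(num_classes, d)) * 0.01
b = np.zeros(num_classes)

x = rng.random(d)                         # the (secret) training input
y = 3                                     # its label

# Forward pass and cross-entropy gradients that a server would observe as an "update".
logits = W @ x + b
p = np.exp(logits - logits.max()); p /= p.sum()
e_y = np.eye(num_classes)[y]
grad_W = np.outer(p - e_y, x)             # dL/dW = (p - e_y) x^T
grad_b = p - e_y                          # dL/db = p - e_y

# Attack: the label is the class whose bias gradient is negative (p_y - 1 < 0),
# and each row of grad_W is x scaled by the matching entry of grad_b.
y_rec = int(np.argmin(grad_b))
i = int(np.argmax(np.abs(grad_b)))        # any index with a non-negligible entry
x_rec = grad_W[i] / grad_b[i]

print("label recovered:", y_rec == y)
print("input error:", np.linalg.norm(x_rec - x))
```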