2018
DOI: 10.48550/arxiv.1802.04889
Preprint

Understanding Membership Inferences on Well-Generalized Learning Models

Abstract: Membership Inference Attack (MIA) determines the presence of a record in a machine learning model's training data by querying the model. Prior work has shown that the attack is feasible when the model is overfitted to its training data or when the adversary controls the training algorithm. However, when the model is not overfitted and the adversary does not control the training algorithm, the threat is not well understood. In this paper, we report a study that discovers overfitting to be a sufficient but not a…
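For reference, the baseline black-box attack that the excerpts below build on simply thresholds the target model's per-record loss: training members tend to have unusually low loss. The following is a minimal sketch under that assumption; the function names and the threshold choice are illustrative and are not the procedure developed in this paper.

```python
import numpy as np

def cross_entropy_loss(probs, label, eps=1e-12):
    """Per-example cross-entropy loss from the model's predicted class probabilities."""
    return -np.log(probs[label] + eps)

def loss_threshold_mia(predict_proba, record, label, threshold):
    """Guess 'member' when the target model's loss on (record, label) is below a threshold.

    predict_proba: black-box query returning a probability vector for `record`.
    threshold: assumed known to the adversary, e.g. an estimate of the average training loss.
    """
    loss = cross_entropy_loss(predict_proba(record), label)
    return loss < threshold
```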

Cited by 62 publications (75 citation statements).
References 32 publications (44 reference statements).
“…We argue that a simple modification to the score, which we call difficulty calibration, can drastically improve the attack's reliability. This approach has been applied to the loss score for high-precision attack against well-generalized models (Long et al, 2018). Let s(h, (x, y)) be the membership score, where higher score indicates a stronger signal that the sample (x, y) is a member.…”
Section: Difficulty Calibration (mentioning, confidence: 99%)
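To make the calibrated score concrete, here is a minimal sketch assuming the membership score s(h, (x, y)) is the negative per-example loss and that the adversary holds reference models trained on similar data; the `predict_proba` interface and the function names are hypothetical, not taken from the cited works.

```python
import numpy as np

def membership_score(model, x, y, eps=1e-12):
    """Higher score = stronger membership signal; here the negative cross-entropy loss."""
    probs = model.predict_proba(x)        # black-box query to the model
    return float(np.log(probs[y] + eps))  # log-likelihood of the true label = -loss

def calibrated_score(target_model, reference_models, x, y):
    """Difficulty calibration: subtract the sample's average score under reference
    models, so intrinsically 'easy' samples are not mistaken for training members."""
    raw = membership_score(target_model, x, y)
    baseline = np.mean([membership_score(m, x, y) for m in reference_models])
    return raw - baseline
```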
“…Various forms of difficulty calibration have been considered in the context of high-precision attacks. Long et al (2018) selected samples that differ the most in loss between the target and a set of reference models, and showed that the resulting attack has high precision even for well-generalized target models. Carlini et al (2020) showed that such privacy attacks are also possible on large-scale language models such as GPT-2 (Radford et al, 2019).…”
Section: Introduction (mentioning, confidence: 99%)
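As a rough sketch of that selection rule: the array-based interface and the fixed cutoff k below are simplifying assumptions, and the vulnerable-record identification in Long et al. (2018) is more elaborate than this.

```python
import numpy as np

def select_likely_members(target_losses, reference_losses, k=10):
    """Report as members the k candidate records whose target-model loss falls most
    below their average loss under reference models (largest calibration gap).

    target_losses:    shape (n,)    per-record loss under the target model
    reference_losses: shape (n, r)  per-record losses under r reference models
    """
    gap = reference_losses.mean(axis=1) - target_losses  # large gap => likely member
    return np.argsort(gap)[::-1][:k]                      # indices of the k largest gaps
```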
“…A substantial body of literature followed this work, extending the attacks to different settings such as white-box analysis [Nasr et al, 2019, Sablayrolles et al, 2019, Leino and Fredrikson, 2020], label-only access [Li and Zhang, 2020, Choquette-Choo et al, 2021], federated learning [Nasr et al, 2019], transfer learning [Zou et al, 2020], and different types of data and models, such as aggregate location data [Pyrgelis et al, 2017], generative models [Hayes et al, 2019], language models [Carlini et al, 2019, Carlini et al, 2020], sentence embeddings [Song and Raghunathan, 2020], and speech recognition models [Shah et al, 2021]. Multiple works have looked at improving the attack methodology through a more fine-grained analysis or by reducing the background knowledge and the compute power required to execute the attack [Long et al, 2018, Song and Mittal, 2021, Salem et al, 2018]. All these works follow the same attack framework for membership inference, but they either exploit a slightly different signal that is correlated with membership of a point in the training set or find an efficient way to exploit the already known signals.…”
Section: Related Work (mentioning, confidence: 99%)
“…4.1 Attack S: MIA via Shadow models. Starting from Shokri et al. [2017], a substantial body of literature Nasr et al. [2019], Sablayrolles et al. [2019], Leino and Fredrikson [2020], Long et al. [2018], Song and Mittal [2021], Salem et al. …”
(mentioning, confidence: 99%)
“…They formalize the attack as a binary classification task and utilize a neural network (NN) model along with the shadow training technique to distinguish members of the training set from non-members. Following this work, many MIAs have been proposed, which can be divided into two categories: direct attacks [32,33,46,47,54,55], which directly query the target sample and typically utilize only a single query; and indirect attacks [8,28,29], which query for samples in the neighborhood of the target sample to infer membership, and typically utilize multiple queries. The research community has further extended the MIA to federated settings [31,33] and generative models [14].…”
Section: Introduction (mentioning, confidence: 99%)
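A compact sketch of that shadow-training pipeline, assuming scikit-learn-style models and hypothetical helper names; the original Shokri et al. (2017) attack trains one neural-network attack model per class label, which is simplified to a single logistic regression here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def build_attack_dataset(shadow_models, shadow_splits):
    """Turn shadow models into labelled training data for the attack classifier.

    shadow_splits: one (train_records, out_records) pair per shadow model, where each
    record is (x, class_label). A shadow model's own training records are labelled
    1 (member); its held-out records are labelled 0 (non-member).
    """
    feats, labels = [], []
    for model, (train_records, out_records) in zip(shadow_models, shadow_splits):
        for records, is_member in ((train_records, 1), (out_records, 0)):
            for x, _class_label in records:
                # Sorted confidence vector of the shadow model as the attack feature.
                feats.append(np.sort(model.predict_proba([x])[0])[::-1])
                labels.append(is_member)
    return np.array(feats), np.array(labels)

def train_attack_model(shadow_models, shadow_splits):
    """Binary membership classifier, later applied to the target model's outputs."""
    X, y = build_attack_dataset(shadow_models, shadow_splits)
    return LogisticRegression(max_iter=1000).fit(X, y)
```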