2022
DOI: 10.1093/jamiaopen/ooac083

Validating a membership disclosure metric for synthetic health data

Abstract: Background: One of the increasingly accepted methods to evaluate the privacy of synthetic data is by measuring the risk of membership disclosure. This is a measure of the F1 accuracy that an adversary would correctly ascertain that a target individual from the same population as the real data is in the dataset used to train the generative model, and is commonly estimated using a data partitioning methodology with a 0.5 partitioning parameter. O…
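As a rough illustration of the F1 measure the abstract refers to, the sketch below scores an adversary's membership guesses against the true membership of each target record. It also shows one possible way to express that score relative to a naive adversary who guesses "member" for every target. The function names, the `member_fraction` parameter, and the normalization in `relative_f1` are assumptions made for this example; the text here does not confirm the exact definition used in the paper.

```python
def f1_score(true_members, guessed_members):
    """F1 of membership guesses; both arguments are parallel lists of booleans."""
    pairs = list(zip(true_members, guessed_members))
    tp = sum(1 for t, g in pairs if t and g)        # members correctly flagged
    fp = sum(1 for t, g in pairs if not t and g)    # non-members wrongly flagged
    fn = sum(1 for t, g in pairs if t and not g)    # members missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def naive_f1(member_fraction):
    """F1 of an adversary who flags every target as a member.

    Precision equals the fraction of targets that are true members
    (hypothetical parameter `member_fraction`), and recall equals 1.
    """
    return 2 * member_fraction / (1 + member_fraction)


def relative_f1(f1, member_fraction):
    """Assumed normalization: gain over the naive adversary, scaled to at most 1."""
    baseline = naive_f1(member_fraction)
    return (f1 - baseline) / (1 - baseline)


# Four target records, two of which were in the training data.
truth = [True, True, False, False]
guesses = [True, False, True, False]
score = f1_score(truth, guesses)
relative = relative_f1(score, member_fraction=0.5)
```

Under this assumed normalization, a value at or below zero means the adversary does no better than flagging everyone as a member, which is the sense in which a relative score can be compared against a fixed threshold.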

Cited by 17 publications (10 citation statements)
References 49 publications
“…It is a measure of the precision of the parameter estimate across runs. We would want this to be as small as possible Privacy The membership disclosure metric computed on the pooled datasets for that value of m [95]. The acceptable threshold for this relative F1 score metric is 0.2 [67, 94, 95]…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…Privacy risks were computed using a membership disclosure metric [95]. Membership disclosure evaluates the ability of an adversary to correctly determine if a target individual is in the original data that was used to train the generative model.…”
Section: Methods | Citation type: mentioning
confidence: 99%
“…The data authority that regulates the use of RWD in Finland (Findata) requires the use of k-anonymity (k = 5), and this was considered as the starting point for the selection of the anonymization framework. Given this premise, two candidate approaches were evaluated using membership inference attacks [33–35]. First, the ε-safe k-anonymization [34] that may offer slight improvement against membership inference attacks, and it also considers the differential privacy composability problem in the case of multiple data publications.…”
Section: Methods | Citation type: mentioning
confidence: 99%