2013 IEEE International Conference on Healthcare Informatics
DOI: 10.1109/ichi.2013.76

Perturbed Gibbs Samplers for Generating Large-Scale Privacy-Safe Synthetic Health Data

Cited by 21 publications
(24 citation statements)
References 12 publications
“…(2) Privacy risk evaluation for synthetic table T′ is also an independent research problem. This paper adopts two commonly-used metrics in the existing works [45,39,41], namely hitting rate and distance to the closest record (DCR). Intuitively, the metrics measure the likelihood that the original data records can be re-identified by an attacker.…”
Section: Problem Formalization (mentioning)
confidence: 99%
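The two metrics named in this statement can be illustrated with a minimal sketch. The definitions used here are assumptions, since the quoted passage does not spell them out: DCR is taken as the Euclidean distance from each synthetic record to its nearest original record, and hitting rate as the fraction of original records that some synthetic record matches on every attribute within a tolerance. The function names, the `tol` parameter, and the toy tables are hypothetical.

```python
import math

def distance_to_closest_record(synthetic, original):
    """For each synthetic row, the Euclidean distance to its nearest original row."""
    return [min(math.dist(s, o) for o in original) for s in synthetic]

def hitting_rate(synthetic, original, tol=0.0):
    """Fraction of original rows matched by at least one synthetic row.

    Assumed definition: a synthetic row 'hits' an original row when every
    attribute differs by at most `tol`.
    """
    def is_hit(o):
        return any(
            all(abs(a - b) <= tol for a, b in zip(s, o))
            for s in synthetic
        )
    return sum(is_hit(o) for o in original) / len(original)

# Toy numeric tables (hypothetical data, two attributes per record).
original = [(0.0, 0.0), (1.0, 1.0), (4.0, 5.0)]
synthetic = [(0.0, 0.0), (1.0, 2.0)]

dcr = distance_to_closest_record(synthetic, original)
rate = hitting_rate(synthetic, original)
print(dcr)   # one distance per synthetic record
print(rate)  # share of original records that were hit
```

Under these assumed definitions, a small DCR or a high hitting rate suggests that synthetic records sit close to real ones, which is the re-identification risk the statement describes.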
“…The statistical approach aims at modeling a joint multivariate distribution for a dataset and then generating fake data by sampling from the distribution. To effectively capture dependence between variates, existing works utilize copulas [35,46], Bayesian networks [62,63], Gibbs sampling [45] and Fourier decompositions [12]. Synopses-based approaches, such as wavelets and multi-dimensional sketches, build compact data summary for massive data [19,55], which can be then used for estimating joint distribution.…”
Section: Related Work For Data Synthesis (mentioning)
confidence: 99%
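The statistical approach described in this statement, modeling a joint distribution and then sampling fake records from it, can be sketched with a plain Gibbs sampler over two binary attributes. This is ordinary Gibbs sampling, not the perturbed variant proposed in the cited paper, and the joint table, function name, and parameters are hypothetical. The sketch assumes every entry of the joint table is strictly positive so the conditionals are well defined.

```python
import random

def gibbs_sample(joint, n_samples, burn_in=100, seed=0):
    """Draw (x, y) pairs from a joint table over two binary attributes.

    `joint` maps (x, y) with x, y in {0, 1} to a strictly positive probability.
    Each step resamples one attribute from its conditional given the other.
    """
    rng = random.Random(seed)
    x, y = 0, 0
    samples = []
    for step in range(burn_in + n_samples):
        # Resample x from p(x | y).
        p_x1 = joint[(1, y)] / (joint[(0, y)] + joint[(1, y)])
        x = 1 if rng.random() < p_x1 else 0
        # Resample y from p(y | x), using the freshly drawn x.
        p_y1 = joint[(x, 1)] / (joint[(x, 0)] + joint[(x, 1)])
        y = 1 if rng.random() < p_y1 else 0
        if step >= burn_in:
            samples.append((x, y))
    return samples

# Hypothetical joint distribution in which the two attributes usually agree.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
samples = gibbs_sample(joint, n_samples=5000)
agree = sum(x == y for x, y in samples) / len(samples)
print(agree)  # should be near 0.8, the probability mass on agreeing pairs
```

In practice the conditionals would be estimated from the original table rather than read from a known joint distribution; the sketch only shows how sampling attribute by attribute preserves the dependence between variates.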
“…As they have claimed, their proposed method synthesises artificial records while maintaining the statistical features of the original records to the maximum extent possible. Using different data mining and statistical analysis methods, they have concluded that the synthetic dataset delivers results that [are] largely similar to the original dataset [15].…”
Section: Synthesised Data In Real-world Applications (mentioning)
confidence: 99%
“…One approach is to generate synthetic individual participant data similar enough to the original trial data that analyses yield the same answers. Park and Ghosh developed an initial approach to managing privacy threats using a perturbed Gibbs sampler, a method that generates synthetic data with a quantifiable privacy risk (6). Goodfellow et al (7) developed a method entitled Generative Adversarial Networks (GANs) using neural networks to generate realistic data from complex distributions.…”
Section: Introduction (mentioning)
confidence: 99%