2015
DOI: 10.1016/j.datak.2015.07.009

Revisiting distance-based record linkage for privacy-preserving release of statistical datasets

Abstract: Statistical Disclosure Control (SDC, for short) studies the problem of privacy-preserving data publishing in cases where the data is expected to be used for statistical analysis. An original dataset T containing sensitive information is transformed into a sanitized version T' which is released to the public. Both utility and privacy aspects are very important in this setting. For utility, T' must allow data miners or statisticians to obtain similar results to those which would have been obtained from the origina…

Cited by 12 publications (4 citation statements)
References 44 publications
“…The concepts of linkability and inference attacks have extensively been studied both on synthetically generated datasets [56] and datasets that were anonymized with other approaches [25,37,59]. While many practical inference attacks exist, e.g.…”
Section: Related Work (mentioning)
confidence: 99%
“…Hence, the dyadic product can be replaced with some other advanced concepts for further improvements. (iii) In [3], the selection of applicable trusted infrastructure, service provider, and algorithms is still inadequate to satisfy user confidentiality requirements. (iv) The security method in [4] can be extended by including some effective sensitive attributes from the anonymous data in order to improve security.…”
Section: Challenges (mentioning)
confidence: 99%
“…The perturbation approach produces some changes to input data, whereas the generalization approach replaces the original elements with less accurate elements, and synthetic data generators generate the synthetic data similar to the original data [3,4]. Moreover, other protection methods employed to ensure the secrecy of information are data sanitation, blocking, cryptography, and anonymization.…”
Section: Introduction (mentioning)
confidence: 99%
“…(Harron et al, 2015) A range of sophisticated linkage algorithms exist. For example, EM algorithms using matching weights, (Belin and Rubin, 1995) distance-based algorithms, (Herranz et al, 2015) and prior-informed imputation, (Harron et al, 2014) have all been investigated. Others have explored machine learning methods, which have potential for very large linkage problems (Elfeky et al, 2003) and further guidance on the value of these methods is needed.…”
Section: Data Linkage (mentioning)
confidence: 99%