Proceedings of the Twenty-Third ACM SIGMOD-SIGACT-SIGART Symposium on Principles of Database Systems 2004
DOI: 10.1145/1055558.1055591

On the complexity of optimal K-anonymity

Abstract: The technique of k-anonymization has been proposed in the literature as an alternative way to release public information, while ensuring both data privacy and data integrity. We prove that two general versions of optimal k-anonymization of relations are NP-hard, including the suppression version which amounts to choosing a minimum number of entries to delete from the relation. We also present a polynomial time algorithm for optimal k-anonymity that achieves an approximation ratio independent of the size of t…
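The suppression version mentioned in the abstract can be illustrated on a toy relation. The sketch below is a brute-force search, not the paper's approximation algorithm; it assumes that a suppressed entry is written as `*` and that a relation is k-anonymous when every row is identical to at least k-1 other rows. The function names and the sample table are hypothetical.

```python
from collections import Counter
from itertools import product

def is_k_anonymous(rows, k):
    """True if every distinct row occurs at least k times."""
    counts = Counter(rows)
    return all(c >= k for c in counts.values())

def min_suppression(table, k):
    """Try every suppression mask and keep the cheapest k-anonymous result.

    Returns (number of suppressed entries, suppressed table), or None if
    no mask works (for example, when the table has fewer than k rows).
    """
    n, m = len(table), len(table[0])
    best = None
    for mask in product([False, True], repeat=n * m):
        cost = sum(mask)
        if best is not None and cost >= best[0]:
            continue  # cannot improve on the best mask found so far
        suppressed = [
            tuple("*" if mask[i * m + j] else table[i][j] for j in range(m))
            for i in range(n)
        ]
        if is_k_anonymous(suppressed, k):
            best = (cost, suppressed)
    return best

# Hypothetical 4-row, 2-attribute relation.
table = [("a", "x"), ("a", "y"), ("b", "y"), ("b", "y")]
cost, result = min_suppression(table, 2)
```

The search visits all 2^(n·m) masks, so it is only usable on tiny inputs; that exponential cost is what one would expect from an exact method for an NP-hard problem.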

Cited by 645 publications (478 citation statements)
References 11 publications
“…By generalization, the quasi-identifier values are replaced with less specific ones (e.g., replace specific age with a range of ages), so that after generalization, the original dataset is partitioned into groups, with each group consisting of at least k tuples that are of the same generalized quasi-identifier values [4, 13, 14].…”
Section: Generalization-based Techniques (mentioning)
confidence: 99%
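The generalization step described in this statement can be sketched in a few lines. This is a minimal illustration under stated assumptions, not code from the cited work: the quasi-identifier is taken to be (age, ZIP prefix), ages are generalized into fixed-width ranges, and the records, function names, and `width` parameter are all hypothetical.

```python
from collections import Counter

def generalize_age(age, width=10):
    """Replace an exact age with a range label such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def is_k_anonymous(rows, k):
    """True if every quasi-identifier tuple occurs at least k times."""
    counts = Counter(rows)
    return all(c >= k for c in counts.values())

# Hypothetical records: (age, ZIP prefix) serve as the quasi-identifier.
records = [(31, "021"), (34, "021"), (38, "021"), (45, "100"), (47, "100")]

# After generalization the records fall into two groups:
# ('30-39', '021') with three tuples and ('40-49', '100') with two.
generalized = [(generalize_age(age), zip3) for age, zip3 in records]
```

With these records the original table is not 2-anonymous because every age is distinct, while the generalized table is 2-anonymous but not 3-anonymous, since the smaller group holds only two tuples.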
“…The complexity of this algorithm is O(k log k), where the constant in the big-O is less than 4. Although the runtime of this algorithm is exponential in k, its efficiency can be greatly enhanced as suggested by [12].…”
Section: Performance Assessment (mentioning)
confidence: 99%
“…A greedy algorithm aimed at guaranteeing that no address unknown by the adversary can be linked with an user with probability higher than a given threshold is proposed in [21]. The main problem with this approach is that dealing with all possible adversary's knowledge becomes harder than the original k-anonymity problem, which is already known to be NP-Hard [13]. There exist other suppression-based methods in the literature, e.g., [6].…”
Section: Introduction (mentioning)
confidence: 99%