2009
DOI: 10.14778/1687627.1687733

Anonymization of set-valued data via top-down, local generalization

Abstract: Set-valued data, in which a set of values are associated with an individual, is common in databases ranging from market basket data, to medical databases of patients' symptoms and behaviors, to query engine search logs. Anonymizing this data is important if we are to reconcile the conflicting demands arising from the desire to release the data for study and the desire to protect the privacy of individuals represented in the data. Unfortunately, the bulk of existing anonymization techniques, which were develope…

Cited by 157 publications (262 citation statements)
References 35 publications (54 reference statements)
“…However, the applicability of our solution can be broader, as VGHs are the most widely used generalization approach for protecting privacy in different types of data. For example, in semantic trajectory data [15], VGHs are used to hide sensitive places where a person has stopped (e.g., an oncology clinic), while in transactional data [10] they are used to hide sensitive items in purchases (e.g., a pregnancy test).…”
Section: Final Discussion (mentioning)
confidence: 99%
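The value generalization hierarchy (VGH) idea mentioned above can be made concrete with a short sketch. The Python fragment below is a minimal, hypothetical example of VGH-based generalization of sensitive items in transactional data; the toy taxonomy, item names, and function names are assumptions for illustration and are not taken from the cited works.

# Minimal sketch of generalization with a value generalization hierarchy (VGH).
# The toy taxonomy is illustrative only: each item maps to its parent category,
# and the root category maps to None.
VGH_PARENT = {
    "pregnancy test": "health products",
    "vitamins": "health products",
    "health products": "retail goods",
    "milk": "groceries",
    "groceries": "retail goods",
    "retail goods": None,
}

def generalize(item, levels=1):
    """Replace an item by its ancestor `levels` steps up the VGH."""
    for _ in range(levels):
        parent = VGH_PARENT.get(item)
        if parent is None:
            break
        item = parent
    return item

def anonymize_transaction(transaction, sensitive, levels=1):
    """Generalize only the sensitive items in a single transaction (a set of items)."""
    return {generalize(i, levels) if i in sensitive else i for i in transaction}

# Example: the sensitive purchase is hidden behind its parent category.
print(anonymize_transaction({"milk", "pregnancy test"}, sensitive={"pregnancy test"}))
# -> {'milk', 'health products'}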
“…Yeye et al. in [10] propose a top-down, partition-based approach to anonymizing set-valued data that preserves utility better most of the time. While this approach works sufficiently well for query log anonymization, it does not work well with market-basket data.…”
Section: Related Work (mentioning)
confidence: 99%
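To illustrate the top-down, partition-based idea in the statement above, here is a highly simplified Python sketch of recursively partitioning set-valued records while keeping every published group of size at least k. This is not the algorithm of [10]; the splitting heuristic, stopping rule, and toy data are assumptions made only for illustration.

def top_down_partition(records, k):
    """Recursively split a group of set-valued records, keeping every group of size >= k."""
    # Candidate split items: items present in some, but not all, records of the group.
    candidates = {i for r in records for i in r
                  if 0 < sum(i in s for s in records) < len(records)}
    for item in sorted(candidates):
        with_item = [r for r in records if item in r]
        without_item = [r for r in records if item not in r]
        if len(with_item) >= k and len(without_item) >= k:
            return (top_down_partition(with_item, k)
                    + top_down_partition(without_item, k))
    # No admissible split: publish this group as one anonymized partition.
    return [records]

data = [frozenset(t) for t in ({"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}, {"c"}, {"b"})]
for group in top_down_partition(data, k=2):
    print([sorted(r) for r in group])

In a local-generalization scheme such as the one named in the paper's title, each leaf group would then be generalized on its own (for instance via a VGH as in the earlier sketch) rather than applying one global recoding to the whole dataset.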
“…The original set-valued data privacy problem was defined in the context of association rule hiding [1,15,16], in which the data publisher wishes to "sanitize" the set-valued data (or micro-data) so that all sensitive or "bad" association rules cannot be discovered while all (or most) "good" rules remain in the published data. Subsequently, a number of privacy models including (h, k, p)-coherence [18], k^m-anonymity [14], k-anonymity [9] and ρ-uncertainty [3] have been proposed. k^m-anonymity and k-anonymity are carried over directly from relational data privacy, while (h, k, p)-coherence and ρ-uncertainty protect privacy by bounding the confidence and the support of any sensitive association rule inferable from the data.…”
Section: Related Work (mentioning)
confidence: 99%
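Of the privacy models listed above, k^m-anonymity has a simple operational reading: any combination of at most m items known to an adversary must match either no record or at least k records. The Python check below is a minimal brute-force sketch under that reading; the dataset and parameter values are illustrative assumptions.

from itertools import combinations

def is_km_anonymous(records, k, m):
    """Brute-force check: every itemset of size <= m has support 0 or support >= k."""
    universe = sorted({i for r in records for i in r})
    for size in range(1, m + 1):
        for combo in combinations(universe, size):
            support = sum(set(combo) <= r for r in records)
            if 0 < support < k:
                return False
    return True

data = [frozenset(t) for t in ({"a", "b"}, {"a", "b"}, {"a", "c"}, {"a", "c"})]
print(is_km_anonymous(data, k=2, m=2))  # True: every observed 1- or 2-itemset has support >= 2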
“…These generally fall into four categories: global/local generalization [14,9,3], global suppression [18,3], permutation [8], and perturbation [19,4]. Next we briefly discuss the pros and cons of these anonymization techniques.…”
Section: Related Work (mentioning)
confidence: 99%
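As a contrast to the generalization sketches above, the global suppression category can be illustrated by deleting a chosen item from every record in which it appears. The snippet below is only a minimal illustration; in practice the set of items to suppress would be chosen to satisfy a privacy model such as (h, k, p)-coherence with minimal utility loss, which this sketch does not attempt.

def global_suppress(records, items_to_suppress):
    """Global suppression: remove the given items from every record."""
    drop = set(items_to_suppress)
    return [r - drop for r in records]

data = [frozenset(t) for t in ({"milk", "pregnancy test"}, {"milk"}, {"pregnancy test", "vitamins"})]
print([sorted(r) for r in global_suppress(data, {"pregnancy test"})])
# -> [['milk'], ['milk'], ['vitamins']]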