In recent years, the amount of digital data in the world has grown immensely. The more information there is, the greater the risk of its unwanted disclosure, which makes data privacy protection a pressing problem of the present time. While the task of preserving individual privacy is being studied thoroughly, the problem of statistical disclosure control for collective (or group) data remains open. In this paper, we propose an effective and relatively simple wavelet-based way to provide group anonymity in collective data. We also provide a real-life example to illustrate the method.

Comment: 10 pages, 2 tables. Published by Springer in "Information Processing and Management of Uncertainty in Knowledge-Based Systems. Applications". The final publication is available at http://www.springerlink.com/content/u701148783683775
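The abstract does not spell out the transform details, so the following is only an illustrative sketch, assuming PyWavelets and a made-up per-region group count series: the coarse approximation part of a wavelet decomposition (which reveals where the group concentrates) is flattened, while the detail coefficients are kept, so the reconstructed series hides the concentration but retains its local fluctuations.

```python
# Hypothetical sketch: wavelet-based masking of a group quantity signal.
# Assumes PyWavelets (pip install PyWavelets); the data below are invented.
import numpy as np
import pywt

# Made-up counts of group members across 16 ordered regions.
counts = np.array([3, 5, 4, 40, 38, 6, 5, 4, 3, 2, 35, 37, 4, 3, 2, 1], dtype=float)

# Multilevel discrete wavelet decomposition: [cA2, cD2, cD1].
coeffs = pywt.wavedec(counts, "db1", level=2)
approx, details = coeffs[0], coeffs[1:]

# Replace the approximation (the coarse trend) with a flat trend of the
# same total mass.
masked_approx = np.full_like(approx, approx.mean())

# Reconstruct: the detail coefficients are untouched, so local
# fluctuations of the original series are preserved.
masked = pywt.waverec([masked_approx] + details, "db1")

print("original total:", counts.sum(), "masked total:", round(masked.sum(), 2))
```

This is a sketch of the general idea only; the actual redistribution strategy used in the paper may differ.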
In the era of Big Data, it is almost impossible to completely restrict access to primary non-aggregated statistical data. However, the risk of violating the privacy of individual respondents and groups of respondents by analyzing primary data has not been reduced. There is a need for subtler methods of data protection to come to grips with these challenges. In some cases, individual and group privacy can be easily violated, because the primary data contain attributes that uniquely identify individuals and groups thereof. Removing such attributes from the dataset is a crude solution and does not guarantee complete privacy. In the field of providing individual data anonymity, this problem has been widely recognized, and various methods have been proposed to solve it. In the current work, we demonstrate that it is possible to violate group anonymity as well, even if the attributes that uniquely identify the group are removed. As it turns out, it is possible to use third-party data to build a fuzzy model of a group. Typically, such a model comes in the form of a set of fuzzy rules, which can be used to determine membership grades of respondents in the group with a level of certainty sufficient to violate group anonymity. In this work, we introduce a method based on evolutionary computing to build such a model. We also discuss a memetic approach to protecting the data from group anonymity violation in this case.
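The abstract does not give the rule format or the attributes involved, so the sketch below is purely illustrative: the attribute names, membership functions, and rules are hypothetical, and it only shows how a small fuzzy rule base could assign group membership grades from third-party attributes once the identifying attribute has been removed.

```python
# Hypothetical sketch of fuzzy-rule membership grading; the attributes,
# membership functions, and rules are invented for illustration only.

def triangular(x, a, b, c):
    """Triangular membership function peaking at b on the interval [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def membership_grade(age, commute_km):
    """Grade of membership in a (hypothetical) sensitive group,
    combining rule activations with min (AND) and max (OR)."""
    age_is_typical = triangular(age, 20, 30, 45)
    commute_is_long = triangular(commute_km, 10, 40, 80)

    # Rule 1: IF age is typical AND commute is long THEN member.
    rule1 = min(age_is_typical, commute_is_long)
    # Rule 2: IF commute is long THEN member (weaker rule, weight 0.5).
    rule2 = 0.5 * commute_is_long

    return max(rule1, rule2)

# A respondent whose group membership can still be guessed with fairly
# high certainty from third-party attributes alone.
print(round(membership_grade(age=28, commute_km=45), 2))
```

In the paper, both the rule base and its parameters are evolved rather than hand-written; the sketch only illustrates how such rules yield membership grades.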