Abstract: Preserving individual privacy when publishing data is a problem that is receiving increasing attention. According to the k-anonymity principle, each release of data must be such that each individual is indistinguishable from at least k − 1 other individuals. In this paper we study the problem of anonymity-preserving data publishing in moving objects databases. We propose a novel concept of k-anonymity based on co-localization that exploits the inherent uncertainty of the moving object's whereabouts. Due to the imprecision of sampling and positioning systems (e.g., GPS), the trajectory of a moving object is no longer a polyline in a three-dimensional space; instead, it is a cylindrical volume whose radius δ represents the possible location imprecision: we know that the trajectory of the moving object is within this cylinder, but we do not know exactly where. If another object moves within the same cylinder, the two are indistinguishable from each other. This leads to the definition of (k, δ)-anonymity for moving objects databases. We first characterize the (k, δ)-anonymity problem and discuss techniques to solve it. Then we focus on the most promising technique from the point of view of information preservation, namely space translation. We develop a suitable measure of the information distortion introduced by space translation, and we prove that the problem of achieving (k, δ)-anonymity by space translation with minimum distortion is NP-hard. Faced with the hardness of our problem, we propose a greedy algorithm based on clustering and enhanced with ad hoc pre-processing and outlier removal techniques. The resulting method, named NWA (Never Walk Alone), is empirically evaluated in terms of data quality and efficiency. Data quality is assessed both by means of objective measures of information distortion and by comparing the results of the same spatio-temporal range queries executed on the original database and on the (k, δ)-anonymized one. Experimental results show that for a wide range of values of δ and k, the relative error introduced is kept low, confirming that NWA produces high-quality (k, δ)-anonymized data.
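To make the co-localization idea concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes that trajectories are lists of (x, y) positions sampled at the same timestamps, and it reads "moving within the same cylinder" as "staying within distance δ of each other at every timestamp"; that reading, and the names `colocalized` and `is_anonymity_set`, are illustrative assumptions rather than definitions taken from the abstract.

```python
from itertools import combinations
from math import hypot
from typing import List, Sequence, Tuple

# Assumption: a trajectory is a list of (x, y) positions, one per shared timestamp.
Trajectory = Sequence[Tuple[float, float]]


def colocalized(a: Trajectory, b: Trajectory, delta: float) -> bool:
    """Return True if a and b are within distance delta at every timestamp."""
    if len(a) != len(b):
        raise ValueError("trajectories must be sampled at the same timestamps")
    return all(
        hypot(xa - xb, ya - yb) <= delta
        for (xa, ya), (xb, yb) in zip(a, b)
    )


def is_anonymity_set(trajectories: List[Trajectory], k: int, delta: float) -> bool:
    """Return True if there are at least k trajectories, all pairwise co-localized.

    Under the assumed reading, every member of such a set is indistinguishable
    from at least k - 1 others, given location imprecision delta.
    """
    if len(trajectories) < k:
        return False
    return all(
        colocalized(a, b, delta) for a, b in combinations(trajectories, 2)
    )
```

For example, two trajectories that run in parallel 0.5 units apart form an anonymity set for k = 2 and δ = 1.0, while adding a third trajectory 3 units away breaks the pairwise condition. The sketch only checks the property; it does not perform the space translation or clustering that the abstract describes.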
Background: An important step in annotation of sequenced genomes is the identification of transcription factor binding sites. More than a hundred different computational methods have been proposed, and it is difficult to make an informed choice. Therefore, robust assessment of motif discovery methods becomes important, both for validation of existing tools and for identification of promising directions for future research.
Background: Computational discovery of regulatory elements is an important area of bioinformatics research, and more than a hundred motif discovery methods have been published. Traditionally, most of these methods have addressed the problem of single motif discovery, that is, discovering binding motifs for individual transcription factors. In higher organisms, however, transcription factors usually act in combination with nearby bound factors to induce specific regulatory behaviours. Hence, recent focus has shifted from single motifs to the discovery of sets of motifs bound by multiple cooperating transcription factors, so-called composite motifs or cis-regulatory modules. Given the large number and diversity of methods available, independent assessment of methods becomes important. Although there have been several benchmark studies of single motif discovery, no similar studies have previously been conducted concerning composite motif discovery.
The process of discovering relevant patterns holding in a database was first indicated as a threat to database security by O'Leary in [1]. Since then, many different approaches to knowledge hiding have emerged, mainly in the context of association rule and frequent itemset mining. Following many real-world data and application demands, in this paper we shift the problem of knowledge hiding to contexts where both the data and the extracted knowledge have a sequential structure. We define the problem of hiding sequential patterns and show its NP-hardness. We therefore devise heuristics and a polynomial sanitization algorithm. Starting from this framework, we specialize it to the more complex case of spatiotemporal patterns extracted from moving objects databases. Finally, we discuss a possible kind of attack on our model, which exploits knowledge of the underlying road network, and enhance our model to protect against this kind of attack. An exhaustive experimental analysis on real-world data sets shows the effectiveness of our proposal.