This paper describes a personalized k-anonymity model for protecting location privacy against various privacy threats through location information sharing. Our model has two unique features. First, we provide a unified privacy personalization framework to support location k-anonymity for a wide range of users with context-sensitive personalized privacy requirements. This framework enables each mobile node to specify the minimum level of anonymity it desires as well as the maximum temporal and spatial resolutions it is willing to tolerate when requesting for k-anonymity preserving location-based services (LBSs). Second, we devise an efficient message perturbation engine which runs by the location protection broker on a trusted server and performs location anonymization on mobile users' LBS request messages, such as identity removal and spatio-temporal cloaking of location information. We develop a suite of scalable and yet efficient spatio-temporal cloaking algorithms, called CliqueCloak algorithms, to provide high quality personalized location k-anonymity, aiming at avoiding or reducing known location privacy threats before forwarding requests to LBS provider(s). The effectiveness of our CliqueCloak algorithms is studied under various conditions using realistic location data synthetically generated using real road maps and traffic volume data.
This article addresses the profitability problem associated with auto-parallelization of general-purpose distributed data stream processing applications. Auto-parallelization involves locating regions in the application's data flow graph that can be replicated at run-time to apply data partitioning, in order to achieve scale. In order to make auto-parallelization effective in practice, the profitability question needs to be answered: How many parallel channels provide the best throughput? The answer to this question changes depending on the workload dynamics and resource availability at run-time. In this article, we propose an elastic auto-parallelization solution that can dynamically adjust the number of channels used to achieve high throughput without unnecessarily wasting resources. Most importantly, our solution can handle partitioned stateful operators via run-time state migration, which is fully transparent to the application developers. We provide an implementation and evaluation of the system on an industrial-strength data stream processing platform to validate our solution.
Various research communities have independently arrived at stream processing as a programming model for high-performance and parallel computation, including digital signal processing, databases, operating systems, and complex event processing. Each of these communities has developed some of the same optimizations, but often with conflicting terminology and unstated assumptions. This paper presents a survey of optimizations for stream processing. It is aimed both at users who need to understand and guide the system's optimizer, and at implementers who need to make engineering trade-offs. To consolidate terminology, this paper is organized as a catalog, in a style similar to catalogs of design patterns or refactorings. To make assumptions explicit and help with trade-offs, each optimization is presented with its safety constraints (when does it preserve correctness?) and a profitability experiment (when does it improve performance?). We hope that this survey will help future optimization inventors to stand on the shoulders of giants from not just their own community.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.