Raghavendran Balu, Teddy Furon, Sébastien Gambs. Abstract. In this paper, we consider personalized recommendation systems in which, before publication, the profile of a user is sanitized by a non-interactive mechanism compliant with the concept of differential privacy. We consider two existing schemes offering a differentially private representation of profiles: BLIP (BLoom-and-flIP) and JLT (Johnson-Lindenstrauss Transform). To assess their security levels, we play the role of an adversary aiming at reconstructing a user profile. We compare two inference attacks, namely single and joint decoding. The former decides whether a single item is present in the profile and sequentially explores the whole item set. The latter decides whether a subset of items is likely to be the user profile and considers all possible subsets. Our contributions are a theoretical analysis as well as a practical implementation of both attacks, which were evaluated on datasets of real user profiles. The results clearly demonstrate that joint decoding is the more powerful attack, while also giving useful insights on how to set the differential privacy parameter ε.
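To make the setting concrete, the following is a minimal sketch of a Bloom-and-flip style sanitizer together with a naive per-item score. It is not the authors' implementation: the salted SHA-256 hashing, the flip probability 1 / (1 + exp(ε / k)), and the function names are all assumptions made for illustration, and the score is a much simpler test than the single decoder analyzed in the paper.

```python
import hashlib
import math
import random


def bloom_positions(item, m, k):
    """Map an item to k bit positions (salted SHA-256 is an illustrative choice)."""
    return [
        int.from_bytes(hashlib.sha256(f"{j}:{item}".encode()).digest()[:8], "big") % m
        for j in range(k)
    ]


def flip_probability(epsilon, k):
    """Assumed per-bit flip probability for privacy parameter epsilon and k hashes."""
    return 1.0 / (1.0 + math.exp(epsilon / k))


def blip(profile, m, k, epsilon, rng):
    """Encode a profile as an m-bit Bloom filter, then flip each bit independently."""
    bits = [0] * m
    for item in profile:
        for pos in bloom_positions(item, m, k):
            bits[pos] = 1
    p = flip_probability(epsilon, k)
    return [b ^ int(rng.random() < p) for b in bits]


def single_item_score(sketch, item, m, k):
    """Naive adversary score: how many of the item's k positions are set."""
    return sum(sketch[pos] for pos in bloom_positions(item, m, k))
```

An adversary in the spirit of single decoding would compute this score for every item in the catalogue and keep the items above a threshold. A smaller ε pushes the flip probability toward 1/2, which makes the scores of present and absent items harder to tell apart.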
Many nearest neighbor search algorithms rely on encoding real vectors into binary vectors. The most common strategy projects the vectors onto random directions and takes the sign to produce so-called sketches. This paper discusses the sub-optimality of this choice and proposes a better encoding strategy based on the quantization and reconstruction points of view. Our second contribution is a novel asymmetric estimator for the cosine similarity. As in previous asymmetric schemes, the query is not quantized and the similarity is computed in the compressed domain. Both contributions improve the quality of nearest neighbor search with binary codes. The resulting efficiency compares favorably with that of a recent encoding technique.
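The baseline "project and sign" encoding described above can be sketched in a few lines. The symmetric estimator below is the standard one based on the Hamming distance between two sketches. The asymmetric estimator is only an illustration of the idea of leaving the query unquantized: it reconstructs a direction from the code and compares the raw query against it, and it should not be read as the estimator proposed in the paper. All function names are hypothetical.

```python
import math
import random


def random_directions(n_bits, dim, rng):
    """Draw n_bits Gaussian random directions in dimension dim."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def sign_sketch(x, directions):
    """Project x onto each direction and keep only the sign (+1 or -1)."""
    return [1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) >= 0 else -1
            for w in directions]


def symmetric_cosine(code_a, code_b):
    """Estimate cosine similarity from the Hamming distance of two sketches."""
    hamming = sum(a != b for a, b in zip(code_a, code_b))
    return math.cos(math.pi * hamming / len(code_a))


def asymmetric_cosine(query, code, directions):
    """Illustrative asymmetric estimate: cosine between the raw query and the
    direction reconstructed from the code as the signed sum of the directions."""
    dim = len(query)
    recon = [sum(b * w[i] for b, w in zip(code, directions)) for i in range(dim)]
    dot = sum(q * r for q, r in zip(query, recon))
    norm = math.sqrt(sum(q * q for q in query)) * math.sqrt(sum(r * r for r in recon))
    return dot / norm
```

Only the database vectors are reduced to bits here; the query keeps its real-valued coordinates, which is what distinguishes an asymmetric scheme from comparing two sketches.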
Collaborative filtering is a popular technique for recommendation systems due to its domain independence and its reliance on user behavior data alone. However, the possibility of identifying users from these personal data raises privacy concerns. Differential privacy aims to minimize these identification risks by adding controlled noise with known characteristics. The added noise degrades the utility of the system and brings no value beyond enhanced privacy. We propose using sketching techniques to implicitly provide differential privacy guarantees by taking advantage of the inherent randomness of the data structure. In particular, we use the count sketch as a storage model for matrix factorization, one of the most successful collaborative filtering techniques. Our model is also compact and scales well with data, making it well suited for large-scale applications.
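As a rough illustration of the storage model, here is a minimal count sketch that stores real-valued parameters under arbitrary keys. The hashing scheme, the class name, and the idea of keying entries by a (user, factor index) pair are assumptions made for this sketch. How the paper wires the structure into matrix factorization and derives its privacy guarantee is not shown.

```python
import hashlib
import statistics


class CountSketch:
    """Compact approximate key-value store with depth rows of width counters."""

    def __init__(self, depth, width):
        self.depth = depth
        self.width = width
        self.table = [[0.0] * width for _ in range(depth)]

    def _bucket_and_sign(self, row, key):
        # Salted SHA-256 stands in for the per-row hash and sign functions.
        digest = hashlib.sha256(f"{row}:{key}".encode()).digest()
        bucket = int.from_bytes(digest[:8], "big") % self.width
        sign = 1 if digest[8] & 1 else -1
        return bucket, sign

    def update(self, key, delta):
        """Add delta to the value stored under key."""
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            self.table[row][bucket] += sign * delta

    def query(self, key):
        """Estimate the value stored under key as the median over rows."""
        estimates = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            estimates.append(sign * self.table[row][bucket])
        return statistics.median(estimates)
```

A gradient step on a latent factor would then be written as `sketch.update((user_id, factor_index), step)` and read back with `sketch.query(...)`. Keys that collide in a bucket perturb each other's estimates, which is the inherent randomness the abstract refers to.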