Abstract. Peer-to-Peer (P2P) systems have been very successful for large-scale data sharing. However, sharing sensitive data, as in online social networks, without appropriate access control can have an undesirable impact on data privacy. Data can be accessed by everyone (including potentially untrusted peers) and used for everything (e.g., for marketing or for activities against the owner's preferences or ethics). Hippocratic databases (HDB) provide an effective solution to this problem by integrating purpose-based access control for privacy protection. However, the use of HDB has been restricted to centralized systems. This chapter gives an overview of current solutions for supporting data privacy in P2P systems, and develops in more detail a complete solution based on HDB.

Keywords: data privacy, P2P systems, DHT, Hippocratic databases, purpose-based access control, trust.
Introduction

Data privacy is the right of individuals to determine for themselves when, how, and to what extent information about them is communicated to others [40]. It has been addressed by many organizations and in legislation, which have defined well-accepted principles. According to the OECD 3, data privacy should consider: collection limitation, purpose specification, use limitation, data quality, security safeguards, openness, individual participation, and accountability. Among these principles, we underline purpose specification, which states that data owners should be able to specify the data access purposes for which their data will be collected, stored, and used.

With the advent of Online Social Networks (OLSN), data privacy has become a major concern. An OLSN is formed by people who have something in common and are connected by social relationships, such as friendship, hobbies, or coworking, in order to exchange information [11]. Many communities use OLSNs to share data in both professional and non-professional environments. Examples of professional OLSNs are Shanoir 4, designed for the neuroscience community to archive, share, search, and visualize neuroimaging data, and Medscape 5, designed for the medical community to share medical experience and medical data. There are also non-professional OLSNs for average citizens and amateurs in different domains, such as Carenity 6, designed for patients and their relatives to share medical information about themselves in order to help medical research. Another example is DIYbio 7, dedicated to making biology accessible to citizen scientists, amateur biologists, and biological engineers, who share research results.

3 Organization for Economic Co-operation and Development. One of the world's largest and most reliable sources of comparable statistics on economic and social data (http://www.oecd.org/).
4 www.shanoir.org/
The most popular OLSN, Facebook, with hundreds of millions of users, enables groups of friends to share all kinds of personal information among themselves.

Scalable data sharing among community members is critical for an OLSN system. Two main solutions have emerged for scalable data sharing: cloud computing a...