Privacy-preserving data mining techniques help protect sensitive information from unauthorized access. Many organizations collect and mine large amounts of data, so data privacy has become one of the most important issues in recent years. Several techniques, such as generalization, bucketization, and anonymization, have been proposed for privacy-preserving data publishing. According to recent work, generalization loses a significant amount of information, especially for high-dimensional data, whereas bucketization does not prevent membership disclosure and is not applicable to data that lack a clear separation between quasi-identifiers and sensitive attributes. In this paper, we present a slicing technique that limits the information loss associated with generalization and prevents membership disclosure. Slicing can also handle high-dimensional data, and we develop an efficient algorithm for computing sliced data that satisfies the ℓ-diversity requirement. In our experiments, slicing preserves better utility than generalization and is more effective than bucketization in workloads involving the sensitive attribute.
KEYWORDS: Generalization, bucketization, ℓ-diversity, slicing, data publishing.
I. INTRODUCTION

Data mining is the exploration and analysis of large quantities of data in order to discover valid, novel, and potentially useful patterns, a process that plays a vital role in extracting useful information. Today, huge databases exist in many application areas, e.g. medical data, census data, communication and media-related data, consumer purchase data, and data gathered by government agencies. Data sharing is needed to make full use of the collected data: pooling medical data can improve the quality of medical research, and data gathered by the government (e.g. census data) should be made publicly available for purposes such as calculating the population of the country or the number of people who become eligible to vote. Because the private information of individuals is made public or distributed online, privacy has become an important issue. For this reason, privacy-preserving data mining (PPDM) techniques must be applied together with data mining algorithms so that the private information of individuals is protected while information is extracted during the knowledge discovery process, also known as the KDD process.

Microdata has received a lot of attention in recent years, and many organizations now publish their microdata. This information includes details about individual entities such as organizations, firms, industries, and persons. The main objective of privacy-preserving data mining techniques is to modify the original data in such a way that private information is not revealed while the data remain useful for analysis. The most popular techniques for microdata anonymization are generalization and bucketization, both of which operate on several kinds of attributes. In both approaches, attributes are divided into three categories: identifiers, quasi-identifiers, and sensitive identif...