<p>Driven by the Internet of Things and social media platforms, raw data is generated by the systems around us from every direction, varying in time, volume, and type. Social networking is growing rapidly, and businesses increasingly exploit it for advertising. This creates many challenges for data management service providers, and security is one of them. Providers must guarantee security for their privileged customers while still supplying accurate and valid data. Because the underlying transactional data vary in characteristics such as volume, variety, and complexity, it is essential to deploy these data sets on big data platforms that can handle structured, semi-structured, and unstructured data. To this end, we propose a data masking technique for big data security. Data masking replaces the original data set with a proxy data set that is not real but looks realistic. The given data set is masked using the modulus operator and the concept of keys. Our experiments indicate that enhanced modulus-based data masking outperforms modulus-based data masking in execution time and space utilization for larger data sets. This work will help big data developers and quality analysts in business domains, and it gives end users confidence that their data is secure.</p>
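The abstract does not spell out the masking algorithm, so the following is only a minimal sketch of one way a key and the modulus operator could be combined to mask numeric data. It is an assumption for illustration, not the authors' method: each digit is shifted by the corresponding key digit modulo 10, and non-digit characters are left in place so the result keeps a realistic format. The function names `mask_digits` and `unmask_digits` and the key format (a string of decimal digits) are hypothetical.

```python
from itertools import cycle


def _key_stream(key: str):
    """Return an endless iterator over the key's digits (hypothetical key format)."""
    if not key or not all("0" <= k <= "9" for k in key):
        raise ValueError("key must be a non-empty string of decimal digits")
    return cycle(int(k) for k in key)


def mask_digits(value: str, key: str) -> str:
    """Shift every ASCII digit by the next key digit, modulo 10.

    Non-digit characters (dashes, spaces, letters) are copied unchanged,
    so the masked value keeps the shape of the original.
    """
    stream = _key_stream(key)
    out = []
    for ch in value:
        if "0" <= ch <= "9":
            out.append(str((int(ch) + next(stream)) % 10))
        else:
            out.append(ch)
    return "".join(out)


def unmask_digits(masked: str, key: str) -> str:
    """Reverse mask_digits by subtracting the same key digits, modulo 10."""
    stream = _key_stream(key)
    out = []
    for ch in masked:
        if "0" <= ch <= "9":
            out.append(str((int(ch) - next(stream)) % 10))
        else:
            out.append(ch)
    return "".join(out)


if __name__ == "__main__":
    masked = mask_digits("4111-1111", "2718")
    print(masked)                         # 6829-3829
    print(unmask_digits(masked, "2718"))  # 4111-1111
```

In this sketch the masked value has the same length and layout as the original, which matches the abstract's description of a data set that "is not real but looks realistic". A simple digit shift like this is reversible by anyone who holds the key.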
In today's predictive analytics world, data engineering plays a vital role: data is acquired from various source systems and processed according to the business application and domain. Big data management integrates, governs, and secures big data with repeatable, reliable, and maintainable processes. Organizations try to extract business value from big data through its volume, velocity, and variety. However, when data is frequently incomplete, inconsistent, ungoverned, and unprotected, big data risks becoming a liability instead of an asset. Moreover, with conventional approaches that are manual and unpredictable, big data projects take too long to realize business value. Delivering business value from big data sustainably and repeatedly requires a new approach. In this context, raw data has to be moved between onsite and offshore environments, and data privacy during this movement is a major concern and challenge. A Big Data Privacy platform can make it easier to detect, investigate, assess, and remediate threats from intruders. We present a comprehensive study of Big Data Privacy using data masking methods on various data loads and data types. This work will help data quality analysts and big data developers when building big data applications.