Data migration is a multi-step process that begins with analyzing legacy data and culminates in uploading and reconciling data in the new applications. With the rapid growth of data, organizations constantly need to migrate data. Data migration can be a complex process, as testing must be done to ensure data quality, and it can also be very costly if best practices are not followed and hidden costs are not identified at an early stage. At the same time, rather than buying IT equipment (hardware and/or software) and managing it themselves, many organizations today prefer to buy services from IT service providers. The number of service providers is increasing dramatically, and the cloud is becoming the preferred platform for storage services. However, as more information and personal data are transferred to the cloud, to social media sites, Dropbox, Baidu WangPan, etc., data security and privacy come into question. Academia and industry therefore strive to find effective ways to secure data migration in the cloud, and various countermeasures and encryption techniques have been implemented. In this work, we cover the most important aspects of data migration, including strategy, challenges, need, methodology, categories, risks, and its use with cloud computing. Finally, we discuss the security and privacy challenges of data migration to the cloud and how to address them through a proposed model that enhances data security and privacy by combining the Advanced Encryption Standard-256 (AES-256), data dispersion algorithms, and the Secure Hash Algorithm-512 (SHA-512). This model achieves verifiable security ratings and fast execution times.
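As a rough illustration of how the three primitives named above might be combined, the sketch below is a minimal Python example, assuming the `cryptography` package; the function names, the choice of GCM mode for AES-256, and the three-fragment split are illustrative assumptions, not the proposed model's exact construction. It encrypts data with AES-256, hashes the ciphertext with SHA-512 for integrity checking, and disperses the ciphertext across fragments.

```python
# Minimal sketch (illustrative assumptions, not the paper's exact model):
# AES-256 encryption, SHA-512 integrity hashing, and simple data dispersion.
import os
import hashlib
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def protect_for_migration(plaintext: bytes, fragments: int = 3):
    key = AESGCM.generate_key(bit_length=256)       # AES-256 key
    nonce = os.urandom(12)                          # 96-bit nonce for GCM
    ciphertext = AESGCM(key).encrypt(nonce, plaintext, None)

    digest = hashlib.sha512(ciphertext).hexdigest() # SHA-512 integrity tag

    # Data dispersion: split the ciphertext so that no single storage
    # location holds the whole encrypted object.
    size = -(-len(ciphertext) // fragments)         # ceiling division
    shares = [ciphertext[i * size:(i + 1) * size] for i in range(fragments)]
    return key, nonce, digest, shares

def recover(key, nonce, digest, shares):
    ciphertext = b"".join(shares)
    # Reconciliation step: verify the SHA-512 digest before decrypting.
    assert hashlib.sha512(ciphertext).hexdigest() == digest, "integrity check failed"
    return AESGCM(key).decrypt(nonce, ciphertext, None)

if __name__ == "__main__":
    key, nonce, digest, shares = protect_for_migration(b"customer records to migrate")
    print(recover(key, nonce, digest, shares))
```

Rejoining the fragments and re-checking the SHA-512 digest before decryption provides a simple reconciliation check after the data have been moved.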
Day by day, advanced web technologies have led to tremendous growth in the volume of data generated daily. This mountain of huge, spread-out data sets leads to the phenomenon called big data: a collection of massive, heterogeneous, unstructured, and complex data sets. The big data life cycle can be represented as collecting (capturing), storing, distributing, manipulating, interpreting, analyzing, investigating, and visualizing big data. Traditional techniques such as the Relational Database Management System (RDBMS) cannot handle big data because of their own limitations, so advances in computing architecture are required to handle both the data storage requirements and the heavy processing needed to analyze huge volumes and varieties of data economically. Among the many technologies for manipulating big data, one is Hadoop. Hadoop can be understood as an open-source distributed data processing framework and is one of the most prominent and well-known solutions to the problem of handling big data. Apache Hadoop is based on the Google File System and the MapReduce programming paradigm. In this paper we survey all big data characteristics, starting from the first three V's, which have been extended over time through research to more than fifty-six V's, and compare researchers' definitions to reach the best representation and the most precise clarification of all big data V characteristics. We highlight the challenges that face big data processing and how to overcome them using Hadoop, and we examine its use in processing big data sets as a solution for various problems in a distributed cloud-based environment. This paper mainly focuses on different components of Hadoop such as Hive, Pig, and HBase. We also give a full description of Hadoop's pros and cons, and of improvements that address Hadoop's problems through a proposed cost-efficient scheduler algorithm for heterogeneous Hadoop systems.
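For readers unfamiliar with the MapReduce paradigm on which Hadoop is based, the single-process Python sketch below walks through the map, shuffle, and reduce phases of a word count. The function names and toy documents are illustrative assumptions; a real Hadoop job distributes these phases across cluster nodes rather than running them in one process.

```python
# Toy, in-memory illustration of the MapReduce data flow (word count).
from collections import defaultdict
from itertools import chain

def map_phase(document: str):
    # map: emit (word, 1) pairs for each word in the document
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    # shuffle: group emitted values by key, as the framework does between phases
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # reduce: sum the counts for each word
    return {word: sum(counts) for word, counts in groups.items()}

if __name__ == "__main__":
    documents = ["big data needs big processing", "hadoop processes big data"]
    mapped = chain.from_iterable(map_phase(d) for d in documents)
    print(reduce_phase(shuffle(mapped)))
```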
With the large growth in technology, reduced cost of storage media, and improved networking, organizations can collect very large volumes of information from many sources. Different data mining techniques are applied to such huge data to extract useful and relevant knowledge. The disclosure of sensitive data to unauthorized parties is a critical issue for organizations and may be the most critical problem in data mining. Privacy-preserving data mining (PPDM) has therefore become increasingly popular because it addresses this problem and allows privacy-sensitive data to be shared for analytical purposes. Many privacy techniques have been developed based on the k-anonymity property; because of the shortcomings of the k-anonymity model, other privacy models were introduced. Most of these techniques release a single table to the research public after being applied to the original tables. In this paper the researchers introduce techniques that publish more than one table for organizations while preserving individuals' privacy. One of these is (α, k)-anonymity using a lossy join, which releases two tables for publishing in such a way that the privacy protection of (α, k)-anonymity is achieved with less distortion; the other is the Anatomy technique, which releases all the quasi-identifier and sensitive values directly in two separate tables that meet the l-diversity privacy requirement, without any modification of the original table.
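To make the contrast with generalization-based k-anonymity concrete, the following Python sketch mimics an Anatomy-style release of two linked tables; the toy records, the fixed group size, and the greedy grouping rule are simplifications for illustration only, not the full Anatomy algorithm described in the literature.

```python
# Simplified sketch of the Anatomy idea: publish quasi-identifiers (QIT) and
# sensitive values (ST) in separate tables linked only by a group id, instead
# of generalizing the quasi-identifiers. Records, column names, and the
# grouping rule are illustrative assumptions.
from collections import Counter

records = [
    {"age": 23, "zip": "11001", "disease": "flu"},
    {"age": 27, "zip": "11002", "disease": "hiv"},
    {"age": 35, "zip": "12001", "disease": "flu"},
    {"age": 39, "zip": "12002", "disease": "cancer"},
    {"age": 52, "zip": "13001", "disease": "hiv"},
    {"age": 58, "zip": "13002", "disease": "cancer"},
]

def anatomize(rows, l=2):
    """Greedily form groups of l rows with distinct sensitive values,
    then emit the quasi-identifier table and the sensitive table."""
    remaining = list(rows)
    groups = []
    while len(remaining) >= l:
        group, seen = [], set()
        for row in list(remaining):
            if row["disease"] not in seen:
                group.append(row)
                seen.add(row["disease"])
                remaining.remove(row)
            if len(group) == l:
                break
        if len(group) < l:   # cannot form another l-diverse group
            break
        groups.append(group)

    qit = [(row["age"], row["zip"], gid)
           for gid, group in enumerate(groups, 1) for row in group]
    st = [(gid, disease, count)
          for gid, group in enumerate(groups, 1)
          for disease, count in Counter(r["disease"] for r in group).items()]
    return qit, st

if __name__ == "__main__":
    qit, st = anatomize(records)
    print("QIT (age, zip, group-id):", qit)
    print("ST  (group-id, disease, count):", st)
```

Because each group contains distinct sensitive values, an attacker who links a quasi-identifier row to its group still cannot tell which sensitive value belongs to that individual, which is the intuition behind publishing the two tables without modifying the original values.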