Distributed data processing frameworks (e.g., Hadoop, Spark, and Flink) are widely used to distribute data among the computing nodes of a cloud. Recently, there have been increasing efforts to evaluate the performance of distributed data processing frameworks hosted in private and public clouds. However, there is a paucity of research on the performance of these frameworks when hosted in a hybrid cloud, an emerging cloud model that integrates private and public clouds to use the best of both worlds. Therefore, in this paper, we evaluate the performance of Hadoop, Spark, and Flink in a hybrid cloud in terms of execution time, resource utilization, horizontal scalability, vertical scalability, and cost. For this study, our hybrid cloud consists of OpenStack (private cloud) and MS Azure (public cloud). We use both batch and iterative workloads for the evaluation. Our results show that in a hybrid cloud (i) the execution time increases as the private cloud borrows more nodes from the public cloud, (ii) Flink outperforms Spark, which in turn outperforms Hadoop, in terms of execution time, (iii) Hadoop transfers the largest amount of data among the nodes during workload execution while Spark transfers the least, (iv) all three frameworks scale better horizontally than vertically, and (v) Spark is the least expensive in monetary cost for data processing, while Hadoop is the most expensive.
Searching is a fundamental operation in computer science, used to retrieve data from sorted or unsorted collections of elements. Many searching algorithms already exist. Here, we introduce a new family of searching algorithms, the 10x Matrix Searching algorithms. In "10x", the 10 refers to a table that contains only 10 rows, and x refers to the columns associated with each row. The basic task is to search for an item in a collection of items. These items may be part of a database, or part of a formula or procedure used in a program, for example when solving an equation and needing its root.
In computer science, a sorting algorithm places elements in a certain order, for example by arranging the data of an array or string in a particular order. Sorted data is used to design databases that hold array values in a certain order. Typically, a sorting algorithm arranges data in either increasing or decreasing order. Several types of sorting algorithms can be used to sort an array, each with a different cost and time complexity, but we discuss only the new sorting algorithms that we designed.