Big data analytics have become widespread as a means to extract knowledge from large datasets. Yet, the heterogeneity and irregularity usually associated with big data applications often overwhelm existing software and hardware infrastructures. In such a context, the flexibility and elasticity provided by the cloud computing paradigm offer a natural approach to cost-effectively adapting the allocated resources to the application's current needs. However, these same characteristics pose extra challenges for predicting the performance of cloud-based big data applications, a key step toward proper management and planning. This paper explores three modeling approaches for performance prediction of cloud-based big data applications. We evaluate two queuing-based analytical models and a novel fast ad hoc simulator in various scenarios based on different applications and infrastructure setups. We compare the three approaches in terms of prediction accuracy and find that our best approaches can predict average application execution times with a relative error of 26% in the very worst case and about 7% on average.
Big data analytics have become widespread as a means to extract knowledge from large datasets. Such applications are often characterized by highly heterogeneous and irregular data access patterns, challenging existing software and hardware infrastructures to meet their dynamic resource demands. The cloud computing paradigm, in turn, offers a natural hosting solution for such applications, as it provides flexibility and elasticity, adapting the allocated resources in response to the application's current needs. However, these properties pose an extra challenge to the accurate performance prediction of cloud-based applications, which is a key step toward adequate capacity planning and management of the hosting infrastructure. In this article, we tackle this challenge by exploring three modeling approaches for predicting the performance of big data applications running on the cloud. We evaluate two queuing-based analytical models and a novel fast ad hoc simulator in various scenarios based on different applications and infrastructure setups. The considered approaches are compared in terms of prediction accuracy and execution time. Our results indicate that our two best approaches can predict average application execution times with an average relative error of at most 7%. Moreover, both of them run very fast (requiring at least two orders of magnitude less execution time than widely used tools while providing slightly better accuracy), making them practical for online prediction.
The definition of smart city pillars, along with their nature and essence, continues to be debated in the scientific literature. The vast amount of information collected by electronic devices, often regarded merely as a means of rationalizing the use of resources and improving efficiency, could also be considered a pillar. Information by itself cannot be deciphered or understood without analysis performed by algorithms based on Artificial Intelligence. Such analysis extracts new forms of knowledge in the shape of correlations and patterns, which are used to support the decision-making processes associated with governance and, ultimately, to define new policies. Alongside information, energy plays a crucial role in smart cities, as many activities that lead to growth in the economy and employment depend on this pillar. As a result, it is crucial to highlight the link between energy and the algorithms able to plan and forecast the energy consumption of smart cities. This paper highlights how AI and information together can legitimately be considered foundational pillars of smart cities only when their real impact, or value, has been assessed. Furthermore, Artificial Intelligence can be deployed to support smart grids, electric vehicles, and smart buildings by providing techniques and methods to enhance their innovative value and measured efficiency.