The size and complexity of modern data centers pose scalability issues for the resource monitoring systems that support management operations such as server consolidation. In the move from cloud to multi-cloud systems, these issues are exacerbated by the need to manage geographically distributed data centers and to exchange monitored data across them. While existing solutions typically consider every Virtual Machine (VM) as a black box with independent characteristics, we claim that scalability issues in multi-cloud systems could be addressed by clustering together VMs that show similar behaviors in terms of resource usage. In this paper, we propose an automated methodology to cluster VMs starting from the usage of multiple resources, assuming no knowledge of the services executed on them. This innovative methodology exploits the Bhattacharyya distance to measure the similarity of the probability distributions of VM resource usage, and automatically selects the most relevant resources to consider for the clustering process. The methodology is evaluated through a set of experiments with data from a cloud provider. We show that our proposal achieves high and stable performance in terms of automatic VM clustering. Moreover, we estimate the reduction in the amount of data collected to support system management in the considered scenario, thus showing how the proposed methodology may reduce the monitoring requirements in multi-cloud systems.
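As a rough illustration of the similarity measure named in the abstract above, the following minimal sketch computes the Bhattacharyya distance between two discrete usage distributions. It is not the authors' implementation: the helper names (`histogram`, `bhattacharyya_distance`), the equal-width binning, and the sample CPU values are all assumptions made for the example.

```python
import math


def histogram(samples, bins, lo, hi):
    """Normalized equal-width histogram of `samples` over [lo, hi].

    Hypothetical helper: the binning strategy is an assumption of this sketch.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for x in samples:
        # Clamp out-of-range values into the first or last bin.
        idx = max(0, min(int((x - lo) / width), bins - 1))
        counts[idx] += 1
    total = len(samples)
    return [c / total for c in counts]


def bhattacharyya_distance(p, q):
    """Bhattacharyya distance between two discrete distributions p and q.

    BC(p, q) = sum_i sqrt(p_i * q_i);  D_B(p, q) = -ln BC(p, q).
    Returns infinity when the distributions do not overlap at all.
    """
    bc = sum(math.sqrt(pi * qi) for pi, qi in zip(p, q))
    bc = min(bc, 1.0)  # guard against floating-point overshoot
    if bc == 0.0:
        return math.inf
    return -math.log(bc)


# Made-up CPU utilization samples (percent) for two VMs.
cpu_vm_a = [10, 12, 11, 50, 52, 13, 9]
cpu_vm_b = [11, 14, 10, 48, 55, 12, 8]

p = histogram(cpu_vm_a, bins=10, lo=0, hi=100)
q = histogram(cpu_vm_b, bins=10, lo=0, hi=100)
print(bhattacharyya_distance(p, q))
```

A distance near zero indicates nearly identical usage distributions, so a matrix of pairwise distances could feed any standard clustering algorithm. Which algorithm the paper uses is not stated in the abstract.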
Fog computing is becoming popular as a solution to support applications based on geographically distributed sensors that produce huge volumes of data to be processed and filtered under response time constraints. In this scenario, typical of a smart city environment, the traditional cloud paradigm, with a few powerful data centers located far away from the sources of data, becomes inadequate. The fog computing paradigm, which provides a distributed infrastructure of nodes placed close to the data sources, represents a better solution to perform filtering, aggregation, and preprocessing of incoming data streams, reducing the experienced latency and increasing the overall scalability. However, many issues still exist regarding the efficient management of a fog computing architecture, such as the distribution of data streams coming from sensors over the fog nodes to minimize the experienced latency. The contribution of this paper is two-fold. First, we present an optimization model for the problem of mapping data streams over fog nodes, considering not only the current load of the fog nodes, but also the communication latency between sensors and fog nodes. Second, to address the complexity of the problem, we present a scalable heuristic based on genetic algorithms. We carried out a set of experiments based on a realistic smart city scenario: the results show that the performance of the proposed heuristic is comparable with that achieved by solving the optimization problem. Then, we compared different genetic evolution strategies and operators, identifying uniform crossover as the best option. Finally, we performed a wide sensitivity analysis to show the stability of the heuristic performance with respect to its main parameters.
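The uniform crossover operator mentioned in the abstract above can be sketched as follows. This is only an illustration under stated assumptions, not the paper's heuristic: the chromosome encoding (gene i holds the index of the fog node serving sensor i), the function name `uniform_crossover`, and the `swap_prob` parameter are hypothetical choices for the example.

```python
import random


def uniform_crossover(parent_a, parent_b, rng, swap_prob=0.5):
    """Uniform crossover: each gene is swapped between parents independently.

    Assumed encoding (hypothetical): gene i is the index of the fog node
    that serves sensor i, so both parents have one gene per sensor.
    """
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have the same length")
    child_a, child_b = [], []
    for gene_a, gene_b in zip(parent_a, parent_b):
        # With probability swap_prob the two children exchange this gene.
        if rng.random() < swap_prob:
            gene_a, gene_b = gene_b, gene_a
        child_a.append(gene_a)
        child_b.append(gene_b)
    return child_a, child_b


# Two candidate mappings of 6 sensors onto 3 fog nodes (made-up values).
rng = random.Random(42)
mapping_1 = [0, 0, 1, 2, 2, 1]
mapping_2 = [1, 2, 1, 0, 2, 0]
print(uniform_crossover(mapping_1, mapping_2, rng))
```

Because every gene is inherited from one of the two parents at the same position, each child remains a valid sensor-to-node mapping. In a full genetic algorithm, a fitness function reflecting node load and sensor-to-node latency would then rank the children; its exact form is not given in the abstract.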
Cloud computing has recently emerged as a new paradigm to provide computing services through large data centers where customers may run their applications in a virtualized environment. The advantages of cloud in terms of flexibility and economy encourage many enterprises to migrate from local data centers to cloud platforms, thus contributing to the success of such infrastructures. However, as the size and complexity of cloud infrastructures grow, scalability issues arise in monitoring and management processes. Scalability issues are exacerbated because available solutions typically consider each virtual machine (VM) as a black box with independent characteristics, which is monitored at a fine granularity for management purposes, thus generating huge amounts of data to handle. We claim that scalability issues can be addressed by leveraging the similarity between VMs in terms of resource usage patterns. In this paper, we propose an automated methodology to cluster similar VMs starting from their resource usage information, assuming no knowledge of the software executed on them. This is an innovative methodology that combines the Bhattacharyya distance and ensemble techniques to provide a stable evaluation of similarity between probability distributions of multiple VM resource usage, considering both system- and network-related data. We evaluate the methodology through a set of experiments on data coming from an enterprise data center. We show that our proposal achieves high and stable performance in automatic VM clustering, with a significant reduction in the amount of data collected, which lightens the monitoring requirements of a cloud data center.