The dramatic proliferation of virtual machines (VMs) in datacenters and the highly dynamic and transient nature of VM provisioning have revolutionized datacenter operations. However, these environments are still managed using re-purposed versions of traditional agents, originally developed for managing physical systems, or, more recently, via newer virtualization-aware alternatives that require guest cooperation and accessibility. We show that these existing approaches are a poor match for monitoring and managing (virtual) systems in the cloud because of their dependence on guest cooperation and operational health, and their growing lifecycle management overheads. In this work, we first present Near Field Monitoring (NFM), our non-intrusive, out-of-band cloud monitoring and analytics approach, which is designed around cloud operation principles to address the limitations of existing techniques. NFM decouples system execution from monitoring and analytics functions by pushing monitoring out of the target systems' scope. By leveraging and extending VM introspection techniques, our framework provides simple, standard interfaces to monitor running systems in the cloud that require no guest cooperation or modification, and have minimal effect on guest execution. By decoupling monitoring and analytics from target system context, NFM provides "always-on" monitoring, even when the target system is unresponsive. NFM also works "out-of-the-box" for any cloud instance, as it eliminates any need for installing and maintaining agents or hooks in the monitored systems. We describe the end-to-end implementation of our framework with two real-system prototypes based on two virtualization platforms. We discuss the new cloud analytics opportunities enabled by our decoupled execution, monitoring and analytics architecture. We present four applications that are built on top of our framework and show their use for across-time and across-system analytics.
Today's cloud service architectures follow a "one size fits all" deployment strategy in which the same service version is provided to all end users. However, the consumer base is broad, and different applications have different accuracy and responsiveness requirements, which, as we demonstrate, renders the "one size fits all" approach inefficient in practice. We use a production-grade speech recognition engine, which serves several thousand users, and an open-source computer-vision-based system to illustrate this point. To overcome the limitations of the "one size fits all" approach, we recommend Tolerance Tiers, where each machine-learning-as-a-service (MLaaS) tier exposes an accuracy/responsiveness characteristic and consumers can programmatically select a tier. We evaluate our proposal on the CPU-based automatic speech recognition (ASR) engine and on cutting-edge neural networks for image classification deployed on both CPUs and GPUs. The results show that our proposed approach provides an MLaaS cloud service architecture that can be tuned by the end API user or consumer to outperform the conventional "one size fits all" approach.
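To make the idea of programmatic tier selection concrete, the following is a minimal sketch, not the authors' API. It assumes each tier advertises an error level relative to the most accurate tier and a typical latency, and that the consumer states the largest error it will tolerate. The names `Tier`, `select_tier`, and `TIERS`, and all numeric values, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Tier:
    """One hypothetical service tier; the numbers below are made up, not measured."""
    name: str
    relative_error: float  # error relative to the most accurate tier (0.0 = best)
    latency_ms: float      # typical response time in milliseconds


def select_tier(tiers: Sequence[Tier], max_relative_error: float) -> Optional[Tier]:
    """Return the fastest tier whose error stays within the consumer's tolerance.

    Returns None when no tier satisfies the tolerance.
    """
    eligible = [t for t in tiers if t.relative_error <= max_relative_error]
    return min(eligible, key=lambda t: t.latency_ms) if eligible else None


# Illustrative catalogue (assumed values).
TIERS = [
    Tier("accurate", relative_error=0.00, latency_ms=900.0),
    Tier("balanced", relative_error=0.02, latency_ms=400.0),
    Tier("fast", relative_error=0.10, latency_ms=120.0),
]

if __name__ == "__main__":
    chosen = select_tier(TIERS, max_relative_error=0.05)
    print(chosen.name if chosen else "no tier satisfies the tolerance")
```

Under these assumed values, a consumer that tolerates up to 5% additional error is routed to the faster "balanced" tier, while a consumer that tolerates none stays on the "accurate" tier.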