Live migration allows a running operating system (OS) to be moved to another physical machine with negligible downtime. Unfortunately, live migration is not supported in bare-metal clouds, which lease physical machines rather than virtual machines to offer maximum hardware performance. Since bare-metal clouds have no virtualization software, implementing live migration is difficult. Previous studies have proposed OS-level live migration; however, to prevent user intervention and broaden OS choices, live migration should be OS-independent. In addition, the overhead of live migration mechanisms should be as low as possible. This paper introduces BLMVisor, a live migration scheme for bare-metal clouds. To achieve OS-independent and lightweight live migration, BLMVisor utilizes a very thin hypervisor that exposes physical hardware devices to the guest OS directly rather than virtualizing the devices. The hypervisor captures, transfers, and reconstructs physical device states by monitoring access from the guest OS and controlling the physical devices with effective techniques. To minimize performance degradation, the hypervisor is mostly idle after completing the live migration. A performance evaluation confirmed that the OS performance with BLMVisor is comparable to that of a bare-metal machine.
Scientific communities are increasingly adopting machine learning and deep learning models in their applications to accelerate scientific insights. High performance computing systems are pushing the frontiers of performance with a rich diversity of hardware resources and massive scale-out capabilities. There is a critical need to understand fair and effective benchmarking of machine learning applications that are representative of real-world scientific use cases. MLPerf™ is a community-driven standard to benchmark machine learning workloads, focusing on end-to-end performance metrics. In this paper, we introduce MLPerf HPC, a benchmark suite of large-scale scientific machine learning training applications, driven by the MLCommons™ Association. We present the results from the first submission round including a diverse set of some of the world's largest HPC systems. We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence and compute performance. As a result, we gain a quantitative understanding of optimizations on different subsystems such as staging and on-node loading of data, compute-unit utilization and communication scheduling, enabling overall >10× (end-to-end) performance improvements through system scaling. Notably, our analysis shows a scale-dependent interplay between the dataset size, a system's memory hierarchy and training convergence that underlines the importance of near-compute storage. To overcome the data-parallel scalability challenge at large batch sizes, we discuss specific learning techniques and hybrid data-and-model parallelism that are effective on large systems. We conclude by characterizing each benchmark with respect to low-level memory, I/O and network behaviour to parameterize extended roofline performance models in future rounds.
Live migration, underpinned by virtualisation technologies, has enabled improved manageability and fault tolerance for servers. However, virtualised server infrastructures suffer from significant processing overheads, system inconsistencies, security issues and unpredictable performance, which make them unsuitable for low-power and resource-constrained computing devices that process latency-sensitive, "Big-data"-type data. Consequently, we ask: "How do we eliminate the overhead of virtualisation whilst still retaining its benefits?" Motivated by this question, we investigate a practical approach for a bare-metal live migration scheme for ARM-based instances on low-power servers and edge devices. In this paper, we position ARM-based bare-metal live migration as a technique that will underpin the efficiency of edge computing and of microdatacentres. We also introduce our early work on identifying three key technical challenges and discuss their solutions.
Experimental verification of the effects of radially sheared electric-field (or potential) formation in plasmas is one of the most critical issues for understanding the physics basis of plasma confinement improvements. In the GAMMA 10 tandem mirror, recent experimental results show shear formation effects on the suppression of not only coherent drift waves but also turbulence-like fluctuations without any coherent phasing relation during the ion-confining potential formation period. Contours of the central-cell soft x-ray brightness, measured with the 50-channel microchannel plate system, show spatially and temporally fluctuating structures during a weakly sheared period. A new x-ray tomography system is developed for analyzing temporally and spatially resolved plasma behavior in the presence or absence of these shear formation effects in GAMMA 10. The system consists of two 48-channel silicon semiconductor detector arrays with different viewing angles. X-ray energy responses of the new detector arrays, along with the response uniformity of the detector channels, have been characterized using synchrotron radiation at the Photon Factory.