Recent research and bug reports have shown that work conservation, the property that a core is idle only if no other core is overloaded, is not guaranteed by Linux's CFS or FreeBSD's ULE multicore schedulers. Indeed, multicore schedulers are challenging to specify and verify: they must operate under stringent performance requirements while handling very large numbers of concurrent operations on threads. As a consequence, the verification of correctness properties of schedulers has not yet been considered. In this paper, we propose an approach, based on a domain-specific language and theorem provers, for developing schedulers with provable properties. We introduce the notion of concurrent work conservation (CWC), a relaxed definition of work conservation that can be achieved in a concurrent system where threads can be created, unblocked, and blocked concurrently with other scheduling events. We implement several scheduling policies, inspired by CFS and ULE. We show that our schedulers achieve the same level of performance as production schedulers while satisfying concurrent work conservation.
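To make the property concrete, here is a minimal C sketch of the instantaneous work-conservation check described above: a snapshot violates the invariant exactly when an idle core coexists with an overloaded one. The `struct core` layout and the `work_conserving` function are illustrative assumptions, not taken from the paper's domain-specific language; CWC, as the abstract explains, relaxes this instantaneous check to tolerate in-flight thread creations, blocks, and unblocks.

```c
/* A minimal sketch of the work-conservation invariant: a core may be
 * idle only if no other core is overloaded (i.e., has more than one
 * runnable thread). All names here are illustrative. */
#include <stdbool.h>
#include <stddef.h>

struct core {
    size_t nr_runnable;   /* threads currently runnable on this core */
};

/* Returns true if the snapshot satisfies (instantaneous) work
 * conservation: no core is idle while another holds a waiting thread. */
bool work_conserving(const struct core *cores, size_t nr_cores)
{
    bool some_idle = false, some_overloaded = false;

    for (size_t i = 0; i < nr_cores; i++) {
        if (cores[i].nr_runnable == 0)
            some_idle = true;
        else if (cores[i].nr_runnable > 1)
            some_overloaded = true;
    }
    /* Violated exactly when an idle core and an overloaded core coexist. */
    return !(some_idle && some_overloaded);
}
```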
The container mechanism supports server consolidation; to ensure memory performance isolation, Linux relies on static memory limits. However, these static limits result in poor performance, because an application's needs are dynamic. In this article, we show the current problems with memory consolidation for containers in Linux.
The container mechanism amortizes costs by consolidating several servers onto the same machine, while keeping them mutually isolated. Specifically, to ensure performance isolation, Linux relies on memory limits. These limits are static, despite the fact that application needs are dynamic; this results in poor performance. To solve this issue, MemOpLight uses dynamic application feedback to rebalance physical memory allocation between containers, focusing on under-performing ones. This paper presents the issues, explains the design of MemOpLight, and validates it experimentally. Our approach increases total satisfaction by 13% compared to the default.
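As a rough illustration of the feedback idea, the following hypothetical user-level sketch shifts memory from containers that report their performance target as met to those that do not. All names (`struct container`, `satisfied`, `rebalance`) and the 16 MiB step are assumptions made for exposition; MemOpLight itself operates inside the kernel on actual memory limits.

```c
/* Illustrative sketch of feedback-driven rebalancing: periodically move
 * memory from containers that meet their performance target to those
 * that do not. Hypothetical names and granularity. */
#include <stdbool.h>
#include <stddef.h>

struct container {
    long limit_bytes;     /* current soft memory limit */
    bool satisfied;       /* application-reported: target met? */
};

#define STEP (16L << 20)  /* rebalance granularity: 16 MiB (arbitrary) */

void rebalance(struct container *c, size_t n)
{
    long pool = 0;

    /* Reclaim a slice from every satisfied container... */
    for (size_t i = 0; i < n; i++) {
        if (c[i].satisfied && c[i].limit_bytes > STEP) {
            c[i].limit_bytes -= STEP;
            pool += STEP;
        }
    }
    /* ...and grant it to under-performing ones. */
    for (size_t i = 0; i < n && pool >= STEP; i++) {
        if (!c[i].satisfied) {
            c[i].limit_bytes += STEP;
            pool -= STEP;
        }
    }
}
```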
The thriving success of the cloud industry relies greatly on the fact that virtual resources are as good as bare-metal resources when it comes to ensuring a given level of quality of service. Thanks to the isolation provided by virtualization techniques based on hypervisors, a big physical resource can be spatially multiplexed into smaller virtual resources that are easier to sell. Unfortunately, virtual machines have quickly shown their limits in terms of temporal multiplexing: it has been demonstrated that reclaiming the unused memory of a VM is a tedious task, infeasible in production. Today, containerization opens up a wide range of multiplexing opportunities that were not accessible through machine virtualization. However, in this article, we demonstrate, through a reproducible experiment, that the current implementation of memory consolidation can deteriorate the performance of applications deployed in Linux kernel containers. Indeed, we observed that when a new container boots, the memory of active containers is reclaimed while unused memory is still available in other, inactive containers. To tackle these performance drops in active containers, we have rethought the hierarchical memory reclaim mechanism of the Linux kernel. We have implemented inside the kernel a new approach that tracks the container that made a memory demand least recently. Our evaluations show that our approach provides the ability to reclaim memory without degrading the performance of active containers.
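The victim-selection idea can be sketched as follows: record a timestamp on every memory demand a container makes, and reclaim from the container whose most recent demand is oldest, leaving active containers undisturbed. This is a hypothetical illustration; the field and function names are invented, and the real mechanism is integrated with the kernel's hierarchical reclaim.

```c
/* Sketch of least-recently-demanding victim selection: reclaim from the
 * container whose most recent memory demand is the oldest. Names are
 * hypothetical; this is not the actual kernel implementation. */
#include <stddef.h>
#include <stdint.h>

struct container {
    uint64_t last_demand; /* timestamp of most recent page demand */
};

/* Called on every allocation charged to a container. */
void note_demand(struct container *c, uint64_t now)
{
    c->last_demand = now;
}

/* Pick the container that demanded memory least recently. */
struct container *reclaim_victim(struct container *c, size_t n)
{
    struct container *victim = NULL;

    for (size_t i = 0; i < n; i++)
        if (!victim || c[i].last_demand < victim->last_demand)
            victim = &c[i];
    return victim;
}
```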
The complexity of computer architectures has risen since the early years of the Linux kernel: Simultaneous Multi-Threading (SMT), multicore processing, and frequency scaling with complex algorithms such as Intel® Turbo Boost have all become omnipresent. In order to keep up with hardware innovations, the Linux scheduler has been rewritten several times, and many hardware-related heuristics have been added. Despite this, we show in this paper that a fundamental problem was never identified: the POSIX process creation model, i.e., fork/wait, can behave inefficiently on current multicore architectures due to frequency scaling. We investigate this issue through a simple case study: the compilation of the Linux kernel source tree. To do this, we develop SchedLog, a low-overhead scheduler tracing tool, and SchedDisplay, a scriptable tool to graphically analyze SchedLog's traces efficiently. We implement two solutions to the problem at the scheduler level which improve the speed of compiling part of the Linux kernel by up to 26%, and the whole kernel by up to 10%.
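For context, the problematic pattern is easy to reproduce: a driver process that forks one short-lived child after another and waits for each, as a build system does when spawning compiler processes. The sketch below is an assumed minimal reproduction, not the paper's benchmark; the command (`true`) and the iteration count are arbitrary.

```c
/* Minimal fork/wait loop, mimicking a build system that spawns one
 * short-lived compiler process after another. Under frequency scaling,
 * each child may land on a core still running at a low frequency and
 * repeatedly pay the ramp-up cost. Illustrative only. */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    for (int i = 0; i < 100; i++) {
        pid_t pid = fork();
        if (pid == 0) {
            /* Short-lived child, standing in for one compiler run. */
            execlp("true", "true", (char *)NULL);
            _exit(127);          /* only reached if exec fails */
        }
        waitpid(pid, NULL, 0);   /* parent idles until the child exits */
    }
    return 0;
}
```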