Abstract-Good performance and efficiency (e.g., high quality of service and resource utilization) are important goals in a cloud environment. Through extensive measurements of an n-tier application benchmark (RUBBoS), we show that overall system performance is surprisingly sensitive to appropriate allocation of soft resources (e.g., server thread pool size). Inappropriate soft resource allocation can quickly and significantly degrade overall application performance. Concretely, both under-allocation and over-allocation of the thread pool can lead to bottlenecks in other resources because of non-trivial dependencies. We have observed some non-obvious phenomena due to these correlated bottlenecks. For instance, the number of threads in the Apache web server can limit the total useful throughput, causing the CPU utilization of the C-JDBC clustering middleware to decrease as the workload increases. We provide a practical iterative solution approach to this challenge through an algorithmic combination of operational queuing laws and measurement data. Our results show that soft resource allocation plays a central role in the performance scalability of complex systems such as n-tier applications in cloud environments.
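For intuition on how operational queuing laws connect measurement data to soft resource sizing, the following sketch applies Little's Law (N = X * R) to estimate a thread pool size from measured throughput and residence time. It is an illustrative example with hypothetical numbers, not the paper's iterative algorithm.

```python
# Illustrative sketch (not the paper's algorithm): using Little's Law
# (N = X * R) to estimate a thread pool size from measured data.
# All numbers below are hypothetical placeholders.

def estimate_pool_size(throughput_rps, avg_residence_time_s, headroom=1.2):
    """Estimate the threads needed so the pool itself is not the bottleneck.

    throughput_rps       -- measured request completion rate (X)
    avg_residence_time_s -- measured time a request spends in this tier (R)
    headroom             -- safety factor against bursts (assumption)
    """
    # Little's Law: average number of requests concurrently in the tier.
    avg_concurrency = throughput_rps * avg_residence_time_s  # N = X * R
    return max(1, round(avg_concurrency * headroom))

# Example: 400 req/s with 50 ms average residence time in the web tier.
print(estimate_pool_size(400, 0.050))   # -> 24 threads
```

Under-allocation caps concurrency below this estimate and queues requests upstream; large over-allocation pushes excess concurrency downstream, which is how bottlenecks in other resources arise.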
Identifying the location of performance bottlenecks is a non-trivial challenge when scaling n-tier applications in computing clouds. Specifically, we observed that an n-tier application may experience significant performance loss when there are transient bottlenecks in component servers. Such transient bottlenecks arise frequently at high resource utilization and often result from transient events in the n-tier system (e.g., JVM garbage collection) and bursty workloads. Because of their short lifespan (e.g., milliseconds), these transient bottlenecks are difficult to detect with current system monitoring tools, which sample at intervals of seconds or minutes. We describe a novel transient bottleneck detection method that correlates the throughput (i.e., request service rate) and load (i.e., number of concurrent requests) of each server in an n-tier system at fine time granularity. Both throughput and load can be measured through passive network tracing at millisecond-level time granularity. Using correlation analysis, we can identify transient bottlenecks at time granularities as short as 50 ms. We validate our method experimentally through two case studies on transient bottlenecks caused by factors at the system software layer (e.g., JVM garbage collection) and the architecture layer (e.g., Intel SpeedStep).
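The sketch below illustrates the kind of fine-grained throughput/load computation the method relies on. It assumes per-request (start, end) timestamps reconstructed from passive network traces and a 50 ms window; it is an illustration of the idea, not the authors' implementation.

```python
# Illustrative sketch: per-window throughput and load for one server,
# computed from hypothetical (start, end) request timestamps in seconds.
import numpy as np

def per_window_metrics(requests, window=0.050, horizon=None):
    """Return (throughput in req/s, average load) per window."""
    horizon = horizon or max(end for _, end in requests)
    n_win = int(np.ceil(horizon / window))
    completions = np.zeros(n_win)   # requests finished in each window
    busy = np.zeros(n_win)          # request-seconds accumulated in each window

    for start, end in requests:
        completions[min(int(end / window), n_win - 1)] += 1
        # Spread each request's lifetime over the windows it overlaps.
        w = int(start / window)
        while w < n_win and w * window < end:
            busy[w] += min(end, (w + 1) * window) - max(start, w * window)
            w += 1

    return completions / window, busy / window   # req/s, avg. concurrent requests

# Example with three synthetic requests.
tp, ld = per_window_metrics([(0.00, 0.03), (0.01, 0.12), (0.06, 0.09)])
# A transient bottleneck shows up as windows where a server's load keeps
# rising while its throughput flattens; correlating the two series per
# server locates where the saturation occurs.
```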
Dynamic Voltage and Frequency Scaling (DVFS) has been widely deployed and proven to reduce energy consumption at low CPU utilization levels; however, our measurements of the n-tier application benchmark (RUBBoS) showed significant performance degradation at high utilization levels when DVFS is turned on, with response times several times higher and throughput losses of up to 20%. Using a combination of benchmark measurements and simulation, we found two kinds of problems: large response time fluctuations due to push-back wave queuing in n-tier systems, and throughput loss due to rapidly alternating bottlenecks. These problems arise from anti-synchrony between the DVFS adjustment period and workload burst cycles (similar cycle length but out of phase). Simulation results (confirmed by extensive measurements) show that this anti-synchrony arises routinely for a wide range of configurations. We show that a workload-sensitive DVFS adaptive control mechanism can disrupt the anti-synchrony and reduce the performance impact of DVFS at high utilization levels to 25% or less of the original.
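As a rough illustration of the anti-synchrony mechanism, the following toy model (not the paper's simulator or measurements) compares queue backlog with and without a reactive DVFS governor when the workload's burst/lull cycle matches the governor's adjustment interval: the governor always reacts to the previous interval, so it runs the CPU slowly exactly when bursts arrive.

```python
# Toy simulation of the anti-synchrony effect (a simplified illustration).
# A reactive governor sets the next interval's CPU speed from the previous
# interval's utilization; with a burst/lull cycle equal to the adjustment
# interval, the CPU ends up slow during bursts and fast during lulls.
def simulate(reactive_dvfs, dvfs_period=50, burst_len=50,
             burst_load=0.9, idle_load=0.1, steps=4000):
    speed, queue, util_acc, max_queue = 1.0, 0.0, 0.0, 0.0
    for t in range(steps):                                # 1 ms time steps
        burst = (t % (2 * burst_len)) < burst_len         # alternating burst/lull
        queue += burst_load if burst else idle_load       # offered CPU work (ms)
        done = min(queue, speed)                          # work completed this ms
        queue -= done
        util_acc += done / speed
        max_queue = max(max_queue, queue)
        if reactive_dvfs and (t + 1) % dvfs_period == 0:  # governor decision
            speed = 1.0 if util_acc / dvfs_period > 0.7 else 0.5
            util_acc = 0.0
    return max_queue

print("DVFS on :", simulate(True))    # large backlog -> inflated response times
print("DVFS off:", simulate(False))   # bursts absorbed at full speed
```

A workload-sensitive adaptation, in this toy setting, would correspond to keeping the speed high whenever recent intervals contained bursts instead of reacting only to the last interval's average utilization.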
Abstract-A central goal of cloud computing is high resource utilization through hardware sharing; however, utilization often remains modest in practice due to the challenges of predicting consolidated application performance accurately. We present a thorough experimental study of consolidated n-tier application performance at high utilization, addressing this issue through reproducible measurements. Our experimental method illustrates opportunities for increasing operational efficiency by making consolidated application performance more predictable in high-utilization scenarios. The main focus of this paper is the non-trivial dependencies between SLA-critical response time degradation effects and software configurations (i.e., readily available tuning knobs). Methodologically, we directly measure and analyze the resource utilizations, request rates, and performance of two consolidated n-tier application benchmark systems (RUBBoS) in an enterprise-level computer virtualization environment. We find that monotonically increasing the workload of one n-tier application system may unexpectedly spike the overall response time of another co-located system by 300 percent even though throughput remains stable. Based on these findings, we derive a software configuration best practice that mitigates such non-monotonic response time variations by enabling higher request-processing concurrency (e.g., more threads) in all tiers. More generally, this experimental study increases our quantitative understanding of the challenges and opportunities in the widely used (but seldom supported, quantified, or even mentioned) hypothesis that applications consolidate with linear performance in cloud environments.
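A minimal sketch of how the concurrency best practice might be checked, assuming hypothetical per-tier concurrency limits and peak concurrency measurements (not the paper's tool): flag any tier whose configured soft limit (thread or connection pool) is close to its observed peak, since such a hidden cap is what produces the non-monotonic response time spikes under consolidation.

```python
# Hypothetical per-tier data: configured concurrency limit vs. peak observed
# concurrency under consolidated load. All numbers are placeholders.
tiers = {
    "apache": {"limit": 300, "peak": 140},
    "tomcat": {"limit": 50,  "peak": 50},   # soft resource saturated
    "cjdbc":  {"limit": 200, "peak": 90},
    "mysql":  {"limit": 300, "peak": 85},
}

for name, t in tiers.items():
    if t["peak"] >= 0.9 * t["limit"]:
        print(f"{name}: raise concurrency limit (peak {t['peak']} vs limit {t['limit']})")
```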