2019 IEEE International Conference on Web Services (ICWS)
DOI: 10.1109/icws.2019.00023
Microscaler: Automatic Scaling for Microservices with an Online Learning Approach

Cited by 78 publications (41 citation statements)
References 10 publications
“…The authors of this study focused on microservices scheduling to minimize the end-to-end response delay under a pre-specified budget constraint. Another effort [15] used response latency as the driving factor to trigger autoscaling of cloud microservices. Moreover, [16] investigated the impact of heap size, garbage collection, concurrency, and service demand on the tail latency of Java microservices.…”
Section: End-to-End Latency Prediction (mentioning)
confidence: 99%
“…Since an offline supervised learning algorithm cannot output path indexes in real time, it is necessary to introduce online machine learning algorithms (for example, reinforcement learning) that adaptively adjust the model in response to major events. Secondly, we may also need to apply machine learning methods to dynamically build resource models under different workloads, that is, to upgrade the studied mesh components toward an Istio-like service mesh [37]. Furthermore, the throughput performance of the transport network is constrained by the service request rate, the stream access-control strategy, the mesh network topology, and the machine learning models.…”
Section: Comparison of the Number of Outages With and Without Retransmission (mentioning)
confidence: 99%
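The excerpt above argues for replacing offline supervised models with online learning so that routing decisions can adapt in real time. The sketch below is a minimal, hypothetical illustration of that idea, not the cited paper's method: an epsilon-greedy bandit that picks a path index online and updates its value estimates from observed latency feedback. The class name `PathSelector` and the reward definition (negative latency) are assumptions made for illustration.

```python
import random

class PathSelector:
    """Minimal epsilon-greedy online learner for choosing a path index.

    Hypothetical sketch: rewards are taken to be the negative observed
    latency, so lower-latency paths accumulate higher value estimates.
    """

    def __init__(self, num_paths, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * num_paths    # times each path was chosen
        self.values = [0.0] * num_paths  # running mean reward per path

    def choose(self):
        # Explore with probability epsilon, otherwise exploit the best path.
        if random.random() < self.epsilon:
            return random.randrange(len(self.values))
        return max(range(len(self.values)), key=lambda i: self.values[i])

    def update(self, path, observed_latency_ms):
        # Online (incremental) mean update: no offline retraining pass needed.
        reward = -observed_latency_ms
        self.counts[path] += 1
        self.values[path] += (reward - self.values[path]) / self.counts[path]


# Usage sketch: feed back the latency measured after each routing decision.
selector = PathSelector(num_paths=3)
for latency in [120.0, 80.0, 95.0, 70.0]:
    path = selector.choose()
    selector.update(path, latency)
```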
“…Setting up autoscaling thresholds normally involves initial guesswork and a couple of rounds of fine-tuning based on real-time experience in production environments. This guessing process can lead to situations where the thresholds are under-configured, so that the service becomes unavailable to callers during peak load, or where extra compute resources are provisioned that go unused because the thresholds are over-configured, causing monetary losses [6] [7]. Some researchers, such as Muhammad Abdullah and others [24], have addressed this problem for microservices already running in production by using a resource prediction model trained on historical autoscaling performance traces.…”
Section: Related Work (mentioning)
confidence: 99%
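As a concrete illustration of the threshold-tuning problem described above, the sketch below shows a simple reactive scaler: replicas are added when a latency metric exceeds an upper threshold and removed when it falls below a lower one. The metric, threshold values, and replica bounds are hypothetical placeholders; mis-setting them reproduces exactly the under-/over-provisioning trade-off the excerpt describes.

```python
def desired_replicas(current_replicas, p95_latency_ms,
                     upper_ms=250.0, lower_ms=100.0,
                     min_replicas=1, max_replicas=20):
    """Reactive threshold-based scaling decision (illustrative sketch).

    If the upper threshold is set too high, the service may saturate at peak
    load before scaling out; if it is set too low, replicas sit idle and cost
    money -- the tuning dilemma discussed in the cited excerpt.
    """
    if p95_latency_ms > upper_ms:
        # Scale out by one replica when latency breaches the upper bound.
        return min(current_replicas + 1, max_replicas)
    if p95_latency_ms < lower_ms:
        # Scale in by one replica when latency is comfortably low.
        return max(current_replicas - 1, min_replicas)
    return current_replicas


# Usage sketch: evaluate the rule on each monitoring tick.
print(desired_replicas(current_replicas=3, p95_latency_ms=310.0))  # -> 4
print(desired_replicas(current_replicas=3, p95_latency_ms=60.0))   # -> 2
```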
“…That makes the value of m, n in equation (7) equal to 10. Using these values, the resource utilization at any point in time T can be calculated as per equation (7). To calculate the total resource utilization, we used equation (9), based on a Riemann sum, splitting the time interval at 10, 20, 30, 40, 50, and 60 seconds.…”
Section: Figure 6 CPU Utilization Trend for Large Function (mentioning)
confidence: 99%
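The total-utilization calculation in this excerpt amounts to a Riemann-sum approximation over sampled utilization values. Since equations (7) and (9) of the citing paper are not reproduced here, the sketch below only illustrates the general approximation, using hypothetical sample values at the 10-second boundaries mentioned in the excerpt.

```python
def riemann_total_utilization(samples, delta_t=10.0):
    """Approximate total resource utilization over a window as a Riemann sum.

    samples: utilization values u(t_i) taken at successive interval
    boundaries (here every 10 s, matching the 10..60 s split in the excerpt).
    """
    return sum(u * delta_t for u in samples)


# Hypothetical CPU-utilization samples (fraction of a core) at 10 s steps;
# these values are illustrative, not taken from the cited paper.
cpu_samples = [0.35, 0.42, 0.58, 0.61, 0.47, 0.39]
total = riemann_total_utilization(cpu_samples)  # core-seconds over 60 s
print(total)
```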