2021 IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP) 2021
DOI: 10.1109/icse-seip52600.2021.00043

MicroHECL: High-Efficient Root Cause Localization in Large-Scale Microservice Systems

Abstract: Availability issues of industrial microservice systems (e.g., drop of successfully placed orders and processed transactions) directly affect the running of the business. These issues are usually caused by various types of service anomalies which propagate along service dependencies. Accurate and high-efficient root cause localization is thus a critical challenge for large-scale industrial microservice systems. Existing approaches use service dependency graph based analysis techniques to automatically locate ro…

Cited by 95 publications (32 citation statements)
References 18 publications
“…Recently there have been works on root cause analysis or anomaly localization in microservice-based systems, e.g. [8], [9], where the goal is to find the failing microservice among the services.…”
Section: Introduction (mentioning)
confidence: 99%
“…Multivariate time series serves as an important reference of the "health status" of many Internet applications such as cloud computing, micro-service systems and the Internet routing network, to name a few. For example, consider the cloud computing system, which has become a crucial infrastructure for many customized services of companies and governments [22,26,19]. To maintain a high quality of service of the cloud computing system, it is important to monitor the "health status" of the cloud system and perform troubleshooting in real-time [19,22,26].…”
Section: Introduction (mentioning)
confidence: 99%
“…For example, consider the cloud computing system, which has become a crucial infrastructure for many customized services of companies and governments [22,26,19]. To maintain a high quality of service of the cloud computing system, it is important to monitor the "health status" of the cloud system and perform troubleshooting in real-time [19,22,26]. To achieve this, many performance metrics of a cloud computing system such as CPU usage, disk I/O rate, network packet loss rate, etc., are monitored in real-time, which form a MTS [19,22,26].…”
Section: Introduction (mentioning)
confidence: 99%