To ensure undisrupted business, large Internet companies need to closely monitor various KPIs (e.g., Page Views, number of online users, and number of orders) of its Web applications, to accurately detect anomalies and trigger timely troubleshooting/mitigation. However, anomaly detection for these seasonal KPIs with various patterns and data quality has been a great challenge, especially without labels. In this paper, we proposed Donut, an unsupervised anomaly detection algorithm based on VAE. Thanks to a few of our key techniques, Donut greatly outperforms a state-of-arts supervised ensemble approach and a baseline VAE approach, and its best F-scores range from 0.75 to 0.9 for the studied KPIs from a top global Internet company. We come up with a novel KDE interpretation of reconstruction for Donut, making it the first VAE-based anomaly detection algorithm with solid theoretical explanation.
Recording runtime status via logs is common for almost every computer system, and detecting anomalies in logs is crucial for timely identifying malfunctions of systems. However, manually detecting anomalies for logs is time-consuming, error-prone, and infeasible. Existing automatic log anomaly detection approaches, using indexes rather than semantics of log templates, tend to cause false alarms. In this work, we propose LogAnomaly, a framework to model unstructured a log stream as a natural language sequence. Empowered by template2vec, a novel, simple yet effective method to extract the semantic information hidden in log templates, LogAnomaly can detect both sequential and quantitive log anomalies simultaneously, which were not done by any previous work. Moreover, LogAnomaly can avoid the false alarms caused by the newly appearing log templates between periodic model retrainings. Our evaluation on two public production log datasets show that LogAnomaly outperforms existing log-based anomaly detection methods.
Abstract-Despite significant efforts to obtain an accurate picture of the Internet's connectivity structure at the level of individual autonomous systems (ASes), much has remained unknown in terms of the quality of the inferred AS maps that have been widely used by the research community. In this paper we assess the quality of the inferred Internet maps through case studies of a sample set of ASes. These case studies allow us to establish the ground truth of connectivity between this set of ASes and their directly connected neighbors. A direct comparison between the ground truth and inferred topology maps yield insights into questions such as which parts of the actual topology are adequately captured by the inferred maps, which parts are missing and why, and what is the percentage of missing links in these parts. This information is critical in assessing, for each class of real-world networking problems, whether the use of currently inferred AS maps or proposed AS topology models is, or is not, appropriate. More importantly, our newly gained insights also point to new directions towards building realistic and economically viable Internet topology maps.
Ranking is a core task in recommender systems, which aims at providing an ordered list of items to users. Typically, a ranking function is learned from the labeled dataset to optimize the global performance, which produces a ranking score for each individual item. However, it may be sub-optimal because the scoring function applies to each item individually and does not explicitly consider the mutual in uence between items, as well as the di erences of users' preferences or intents. erefore, we propose a personalized re-ranking model for recommender systems.e proposed re-ranking model can be easily deployed as a follow-up modular a er any ranking algorithm, by directly using the existing ranking feature vectors. It directly optimizes the whole recommendation list by employing a transformer structure to e ciently encode the information of all items in the list. Speci cally, the Transformer applies a self-a ention mechanism that directly models the global relationships between any pair of items in the whole list. We conrm that the performance can be further improved by introducing pre-trained embedding to learn personalized encoding functions for di erent users. Experimental results on both o ine benchmarks and real-world online e-commerce systems demonstrate the signi cant improvements of the proposed re-ranking model.
scite is a Brooklyn-based organization that helps researchers better discover and understand research articles through Smart Citations–citations that display the context of the citation and describe whether the article provides supporting or contrasting evidence. scite is used by students and researchers from around the world and is funded in part by the National Science Foundation and the National Institute on Drug Abuse of the National Institutes of Health.
customersupport@researchsolutions.com
10624 S. Eastern Ave., Ste. A-614
Henderson, NV 89052, USA
This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.
Copyright © 2024 scite LLC. All rights reserved.
Made with 💙 for researchers
Part of the Research Solutions Family.