We present a decentralized algorithm for online clustering analysis used for anomaly detection in self-monitoring distributed systems. In particular, we demonstrate the monitoring of a network of printing devices that can perform the analysis without the use of external computing resources (i.e., in-network analysis). We also show how to ensure the robustness of the algorithm, in terms of anomaly detection accuracy, in the face of failures of the network infrastructure on which the algorithm runs. Further, we evaluate the overhead required to ensure this robustness and present a method to reduce this overhead while maintaining the detection accuracy of the algorithm.
Ensuring the efficient and robust operation of distributed computational infrastructures is critical, given that their scale and overall complexity are growing at an alarming rate and that their management is rapidly exceeding human capability. Clustering analysis can be used to find patterns and trends in system operational data, as well as to highlight deviations from these patterns. Such analysis can be essential for verifying the correctness and efficiency of the operation of the system, as well as for discovering specific situations of interest, such as anomalies or faults, that require appropriate management actions. This work analyzes the automated application of clustering for online system management, focusing on the suitability of different clustering approaches for the online analysis of system data in a distributed environment, with minimal prior knowledge and within a timeframe that allows the timely interpretation of and response to clustering results. For this purpose, we evaluate DOC (Decentralized Online Clustering), a clustering algorithm designed to support data analysis for autonomic management, and compare it to existing and widely used clustering algorithms. The comparative evaluations show that DOC achieves a good balance among the trade-offs inherent in this type of online management.