Real-time anomaly detection in massive data streams is an important research topic because large volumes of data are generated by continuous temporal processes. The research area is broad, covering mathematical, statistical, and information-theoretic methodologies for anomaly detection, and it addresses problems in many domains, such as health, education, finance, and government. In this paper, we analyze state-of-the-art techniques and algorithms for anomaly detection in data streams (time series data). We give a short description of the algorithms and categorize them by type of prediction as predictive or non-predictive. We evaluate forecasting models including Holt-Winters (HW), Taylor's Double Holt-Winters (TDHW), Hierarchical Temporal Memory (HTM), Moving Average (MA), and Autoregressive Integrated Moving Average (ARIMA), among others. Having critically surveyed the techniques' performance under the challenge of real-time anomaly detection in massive high-velocity streams, we conclude that modeling the normal behavior of the stream is a suitable approach. The HW and TDHW forecasting models are used to predict the normal behavior of periodic streams and to detect anomalies when the deviation between observed and predicted values exceeds a predefined measure. In this work, we propose an enhancement of this approach. We implement a Genetic Algorithm (GA) to periodically optimize the HW and TDHW smoothing parameters, together with the two sliding-window parameters that improve Hyndman's MASE measure of deviation and the value of the threshold parameter that defines the no-anomaly confidence interval [1]. We also propose a new optimization function based on input training datasets with annotated anomaly intervals, in order to detect the true anomalies and minimize the number of false ones. The proposed method is evaluated on the well-known NUMENTA and Yahoo anomaly detection benchmarks, which contain annotated anomalies, and on real log data generated by the National education information system.
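The forecast-and-threshold idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain additive Holt-Winters with one-step-ahead forecasts, a single sliding window of recent absolute errors as the scale (a simplification, not Hyndman's MASE), and fixed smoothing parameters instead of GA-optimized ones. The function name `detect_anomalies` and all parameter names and default values are assumptions chosen for illustration.

```python
from collections import deque


def detect_anomalies(series, period, alpha=0.3, beta=0.05, gamma=0.2,
                     window=50, threshold=4.0, min_scale=1e-6):
    """Flag points whose one-step-ahead forecast error is large relative
    to the mean absolute error over a sliding window (illustrative only)."""
    if len(series) < 2 * period:
        raise ValueError("need at least two full periods to initialise")

    # Initialise level, trend, and seasonal terms from the first two periods.
    first, second = series[:period], series[period:2 * period]
    level = sum(first) / period
    trend = (sum(second) - sum(first)) / (period * period)
    season = [x - level for x in first]

    recent_errors = deque(maxlen=window)  # sliding window of normal errors
    warmup = min(period, window)          # errors needed before flagging
    flags = [False] * len(series)

    for t in range(period, len(series)):
        s = season[t % period]
        forecast = level + trend + s
        error = abs(series[t] - forecast)

        if len(recent_errors) >= warmup:
            # min_scale is an assumed floor that avoids a zero scale.
            scale = max(sum(recent_errors) / len(recent_errors), min_scale)
            flags[t] = error > threshold * scale

        if flags[t]:
            # Skip the update so the anomaly does not distort the model.
            level += trend
            continue

        recent_errors.append(error)
        new_level = alpha * (series[t] - s) + (1 - alpha) * (level + trend)
        trend = beta * (new_level - level) + (1 - beta) * trend
        season[t % period] = gamma * (series[t] - new_level) + (1 - gamma) * s
        level = new_level

    return flags
```

In this sketch, `alpha`, `beta`, `gamma`, `window`, and `threshold` are the kind of parameters that a periodic optimization step could tune against annotated training data; here they are simply fixed by hand.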
This study presents a practical view of dynamic programming, specifically as applied to finding optimal solutions to the polygon triangulation problem. The problem of optimally triangulating a polygon is treated as having recursive substructure. The basic idea of the constructed method is to generate optimal triangulations rapidly and to store them in as little memory as possible. Our method is based on memoization: it stores the results of calculated values and returns the cached result when the same values occur again. The significance of the method lies in generating the optimal triangulation for large values of n. All weights calculated during the triangulation process are stored and processed in the same table. The processing of the results and the implementation of the method were carried out in the Java environment, and the experimental results were compared with the square matrix and Hurtado-Noy methods.
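The memoization idea can be sketched as follows. This is a generic textbook formulation for a convex polygon, not the method evaluated in the study: the abstract does not specify the weight function or the table layout, so the sketch assumes triangle perimeter as the weight and uses Python's `functools.lru_cache` as the cache. The function name `min_triangulation_cost` is hypothetical.

```python
from functools import lru_cache
from math import dist


def min_triangulation_cost(points):
    """Minimum total weight of a triangulation of a convex polygon whose
    vertices are given in order. The weight of a triangle is assumed to
    be its perimeter (an illustrative choice)."""
    pts = tuple(points)
    n = len(pts)
    if n < 3:
        return 0.0

    def weight(i, k, j):
        # Perimeter of the triangle on vertices i, k, j.
        return (dist(pts[i], pts[k]) + dist(pts[k], pts[j])
                + dist(pts[j], pts[i]))

    @lru_cache(maxsize=None)
    def best(i, j):
        # Cheapest triangulation of the sub-polygon i, i+1, ..., j.
        # Each (i, j) pair is computed once and then served from the cache.
        if j - i < 2:
            return 0.0
        return min(best(i, k) + best(k, j) + weight(i, k, j)
                   for k in range(i + 1, j))

    return best(0, n - 1)
```

Each sub-polygon is identified by its pair of end vertices, so repeated subproblems are looked up in the cache instead of being recomputed. For large n, an iterative table would avoid Python's recursion limit.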