With the growth of streaming data from domains such as transportation, finance, and weather, interest in time series data mining has surged. Given the massive volume of time series data now available, time series representation has become essential for reducing dimensionality and working within memory constraints. Moreover, time series data mining involves tasks such as similarity search and learning from historical data. These tasks are computationally expensive, and their cost can be lowered by reducing the dimensionality of the data. This paper proposes a novel time series representation called Adaptive Simulated Annealing Representation (ASAR). ASAR treats time series representation as an optimization problem whose objective is to preserve the shape of the time series while reducing its dimensionality. ASAR searches for the instances in the raw time series that can represent the local trends and neglects the rest. The Simulated Annealing optimization algorithm is adapted in this paper to fulfill this objective. We compare ASAR to three well-known representation approaches from the literature. The experimental results show that ASAR achieved the highest reduction in dimensionality. Moreover, the ASAR representation accelerated the data mining process the most. ASAR was also tested for its ability to preserve the shape and information of the time series by performing One Nearest Neighbor (1-NN) classification and K-means clustering. The results confirm this ability: ASAR outperformed the competing approaches in the K-means task and achieved accuracy close to theirs in the 1-NN classification task.
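As a rough illustration of the idea, the sketch below casts point selection as simulated annealing in Python. It is not the ASAR algorithm itself: the cost function (mean squared error of a linear-interpolation reconstruction plus a `penalty` on the fraction of points kept), the single-index flip move, the geometric cooling schedule, and all names and default values are assumptions made for this example.

```python
import math
import random


def interpolate(series, kept):
    """Rebuild a full-length series by linear interpolation between kept indices."""
    rebuilt = []
    for left, right in zip(kept, kept[1:]):
        span = right - left
        for i in range(left, right):
            frac = (i - left) / span
            rebuilt.append(series[left] + frac * (series[right] - series[left]))
    rebuilt.append(series[kept[-1]])
    return rebuilt


def cost(series, mask, penalty):
    """ASSUMED objective: reconstruction MSE plus a penalty on the fraction kept."""
    kept = [i for i, keep in enumerate(mask) if keep]
    rebuilt = interpolate(series, kept)
    mse = sum((a - b) ** 2 for a, b in zip(series, rebuilt)) / len(series)
    return mse + penalty * len(kept) / len(series)


def anneal_representation(series, penalty=0.05, t_start=1.0, t_end=1e-3,
                          cooling=0.995, seed=0):
    """Pick a subset of indices by simulated annealing; returns sorted kept indices."""
    n = len(series)
    if n < 3:
        return list(range(n))          # nothing to drop
    rng = random.Random(seed)
    mask = [True] * n                  # start from the raw series
    current = cost(series, mask, penalty)
    best_mask, best = mask[:], current
    temp = t_start
    while temp > t_end:
        i = rng.randrange(1, n - 1)    # endpoints are always kept
        mask[i] = not mask[i]          # propose: flip one interior index
        candidate = cost(series, mask, penalty)
        delta = candidate - current
        # Metropolis rule: always accept improvements, sometimes accept worse moves
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            current = candidate
            if current < best:
                best_mask, best = mask[:], current
        else:
            mask[i] = not mask[i]      # reject: undo the flip
        temp *= cooling                # geometric cooling (assumed schedule)
    return [i for i, keep in enumerate(best_mask) if keep]
```

Calling `anneal_representation(series)` on a list of numbers returns the indices retained as the reduced representation. Raising `penalty` favors fewer points, while lowering it favors a closer fit to the original shape.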
Traffic congestion detection (TCD) techniques are becoming a critical component of traffic management systems. They can be considered a preliminary step in addressing traffic jam problems, providing useful input that lets traffic management systems predict and avoid the unwanted effects of traffic congestion. In this research article, two novel real-time congestion detection algorithms are proposed: the TCD algorithm and the ensemble-based traffic congestion detection (EB-TCD) algorithm. TCD detects congestion based on a single traffic feature (i.e., speed, occupancy, or flow). TCD first cleans and preprocesses the traffic data. Next, it computes the absolute value of the first derivative of the traffic feature at each sample to determine that sample's anomaly score. Then, the anomaly likelihood of each sample is computed to classify it as an anomalous or normal sample. EB-TCD, on the other hand, exploits the information contained in multiple traffic features in parallel, using an ensemble technique to combine the anomaly scores obtained from the different traffic features into one unified score. EB-TCD computes an anomaly score for each traffic feature separately. Then, for each traffic feature, it votes on whether each sample is an anomaly from that feature's point of view. Next, the weight of each traffic feature is determined by studying the dissimilarity between the features' anomaly scores. These weights indicate how well each traffic feature explains the traffic behavior. Finally, the results are combined to form a unified anomaly score for each sample. The proposed statistical techniques are able to detect changes in traffic patterns while mitigating the effect of noise in traffic data. Moreover, the parameters of the proposed algorithms were tuned for fast detection, which is a crucial feature for traffic management systems since fast detection accelerates decision-making.
A computational complexity analysis shows that the proposed algorithms are simple. Furthermore, the evaluation results show that the two algorithms outperform other widely used methods from the literature in terms of false alarm rate and detection time while keeping the detection rate at the same high level.
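The single-feature TCD steps and the ensemble step described in this abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not define the anomaly likelihood or the weighting rule, so here the likelihood is taken to be a normal CDF over a rolling `window` of past scores, and each feature's weight is taken to be inversely proportional to its mean absolute disagreement with the other features. Cleaning and preprocessing are omitted, the voting step is reduced to a threshold, and the function names, `window`, and `threshold` are hypothetical.

```python
import math


def anomaly_scores(feature):
    """Absolute first difference of one traffic feature; the first sample scores 0."""
    return [0.0] + [abs(b - a) for a, b in zip(feature, feature[1:])]


def anomaly_likelihoods(scores, window=20):
    """ASSUMED form: normal CDF of each score relative to the previous `window` scores."""
    out = []
    for t, score in enumerate(scores):
        history = scores[max(0, t - window):t]
        if len(history) < 2:
            out.append(0.0)              # too little history to judge
            continue
        mean = sum(history) / len(history)
        var = sum((s - mean) ** 2 for s in history) / len(history)
        std = max(math.sqrt(var), 1e-9)  # floor avoids division by zero
        z = (score - mean) / std
        # Near 1 when the score sits far above its recent history
        out.append(1.0 - 0.5 * math.erfc(z / math.sqrt(2)))
    return out


def tcd_flags(feature, window=20, threshold=0.99):
    """Flag samples whose likelihood reaches the (hypothetical) threshold."""
    likelihoods = anomaly_likelihoods(anomaly_scores(feature), window)
    return [p >= threshold for p in likelihoods]


def combine_scores(scores_by_feature):
    """ASSUMED ensemble: weight each feature by its agreement with the others."""
    names = list(scores_by_feature)
    n = len(scores_by_feature[names[0]])
    raw = {}
    for a in names:
        others = [b for b in names if b != a]
        dissimilarity = sum(
            abs(scores_by_feature[a][t] - scores_by_feature[b][t])
            for b in others for t in range(n)
        ) / max(1, len(others) * n)
        raw[a] = 1.0 / (dissimilarity + 1e-9)   # less disagreement, more weight
    total = sum(raw.values())
    weights = {a: raw[a] / total for a in names}
    unified = [sum(weights[a] * scores_by_feature[a][t] for a in names)
               for t in range(n)]
    return unified, weights
```

For a single feature, `tcd_flags(speed)` returns one boolean per sample. For several features, pass a dictionary of per-feature likelihood lists of equal length to `combine_scores` and threshold the unified score in the same way.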