Optimizing dynamic time warping’s window width for time series data mining applications

Dau, Son Hoang; Silva, Diego Furtado; Petitjean, François; Forestier, Germain; Bagnall, Anthony; Mueen, Abdullah; Keogh, Eamonn

doi:10.1007/s10618-018-0565-y

Cited by 73 publications

(26 citation statements)

References 57 publications

“…Here, we used biological insights from a previous study to set ε at a spatial scale reflecting that at which GPS-tracked gannets typically forage, but one could set ε to reflect known location error from ones tracking device for example. For DTW, LCSS and EDR one must also set a δ value and while many studies adopt an unconstrained approach as we do here adjusting this parameter can sometimes improve clustering performance (Dau et al 2018). We also note that the similarity measures covered here represent only a subset of available trajectory similarity measures (Ranacher and Tzavella 2014) and that rather than having to choose between similarity measures it may be possible to use them as an ensemble for machine learning methods of time-series classification purposes (Lines and Bagnall 2015).…”

Section: Discussion (mentioning)

confidence: 99%

Using time-series similarity measures to compare animal movement trajectories in ecology

Cleasby, Wakefield, Morrissey et al. 2019

Behav Ecol Sociobiol

Identifying and understanding patterns in movement data are amongst the principal aims of movement ecology. By quantifying the similarity of movement trajectories, inferences can be made about diverse processes, ranging from individual specialisation to the ontogeny of foraging strategies. Movement analysis is not unique to ecology however, and methods for estimating the similarity of movement trajectories have been developed in other fields but are currently under-utilised by ecologists. Here, we introduce five commonly used measures of trajectory similarity: dynamic time warping (DTW), longest common subsequence (LCSS), edit distance for real sequences (EDR), Fréchet distance and nearest neighbour distance (NND), of which only NND is routinely used by ecologists. We investigate the performance of each of these measures by simulating movement trajectories using an Ornstein-Uhlenbeck (OU) model in which we varied the following parameters: (1) the point of attraction, (2) the strength of attraction to this point and (3) the noise or volatility added to the movement process in order to determine which measures were most responsive to such changes. In addition, we demonstrate how these measures can be applied using movement trajectories of breeding northern gannets (Morus bassanus) by performing trajectory clustering on a large ecological dataset. Simulations showed that DTW and Fréchet distance were most responsive to changes in movement parameters and were able to distinguish between all the different parameter combinations we trialled. In contrast, NND was the least sensitive measure trialled. When applied to our gannet dataset, the five similarity measures were highly correlated despite differences in their underlying calculation. Clustering of trajectories within and across individuals allowed us to easily visualise and compare patterns of space use over time across a large dataset. 
Trajectory clusters reflected the bearing on which birds departed the colony and highlighted the use of well-known bathymetric features. As both the volume of movement data and the need to quantify similarity amongst animal trajectories grow, the measures described here and the bridge they provide to other fields of research will become increasingly useful in ecology.

Significance statement: As the use of tracking technology increases, there is a need to develop analytical techniques to process such large volumes of data. One area in which this would be useful is the comparison of individual movement trajectories. In response, a variety of measures …

Communicated by L. Z. Garamszegi

“…Also in this case we perform zero padding to fit all the time-series lengths to the size of the longest one. - The Dynamic Time Warping measures [7] (DTW) coupled with K-means algorithm. Such distance measure is especially tailored for time-series data with variable length-size.…”

Section: Competitors and Methods Ablations (mentioning)

confidence: 99%

Deep Multivariate Time Series Embedding Clustering via Attentive-Gated Autoencoder

Ienco, Interdonato 2020

Lecture Notes in Computer Science

Nowadays, great quantities of data are produced by a large and diverse family of sensors (e.g., remote sensors, biochemical sensors, wearable devices), which typically measure multiple variables over time, resulting in data streams that can be profitably organized as multivariate time-series. In practical scenarios, the speed at which such information is collected often makes the data labeling task uneasy and too expensive, so that limit the use of supervised approaches. For this reason, unsupervised and exploratory methods represent a fundamental tool to deal with the analysis of multivariate time series. In this paper we propose a deep-learning based framework for clustering multivariate time series data with varying lengths. Our framework, namely DeTSEC (Deep Time Series Embedding Clustering), includes two stages: firstly a recurrent autoencoder exploits attention and gating mechanisms to produce a preliminary embedding representation; then, a clustering refinement stage is introduced to stretch the embedding manifold towards the corresponding clusters. Experimental assessment on six real-world benchmarks coming from different domains has highlighted the effectiveness of our proposal.

“…The bandwidth r defines the constraint range of the matching path in the distance matrix and suppresses the influence of undesired convergence in the matching path [52]. Because there was a correlation between the defined warping offset distance and the SDTW algorithm, as well as the SDTW-based distance and the constraint bandwidth r, different r not only affected the optimal matching path of the SDTW but also led to the change of d_similarity.…”

Section: Similarity Measure Evaluation With One Nearest Neighbor (1-n… (mentioning)

confidence: 99%

“…For case II, it can be considered that the constraint bandwidth did not affect the distance measured by the SDTW algorithm, and the first r corresponding to the minimum can be seen as the candidate. For the situation in case III that multiple candidate values within the convergence region corresponded to the same minimum value E_SUM, the median of these candidate values was selected as r. Here, the general rules for determining and adjusting the preset range for r can refer to [52].…”

Section: Similarity Measure Evaluation With One Nearest Neighbor (1-n… (mentioning)

confidence: 99%

Combining SDAE Network with Improved DTW Algorithm for Similarity Measure of Ultra-Weak FBG Vibration Responses in Underground Structures

Zuo et al. 2020

Sensors

Quantifying structural status and locating structural anomalies are critical to tracking and safeguarding the safety of long-distance underground structures. Given the dynamic and distributed monitoring capabilities of an ultra-weak fiber Bragg grating (FBG) array, this paper proposes a method combining the stacked denoising autoencoder (SDAE) network and the improved dynamic time warping (DTW) algorithm to quantify the similarity of vibration responses. To obtain the dimensionality reduction features that were conducive to distance measurement, the silhouette coefficient was adopted to evaluate the training efficacy of the SDAE network under different hyperparameter settings. To measure the distance based on the improved DTW algorithm, the one nearest neighbor (1-NN) classifier was utilized to search the best constraint bandwidth. Moreover, the study proposed that the performance of different distance metrics used to quantify similarity can be evaluated through the 1-NN classifier. Based on two one-dimensional time-series datasets from the University of California, Riverside (UCR) archives, the detailed implementation process for similarity measure was illustrated. In terms of feature extraction and distance measure of UCR datasets, the proposed integrated approach of similarity measure showed improved performance over other existing algorithms. Finally, the field-vibration responses of the track bed in the subway detected by the ultra-weak FBG array were collected to determine the similarity characteristics of structural vibration among different monitoring zones. The quantitative results indicated that the proposed method can effectively quantify and distinguish the vibration similarity related to the physical location of structures.
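The idea of a constraint bandwidth r chosen with a 1-NN classifier, as described in the statements and abstract above, can be illustrated with a minimal sketch. It uses plain DTW with a Sakoe-Chiba band, not the SDTW variant of the citing paper. The function names (`dtw_distance`, `loocv_error`, `best_band`), the squared-difference local cost, and the rule of breaking ties towards the smallest r are all assumptions made for illustration; the quoted statement uses the median of tied candidates instead.

```python
import math


def dtw_distance(a, b, r):
    """DTW distance between 1-D sequences a and b, with the warping path
    restricted to a Sakoe-Chiba band of half-width r (r = 0 is Euclidean
    for equal-length inputs)."""
    n, m = len(a), len(b)
    # Widen the band if needed so a path can still reach the final cell.
    r = max(r, abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        # Only cells with |i - j| <= r are filled in.
        for j in range(max(1, i - r), min(m, i + r) + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


def loocv_error(series, labels, r):
    """Leave-one-out 1-NN error rate on a labelled set for band width r."""
    errors = 0
    for i, query in enumerate(series):
        best, best_label = math.inf, None
        for j, candidate in enumerate(series):
            if i == j:
                continue
            d = dtw_distance(query, candidate, r)
            if d < best:
                best, best_label = d, labels[j]
        errors += best_label != labels[i]
    return errors / len(series)


def best_band(series, labels, candidates):
    """Pick the candidate r with the lowest leave-one-out error.
    Ties go to the smallest r (an illustrative choice)."""
    return min(candidates, key=lambda r: (loocv_error(series, labels, r), r))


# Toy data: class "A" has a peak that shifts by one step, class "B" a lower peak.
series = [[0, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0.5, 0], [0, 0, 0.5, 0]]
labels = ["A", "A", "B", "B"]
print(best_band(series, labels, [0, 1]))
```

On this toy set, r = 0 misclassifies both class "A" series because the shifted peak cannot be aligned, while r = 1 allows the one-step shift to be absorbed.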
