Abstract. We consider the problem of elastic matching of time series. We propose an algorithm that automatically determines a subsequence b' of a target time series b that best matches a query series a. The proposed algorithm maps the problem of finding the best matching subsequence to the problem of finding a cheapest path in a DAG (directed acyclic graph). Our experimental results demonstrate that the proposed algorithm outperforms the commonly used Dynamic Time Warping in retrieval accuracy.
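The mapping to a cheapest-path problem can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a squared-difference cost, assumes that every element of the query a is matched to a distinct element of the target b in increasing order, and lets elements of b be skipped at no cost. The function name `mvm_match` is hypothetical.

```python
def mvm_match(a, b):
    """Match every element of query a to a distinct element of target b,
    preserving order, so that the sum of squared differences is minimal.

    Assumption (not from the source): node (i, j) of the DAG means
    "a[i] is matched to b[j]", and an edge goes from (i-1, k) to (i, j)
    for every k < j. Returns (total_cost, matched_indices_in_b).
    """
    m, n = len(a), len(b)
    if m == 0 or m > n:
        raise ValueError("query must be non-empty and no longer than target")
    inf = float("inf")
    cost = [[inf] * n for _ in range(m)]   # cheapest path ending at (i, j)
    prev = [[-1] * n for _ in range(m)]    # predecessor column on that path

    # a[0] may start anywhere that leaves room for the rest of the query.
    for j in range(n - m + 1):
        cost[0][j] = (a[0] - b[j]) ** 2

    for i in range(1, m):
        best, best_k = inf, -1
        for j in range(i, n - m + i + 1):
            # Keep a running minimum over row i-1 for all columns k < j.
            if cost[i - 1][j - 1] < best:
                best, best_k = cost[i - 1][j - 1], j - 1
            cost[i][j] = best + (a[i] - b[j]) ** 2
            prev[i][j] = best_k

    # The path may end at any column that can hold the last query element.
    j = min(range(m - 1, n), key=lambda k: cost[m - 1][k])
    total = cost[m - 1][j]
    path = []
    for i in range(m - 1, -1, -1):
        path.append(j)
        j = prev[i][j]
    path.reverse()
    return total, path


if __name__ == "__main__":
    print(mvm_match([1, 2, 3], [1, 9, 2, 3, 9]))  # (0, [0, 2, 3])
```

In this toy input the two outliers in the target (the 9s) are skipped, which is the behavior that distinguishes this kind of matching from DTW, where every target element must be mapped to some query element.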
Motivation. For many datasets we can easily and accurately extract the beginning and ending of patterns of interest. However, in some domains it is non-trivial to define the exact beginning and ending of a pattern within a longer sequence. This is a problem because, if the endpoints are incorrectly specified, they can swamp the distance calculation between otherwise similar objects. For concreteness we will consider an example of just such a domain and show that Minimal Variance Matching (MVM), proposed in this paper, can be expected to outperform Dynamic Time Warping (DTW) and Euclidean distance.

There is increasing interest in indexing sports data, both from sports fans who may wish to find particular types of shots or moves, and from coaches who are interested in analyzing their athletes' performance over time. Let us consider the high jump. We can automatically collect the athlete's center-of-mass information from video and convert it to a time series. In Fig. 1, we see 3 time series automatically extracted from 2 athletes. Both sequences A and B are from one individual, a tall male, and C is from a (relatively) short female with a radically different style. The difference in their technique is obvious even to a non-expert; however, A and C were automatically segmented in such a way that the bounce from the mat is visible, whereas in B this bounce was truncated. In Fig. 1 (middle) we can see that DTW is forced to map this bounce section to the end of sequence B, even though that sequence clearly does not have a truly corresponding section. In contrast, MVM is free to ignore the sections that do not have a natural correspondence. It is this difference that enables MVM to produce the more natural clustering shown in
We propose a multistep approach for representing and classifying tree-like structures in medical images. Tree-like structures are frequently encountered in biomedical contexts; examples are the bronchial system, the vascular topology, and the breast ductal network. We use tree encoding techniques, such as the depth-first string encoding and the Prüfer encoding, to obtain a symbolic string representation of the tree's branching topology; the problem of classifying trees is then reduced to string classification. We use the tf-idf text mining technique to assign a weight of significance to each string term (i.e., tree node label). Similarity searches and k-nearest neighbor classification of the trees are performed using the tf-idf weight vectors and the cosine similarity metric. We applied our approach to characterize the ductal tree-like parenchymal structure in X-ray galactograms, in order to distinguish among different radiological findings. Experimental results demonstrate the effectiveness of the proposed approach, with classification accuracy reaching up to 86%, and also indicate that our method can potentially aid in providing insight into the relationship between branching patterns and function or pathology.
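The pipeline described above (encode each tree as a string of node labels, weight the labels with tf-idf, compare trees by cosine similarity, classify by nearest neighbors) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the tree representation `(label, [children])`, the plain preorder encoding, the particular tf-idf variant (relative term frequency times log(N/df)), and all function names are hypothetical choices made for the example.

```python
import math
from collections import Counter


def depth_first_encoding(tree):
    """Preorder list of node labels for a tree given as (label, [children])."""
    label, children = tree
    out = [label]
    for child in children:
        out.extend(depth_first_encoding(child))
    return out


def tfidf_vectors(docs):
    """One sparse tf-idf vector (dict term -> weight) per label sequence."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def cosine(u, v):
    """Cosine similarity of two sparse vectors; 0.0 if either has zero norm."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def knn_classify(query, vectors, labels, k=1):
    """Majority label among the k vectors most similar to the query."""
    ranked = sorted(
        range(len(vectors)), key=lambda i: cosine(query, vectors[i]), reverse=True
    )
    votes = Counter(labels[i] for i in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Toy trees with made-up node labels; real labels would come from the images.
    trees = [
        ("1", [("2", [("2", [])]), ("3", [])]),
        ("1", [("2", [("2", [])]), ("4", [])]),
        ("1", [("5", [("5", [])]), ("6", [])]),
    ]
    docs = [depth_first_encoding(t) for t in trees]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]) > cosine(vecs[0], vecs[2]))  # True
    print(knn_classify(vecs[0], vecs[1:], ["class_a", "class_b"], k=1))
```

With this idf variant, a label that occurs in every tree (here the root label "1") receives weight zero, so only labels that discriminate between trees contribute to the similarity.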