Abstract: Dynamic time warping (DTW) is a popular distance measure for recognition-free document image retrieval. However, it has quadratic complexity and is therefore computationally expensive for large-scale word image retrieval. In this paper, we use a fast approximation to the DTW distance, which makes word retrieval efficient. Computing the DTW distance for a pair of sequences requires finding the optimal alignment among all possible alignments, which is a computationally expensive operation. In this work, we learn a small set of global principal alignments from the training data and avoid computing alignments for query images. As a result, the proposed approximation is significantly faster than the DTW distance, giving a 40-fold speedup. We approximate the DTW distance as a sum of multiple weighted Euclidean distances, which are known to be amenable to indexing and efficient retrieval. We demonstrate the speedup of the proposed approximation on the George Washington collection and on multi-language datasets containing words from English and two Indian languages.
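The idea of replacing the per-pair alignment search with a few fixed global alignments can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the helper names (`shifted_diagonal`, `alignment_cost`, `approx_dtw`) are hypothetical, the alignments are hand-picked shifted diagonals rather than learned principal alignments, and the per-alignment costs are combined by taking the minimum because the abstract does not specify the paper's weighting. The sketch relies only on the fact that, once an alignment is fixed, the cost is a squared Euclidean distance between re-indexed sequences, and that the cost of any valid alignment is an upper bound on the DTW distance.

```python
def shifted_diagonal(n, shift):
    """Hypothetical global alignment for two length-n sequences.

    Returns a monotone path from (0, 0) to (n-1, n-1) that follows the
    diagonal offset by `shift`. Requires abs(shift) < n.
    """
    path = []
    for k in range(n + abs(shift)):
        i = min(max(k - max(shift, 0), 0), n - 1)
        j = min(max(k - max(-shift, 0), 0), n - 1)
        path.append((i, j))
    return path


def alignment_cost(x, y, path):
    """Squared Euclidean distance between x and y re-indexed along `path`."""
    xs = [x[i] for i, _ in path]
    ys = [y[j] for _, j in path]
    return sum((a - b) ** 2 for a, b in zip(xs, ys))


def approx_dtw(x, y, paths):
    """Assumed combination rule: best cost over a small fixed set of alignments.

    Each cost is linear in the path length, so no quadratic search is needed.
    """
    return min(alignment_cost(x, y, p) for p in paths)


x = [0.0, 1.0, 2.0, 1.0, 0.0]
y = [0.0, 0.0, 1.0, 2.0, 1.0]  # x delayed by one step
paths = [shifted_diagonal(len(x), s) for s in (-1, 0, 1)]

print(alignment_cost(x, y, paths[1]))  # plain diagonal, no warping
print(approx_dtw(x, y, paths))         # best of the three fixed alignments
```

Because each fixed alignment turns the comparison into an ordinary Euclidean distance on re-indexed vectors, the re-indexed database sequences could in principle be precomputed and indexed, which is the property the abstract points to.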
Abstract. In this paper, we propose the canonical correlation kernel (CCK), which integrates the advantages of a lower-dimensional representation of videos with a discriminative classifier such as an SVM. In the process of defining the kernel, we learn a low-dimensional (linear as well as nonlinear) representation of the video data, which is originally represented as a tensor. We densely compute features at the single-frame (or two-frame) level and avoid any explicit tracking. The tensor representation provides a holistic view of the video data and is the starting point for computing the CCK. Our kernel is defined in terms of the principal angles between the lower-dimensional representations of the tensor, and it captures the similarity of two videos efficiently. We test our approach on four public datasets and demonstrate consistently superior results over state-of-the-art methods, including those that use canonical correlations.
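To make "a kernel defined in terms of principal angles" concrete, the following sketch computes the sum of squared cosines of the principal angles between two subspaces, which is the standard projection kernel on subspaces. This is a swapped-in, well-known kernel used only for illustration; it is not claimed to be the CCK itself, whose exact form the abstract does not give. The function names are hypothetical, and each subspace is assumed to be given by a list of spanning vectors.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt orthonormalization; near-dependent vectors are dropped."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def projection_kernel(A, B):
    """Sum of squared cosines of the principal angles between span(A) and span(B).

    With orthonormal bases Qa and Qb, this equals the squared Frobenius
    norm of Qa^T Qb, so no explicit SVD is required.
    """
    Qa, Qb = orthonormal_basis(A), orthonormal_basis(B)
    return sum(dot(qa, qb) ** 2 for qa in Qa for qb in Qb)


plane_xy = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
plane_xz = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

print(projection_kernel(plane_xy, plane_xy))  # identical subspaces
print(projection_kernel(plane_xy, plane_xz))  # one shared direction
```

The kernel depends only on the subspaces, not on the particular spanning vectors, so two different bases of the same plane give the same value up to floating-point error.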
The dynamic time warping (DTW) distance is a popular similarity measure for comparing time series data. It has been successfully applied in many fields, such as speech recognition, data mining and information retrieval, to automatically cope with time deformations and variations in the length of time-dependent data. There have been past attempts to define kernels on the DTW distance. These kernels try to approximate the DTW distance, but they have quadratic complexity and are computationally expensive for long time series. In this paper, we introduce the FastDTW kernel, which is a linear approximation of the DTW kernel and can be used with a linear SVM. To compute the DTW distance for a given pair of sequences, we need to find the optimal warping path among all possible alignments, which is a computationally expensive operation. Instead of finding the optimal warping path for every pair of sequences, we learn a small set of global alignments from a given dataset and use these alignments to compare the given sequences. In this work, we learn the principal global alignments for the given data by exploiting the hidden structure of the alignments in the training data. Since we use only a small number of global alignments to compare the given test sequences, our proposed approximation kernel is computationally efficient compared to previous kernels on the DTW distance. Further, we also propose an approximate explicit feature map for our proposed kernel. Our results show the efficiency of the proposed approximation kernel.
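For reference, the quadratic-cost computation that the abstract describes as expensive is the textbook DTW dynamic program, sketched below for scalar sequences with a squared-difference local cost (both choices are assumptions made for illustration). The function fills an (n+1) by (m+1) table and then backtracks to recover the optimal warping path; it is the baseline being approximated, not the proposed kernel.

```python
def dtw(x, y):
    """Textbook DTW: returns (distance, optimal warping path).

    Time and memory are O(len(x) * len(y)), which is the quadratic cost
    that a fixed set of global alignments is meant to avoid.
    """
    n, m = len(x), len(y)
    inf = float("inf")
    D = [[inf] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = (x[i - 1] - y[j - 1]) ** 2
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])

    # Backtrack from the last cell, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (D[i - 1][j - 1], i - 1, j - 1),
            (D[i - 1][j], i - 1, j),
            (D[i][j - 1], i, j - 1),
        )
    path.reverse()
    return D[n][m], path


distance, path = dtw([1.0, 2.0, 3.0], [1.0, 2.0, 2.0, 3.0])
print(distance)
print(path)
```

In this example the second sequence repeats one value, so the optimal path matches a single element of the first sequence to two elements of the second and the distance is zero.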
Abstract: In this paper, we improve the performance of the recently proposed Direct Query Classifier (DQC). The DQC is a classifier-based retrieval method, and such methods have in general been shown to be superior to OCR-based solutions for retrieval in many practical document image datasets. In the DQC, classifiers are trained for a set of frequent queries and seamlessly extended to rare and arbitrary queries. This extends the classifier-based retrieval paradigm to an unlimited number of classes (words) present in a language. The DQC requires indexing cut-portions (n-grams) of the word image, and the DTW distance has been used for this indexing. However, DTW is computationally slow and therefore limits the performance of the DQC. We introduce a query-specific DTW distance, which enables effective computation of global principal alignments for novel queries. Since the proposed query-specific DTW distance is a linear approximation of the DTW distance, it enhances the performance of the DQC. Unlike previous approaches, the proposed query-specific DTW distance uses both the class mean vectors and the query information to compute the global principal alignments for the query. Since the proposed method computes the global principal alignments using n-grams, it works well for both frequent and rare queries. We also use query expansion (QE) to further improve the performance of our query-specific DTW. This also allows us to seamlessly adapt our solution to new fonts, styles and collections. We demonstrate the utility of the proposed technique on three different datasets. The proposed query-specific DTW performs well compared to previous DTW approximations.
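Query expansion is mentioned without detail, so the sketch below shows only the generic average-query-expansion scheme: retrieve once, average the query with its top-ranked results, and retrieve again with the averaged vector. This is an assumption about what QE could look like, not the paper's procedure. The function names are hypothetical, fixed-length feature vectors stand in for word images, and plain squared Euclidean distance stands in for the query-specific DTW distance.

```python
def sq_dist(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))


def rank(query, database):
    """Indices of database items sorted from nearest to farthest."""
    return sorted(range(len(database)), key=lambda k: sq_dist(query, database[k]))


def expand_query(query, database, top_k):
    """Average the query with its top_k nearest database items."""
    top = rank(query, database)[:top_k]
    pool = [query] + [database[k] for k in top]
    return [sum(col) / len(pool) for col in zip(*pool)]


database = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [10.0, 10.0]]
query = [0.4, 0.0]

first_pass = rank(query, database)
expanded = expand_query(query, database, top_k=2)
second_pass = rank(expanded, database)

print(first_pass)
print(expanded)
print(second_pass)
```

The expanded query is pulled toward the items it already matched well, which is how this kind of scheme can adapt a query to the fonts or styles present in a new collection.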