Time sequences occur in many applications, ranging from science and technology to business and entertainment. In many of these applications, it is desirable to analyze time series data and to search large, unstructured databases based on sample sequences. Such similarity-based retrieval has attracted a lot of attention in recent years. Although several different approaches have appeared, most are based on the common premise of dimensionality reduction and spatial access methods. This paper gives an overview of recent research and shows how the methods fit into a general context of signature extraction.
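As a rough illustration of dimensionality reduction by signature extraction, the sketch below uses piecewise aggregate approximation (segment means) as the signature. This particular technique is chosen here only as an example and is not named in the abstract; the function names `paa` and `signature_distance` are hypothetical. The scaled distance between signatures never exceeds the Euclidean distance between the original sequences, so it can be used to discard candidates without producing false dismissals.

```python
import math


def paa(series, segments):
    """Reduce a sequence to `segments` values, each the mean of one equal-width segment."""
    n = len(series)
    if segments <= 0 or n % segments != 0:
        # Assumption for simplicity: the length divides evenly into segments.
        raise ValueError("series length must be a positive multiple of segments")
    width = n // segments
    return [sum(series[i * width:(i + 1) * width]) / width for i in range(segments)]


def signature_distance(sig_a, sig_b, original_length):
    """Lower bound on the Euclidean distance between the original sequences."""
    scale = math.sqrt(original_length / len(sig_a))
    return scale * math.dist(sig_a, sig_b)
```

In a search setting, the signatures would typically be stored in a spatial access method, and the original sequences would only be compared when the signature distance falls within the search radius.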
The signature quadratic form distance has been introduced as an adaptive similarity measure coping with flexible content representations of multimedia data. While this distance has shown high retrieval quality, its high computational complexity underscores the need for efficient search methods. Recent research has shown that a huge improvement in search efficiency is achieved when using metric indexing. In this paper, we analyze the applicability of Ptolemaic indexing to the signature quadratic form distance. We show that it is a Ptolemaic metric and present an application of Ptolemaic pivot tables to image databases, resolving queries nearly four times as fast as the state-of-the-art metric solution, and up to 300 times as fast as sequential scan.
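The filtering idea behind Ptolemaic pivot tables can be sketched as follows. For a query q, an object o, and two pivots p and s, Ptolemy's inequality gives d(q,o) >= |d(q,p)·d(o,s) − d(q,s)·d(o,p)| / d(p,s). The function name `ptolemaic_lower_bound` and its argument layout are assumptions made for this example. The bound is valid for any Ptolemaic metric, such as the Euclidean distance; the signature quadratic form distance itself is not implemented here.

```python
from itertools import combinations


def ptolemaic_lower_bound(query_to_pivots, object_to_pivots, pivot_to_pivot):
    """Best Ptolemaic lower bound on d(query, object) over all pivot pairs.

    query_to_pivots[i]   -- d(query, pivot i), computed once per query
    object_to_pivots[i]  -- d(object, pivot i), precomputed in the pivot table
    pivot_to_pivot[i][j] -- d(pivot i, pivot j), precomputed
    """
    best = 0.0
    for i, j in combinations(range(len(query_to_pivots)), 2):
        d_ps = pivot_to_pivot[i][j]
        if d_ps == 0:
            continue  # coinciding pivots yield no usable bound
        bound = abs(query_to_pivots[i] * object_to_pivots[j]
                    - query_to_pivots[j] * object_to_pivots[i]) / d_ps
        best = max(best, bound)
    return best
```

An object whose lower bound exceeds the query radius can be discarded without evaluating the (expensive) distance to the query.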
In XML search systems, twig queries specify predicates on node values and on the structural relationships between nodes, and a key operation is to join individual query node matches into full twig matches. Linear-time twig join algorithms exist, but many non-optimal algorithms with better average-case performance have been introduced recently. These use somewhat simpler data structures that are faster in practice, but have exponential worst-case time complexity. In this paper we explore and extend the solution space spanned by previous approaches. We introduce new data structures and improved strategies for filtering out useless data nodes, yielding combinations that are both worst-case optimal and faster in practice. An experimental study shows that our best algorithm outperforms previous approaches by an average factor of three on common benchmarks. On queries with at least one unselective leaf node, our algorithm can be an order of magnitude faster, and it is never more than 20% slower on any tested benchmark query.
Summary. This chapter describes several methods of similarity search, based on metric indexing, in terms of their common, underlying principles. Several approaches to creating lower bounds using the metric axioms are discussed, such as pivoting and compact partitioning with metric ball regions and generalized hyperplanes. Finally, pointers are given for further exploration of the subject, including non-metric, approximate, and parallel methods.
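A minimal sketch of pivoting, one of the lower-bounding approaches mentioned above: by the triangle inequality, |d(q,p) − d(o,p)| <= d(q,o) for any pivot p, so precomputed object-to-pivot distances can rule out objects before the actual distance is evaluated. The helper names (`build_pivot_table`, `range_query`) and the choice of Euclidean distance are assumptions made for this example.

```python
import math


def euclidean(a, b):
    return math.dist(a, b)


def build_pivot_table(data, pivots, dist):
    """Precompute the distance from every object to every pivot."""
    return [[dist(x, p) for p in pivots] for x in data]


def range_query(query, radius, data, pivots, table, dist):
    """Return (objects within `radius` of `query`, number of distances computed to data objects)."""
    query_to_pivots = [dist(query, p) for p in pivots]
    results = []
    computed = 0
    for x, row in zip(data, table):
        # Triangle inequality: |d(q,p) - d(x,p)| is a lower bound on d(q,x).
        lower = max(abs(qp - xp) for qp, xp in zip(query_to_pivots, row))
        if lower > radius:
            continue  # x cannot be within the radius; skip the real distance
        computed += 1
        if dist(query, x) <= radius:
            results.append(x)
    return results, computed
```

The same loop structure applies to any metric; only the `dist` function changes. Compact partitioning methods apply similar bounds to whole regions rather than to individual objects.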
Discovering association rules is a well-established problem in the field of data mining, with many existing solutions. In recent years, several methods have been proposed for mining rules from sequential and temporal data. This paper presents a novel technique based on genetic programming and specialized pattern matching hardware. The advantages of this method are its flexibility and adaptability, and its ability to produce intelligible rules of considerable complexity.