Signature-based tree structures proposed in the past do not perform well for large databases. The problem arises because they are unable to prune the search, especially at the upper tree levels, and thus suffer from decreased selectivity. In this paper, we identify a number of reasons for this problem and propose several methods for node splitting and partial-tree restructuring, which lead to improved query-response times. We have implemented all methods and present experimental results, which indicate that the proposed methods are superior in all cases to the standard one and up to 5-10 times better for medium and higher weights in inclusive (partial-match) queries. Additionally, we have developed new functions for the performance estimation of signature trees which, in contrast to a previous estimation function, are able to take into account the outcome of different split methods and to provide more accurate estimates.
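The pruning behaviour described in this abstract can be illustrated with a minimal sketch. It assumes the usual signature-tree convention that a node's signature is the bitwise OR of everything stored beneath it, so a subtree can be skipped whenever its signature does not cover every 1-bit of the query. The names `Node`, `make_leaf`, `make_internal`, `covers` and `inclusive_search` are hypothetical and are not taken from the paper, and the split and restructuring methods it proposes are not modelled here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """Hypothetical in-memory node: leaves hold signatures, internal nodes hold children."""
    signature: int = 0
    children: List["Node"] = field(default_factory=list)
    entries: List[int] = field(default_factory=list)


def make_leaf(signatures: List[int]) -> Node:
    node = Node(entries=list(signatures))
    for s in node.entries:
        node.signature |= s  # leaf signature = OR of its entries
    return node


def make_internal(children: List[Node]) -> Node:
    node = Node(children=list(children))
    for c in node.children:
        node.signature |= c.signature  # parent signature = OR of child signatures
    return node


def covers(signature: int, query: int) -> bool:
    """True if every 1-bit of the query is also set in the signature."""
    return signature & query == query


def inclusive_search(node: Node, query: int, stats: Dict[str, int]) -> List[int]:
    """Return all stored signatures that cover the query, counting nodes read."""
    stats["nodes_read"] += 1
    if not node.children:
        return [s for s in node.entries if covers(s, query)]
    hits: List[int] = []
    for child in node.children:
        # Prune: a subtree whose OR-ed signature misses a query bit cannot match.
        if covers(child.signature, query):
            hits.extend(inclusive_search(child, query, stats))
    return hits


left = make_leaf([0b0011, 0b0101])
right = make_leaf([0b1000, 0b1100])
root = make_internal([left, right])

stats = {"nodes_read": 0}
matches = inclusive_search(root, 0b0001, stats)
```

In this toy tree the query `0b0001` skips the right leaf, because that leaf's OR-ed signature `0b1100` lacks the queried bit. OR-ing many signatures together tends to set most bits in the upper nodes, which is one plausible reading of why pruning weakens near the root; the sketch does not reproduce the paper's own analysis of the causes.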
Aiming at the efficient retrieval of objects with set-valued attributes, we introduce three variations of a new method in order to satisfy subset and superset queries. Our approach is to combine the advantages of two access methods, linear hashing and tree-shaped methods, on which other similar methods have been previously reported as well. Analytical performance-estimation functions for each particular method are presented, followed by a thorough experimental comparison of all investigated structures, where analytical and experimental results deviate by 10% on average. Finally, the results of this performance evaluation are presented and discussed, clearly showing the superiority of the new methods, which reach an improvement of up to 85%.
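As a rough illustration of how subset and superset queries over set-valued attributes can be filtered with signatures, the sketch below uses superimposed coding: each element sets a few bits, and a set's signature is the OR of its elements' signatures. This is an assumption made for illustration, not the method of the paper. The constants `SIG_BITS` and `BITS_PER_ELEMENT`, the CRC-based hashing and all function names are hypothetical, and the linear-hashing and tree components of the proposed structures are not modelled.

```python
import zlib
from typing import Iterable, List, Set

SIG_BITS = 64          # assumed signature length
BITS_PER_ELEMENT = 3   # assumed number of bits set per element


def element_signature(item: str) -> int:
    """Map one element to a signature with at most BITS_PER_ELEMENT bits set."""
    sig = 0
    for k in range(BITS_PER_ELEMENT):
        h = zlib.crc32(f"{k}:{item}".encode("utf-8"))  # deterministic across runs
        sig |= 1 << (h % SIG_BITS)
    return sig


def set_signature(items: Iterable[str]) -> int:
    """Superimpose (OR) the element signatures of a set."""
    sig = 0
    for item in items:
        sig |= element_signature(item)
    return sig


def may_be_superset(target_sig: int, query_sig: int) -> bool:
    """Filter for 'target contains the query set'; false positives are possible."""
    return target_sig & query_sig == query_sig


def may_be_subset(target_sig: int, query_sig: int) -> bool:
    """Filter for 'target is contained in the query set'; false positives are possible."""
    return target_sig & query_sig == target_sig


def superset_query(db: List[Set[str]], query: Set[str]) -> List[Set[str]]:
    """Stored sets that contain every element of the query."""
    qs = set_signature(query)
    # The signature test only filters candidates; the exact set comparison
    # removes any false drops caused by bit collisions.
    return [s for s in db if may_be_superset(set_signature(s), qs) and query <= s]


def subset_query(db: List[Set[str]], query: Set[str]) -> List[Set[str]]:
    """Stored sets whose elements all appear in the query."""
    qs = set_signature(query)
    return [s for s in db if may_be_subset(set_signature(s), qs) and s <= query]


db = [{"a", "b"}, {"a", "b", "c"}, {"c"}, {"d", "e"}]
supersets = superset_query(db, {"a", "b"})
subsets = subset_query(db, {"a", "b", "c"})
```

The signature test can never reject a true answer, because every bit set by a subset is also set by any set that contains it. It can admit false candidates, which is why each candidate is verified against the actual set. A real index would store the signatures rather than recompute them per query.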
Significant research has focused on determining efficient methodologies for effective and speedy retrieval in large image databases. Towards that goal, the first contribution of this paper is an image abstraction technique, called variable-bin allocation (VBA), based on signature bitstrings and a corresponding similarity metric. The signature provides a compact representation of an image based on its color content and yields better retrieval effectiveness than classical global color histograms (GCHs), and effectiveness comparable to that of color-coherence vectors (CCVs). More importantly, however, the use of VBA signatures allows savings of 75% and 87.5% in storage overhead when compared to GCHs and CCVs, respectively. The second contribution is an improvement upon an access structure, the S-tree, exploring the concept of logical and physical pages and a specialized nearest-neighbor type of algorithm, in order to improve retrieval speed. We compared the performance of the S-tree indexing VBA signatures against the SR-tree indexing GCHs and CCVs, since SR-trees are arguably the most efficient access method for high-dimensional points. Our experimental results, using a large number of images and varying several parameters, have shown that the VBA/S-tree combination outperforms the GCH/SR-tree combination in terms of effectiveness, access speed and size (up to 45%, 25% and 70%, respectively). Due to the very high dimensionality of the CCVs, indexing them, even with an efficient access structure such as the SR-tree, did not seem to be a feasible alternative.
The S-tree is a dynamic height-balanced tree similar in structure to B+trees. S-trees store fixed-length bit-strings, which are called signatures. Signatures are used for indexing textbases, relational, object-oriented and extensible databases, as well as in data mining. In this article, methods of designing multi-disk B-trees are adapted to S-trees and new methods of parallelizing S-trees are developed. The resulting structures aim at achieving a performance gain by accessing two or more disks simultaneously. In addition, two different searching techniques that exploit parallel disk accessing are devised. Performance results of experiments based on the new structures and searching techniques are also presented and discussed.