Significant research has focused on efficient methods for fast and effective retrieval in large image databases. Toward that goal, the first contribution of this paper is an image abstraction technique, called variable-bin allocation (VBA), based on signature bitstrings and a corresponding similarity metric. The signature provides a compact representation of an image based on its color content. It yields better retrieval effectiveness than classical global color histograms (GCHs) and effectiveness comparable to that of color-coherence vectors (CCVs). More importantly, VBA signatures save 75% and 87.5% in storage overhead compared to GCHs and CCVs, respectively. The second contribution is an improvement to an access structure, the S-tree, which explores the concept of logical and physical pages and a specialized nearest-neighbor algorithm in order to improve retrieval speed. We compared the performance of the S-tree indexing VBA signatures against that of the SR-tree indexing GCHs and CCVs, since SR-trees are arguably the most efficient access method for high-dimensional points. Our experimental results, using a large number of images and varying several parameters, show that the VBA/S-tree combination outperforms the GCH/SR-tree combination in terms of effectiveness, access speed, and size (by up to 45%, 25%, and 70%, respectively). Because of the very high dimensionality of CCVs, indexing them did not seem to be a feasible alternative, even with an efficient access structure such as the SR-tree.
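To make the idea of a color-based signature bitstring more concrete, the following is a minimal sketch and not the VBA encoding itself, whose details are not given here. It assumes a hypothetical scheme in which every quantized color receives a fixed number of bits (`bits_per_color`, an illustrative parameter) and the leading bits are set in proportion to the fraction of pixels of that color. The function names `signature` and `distance`, and the use of Hamming distance as the similarity metric, are likewise assumptions made for illustration.

```python
import math
from typing import List, Sequence


def signature(histogram: Sequence[float], bits_per_color: int = 10) -> List[int]:
    """Encode a normalized color histogram as a flat list of 0/1 bits.

    Hypothetical scheme: each color owns `bits_per_color` bits, and the
    leading ceil(fraction * bits_per_color) of them are set to 1.
    """
    bits: List[int] = []
    for fraction in histogram:
        # Small epsilon guards against floating-point noise pushing
        # an exact product (e.g. 5.0) up to the next integer.
        ones = max(0, math.ceil(fraction * bits_per_color - 1e-9))
        ones = min(ones, bits_per_color)
        bits.extend([1] * ones + [0] * (bits_per_color - ones))
    return bits


def distance(sig_a: Sequence[int], sig_b: Sequence[int]) -> int:
    """Hamming distance between two signatures of equal length.

    With the leading-ones encoding above, this equals the sum over colors
    of the absolute difference in the number of set bits.
    """
    if len(sig_a) != len(sig_b):
        raise ValueError("signatures must have the same length")
    return sum(a != b for a, b in zip(sig_a, sig_b))


# Three toy 4-color histograms (fractions of pixels per color).
half_red_half_green = signature([0.5, 0.5, 0.0, 0.0])
similar_image = signature([0.5, 0.25, 0.25, 0.0])
different_image = signature([0.0, 0.0, 0.5, 0.5])

print(distance(half_red_half_green, similar_image))
print(distance(half_red_half_green, different_image))
```

In this toy setting, an image with a similar color distribution ends up closer to the query than one with a disjoint distribution, and each signature occupies only `len(histogram) * bits_per_color` bits.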
We propose two variations of a new image abstraction technique based on signature bit-strings, as well as an appropriate similarity metric for color-based image retrieval. Performance evaluation on a heterogeneous database of 20,000 images demonstrated that the proposed technique outperforms well-known approaches while still saving a substantial amount of storage space, making it possible to store and search an image database of reasonable size using a few megabytes of main memory (e.g., 4 Mbytes for 100,000 images).
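The memory figure quoted above implies a per-image storage budget, which the short calculation below works out. It is only arithmetic on the stated numbers (4 Mbytes, 100,000 images). Whether "Mbyte" means 10^6 or 2^20 bytes is not specified, so both readings are shown, and the helper name `bytes_per_signature` is illustrative.

```python
def bytes_per_signature(total_bytes: int, num_images: int) -> float:
    """Average number of bytes available per image signature."""
    return total_bytes / num_images


DECIMAL_MB = 10**6  # assumption: 1 Mbyte = 1,000,000 bytes
BINARY_MB = 2**20   # assumption: 1 Mbyte = 1,048,576 bytes

NUM_IMAGES = 100_000

decimal_budget = bytes_per_signature(4 * DECIMAL_MB, NUM_IMAGES)
binary_budget = bytes_per_signature(4 * BINARY_MB, NUM_IMAGES)

print(f"decimal reading: {decimal_budget:.2f} bytes = {decimal_budget * 8:.0f} bits per image")
print(f"binary reading:  {binary_budget:.2f} bytes per image")
```

Under either reading the budget is roughly 40 bytes per image, on the order of a few hundred bits per signature.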
Keywords: Color-based image retrieval, CBIR, image databases, bitstring signatures, color histograms.