In recognition that in modern applications billions of images are stored into distributed databases in different logical or physical locations, we propose a similarity search strategy over the cloud based on the dimensions value cardinalities of image descriptors. Our strategy has low preprocessing requirements by dividing the computational cost of the preprocessing steps into several nodes over the cloud and locating the descriptors with similar dimensions value cardinalities logically close. New images are inserted into the distributed databases over the cloud efficiently, by supporting dynamical update in real-time. The proposed insertion algorithm has low computational complexity, depending exclusively on the dimensionality of descriptors and a small subset of descriptors with similar dimensions value cardinalities. Finally, an efficient query processing algorithm is proposed, where the dimensions of image descriptors are prioritized in the searching strategy, assuming that dimensions of high value cardinalities have more discriminative power than the dimensions of low ones. The computation effort of the query processing algorithm is divided into several nodes over the cloud infrastructure. In our experiments with seven publicly available datasets of image descriptors, we show that the proposed similarity search strategy outperforms competitive methods of single node, parallel and cloud-based architectures, in terms of preprocessing cost, search time and accuracy.
ACM Reference Format:Stefanos Antaris and Dimitrios Rafailidis. 2015. Similarity search over the cloud based on image descriptors' dimensions value cardinalities. ACM Trans. Multimedia Comput. Commun.