2009
DOI: 10.1007/s11042-009-0339-z

Building a web-scale image similarity search system

Abstract: As the number of digital images is growing fast and Content-based Image Retrieval (CBIR) is gaining in popularity, CBIR systems should leap towards Web-scale datasets. In this paper, we report on our experience in building an experimental similarity search system on a test collection of more than 50 million images. The first big challenge we have been facing was obtaining a collection of images of this scale with the corresponding descriptive features. We have tackled the non-trivial process of image crawling …

Citations: cited by 91 publications (45 citation statements)
References: 14 publications
“…copying data to Hadoop Distributed File Systems), as it is only optimized for massive batches of queries. Distributed tree-based systems have also been studied for horizontal index partitioning for CBMI Aly et al [2], Batko et al [6], but the effectiveness of sub-tree based index partitioning is reduced when the dimensionality of the vectors to index increases [36], meaning that more nodes need to be queried. Effective partitioning of the search space is a key part of approximate nearest neighbour algorithms.…”
Section: Related Work
mentioning
confidence: 99%
“…Figure 1 (a) shows a single assignment technique, where each document is assigned to a single partition (e.g. [6]). Figure 1 (b) shows a random assignment technique, where documents are assigned to a single partition randomly, and queries are assigned to all partitions.…”
Section: Space Partitioning Codebooks
mentioning
confidence: 99%
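
The two assignment schemes described in the statement above can be illustrated with a minimal Python sketch. The quoted text does not say how a document's single partition is chosen in the single assignment case, so the nearest-centroid rule below is an assumption. All function names are hypothetical and do not come from the cited work.

```python
import random


def assign_single(doc_vec, centroids):
    """Single assignment: place a document in exactly one partition.

    Assumption: the partition is the one whose centroid is nearest
    (squared Euclidean distance). The quoted statement does not
    specify the rule.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(centroids)),
               key=lambda i: sq_dist(doc_vec, centroids[i]))


def assign_random(num_partitions, rng):
    """Random assignment: place a document in one partition chosen at random."""
    return rng.randrange(num_partitions)


def partitions_for_query_random(num_partitions):
    """Under random assignment a query is sent to all partitions,
    because a relevant document may be stored in any of them."""
    return list(range(num_partitions))


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]  # illustrative values
    rng = random.Random(0)
    print(assign_single((0.9, 1.1), centroids))
    print(assign_random(len(centroids), rng))
    print(partitions_for_query_random(len(centroids)))
```

With random assignment every query touches every partition. The sketch leaves out query routing for single assignment, since the quoted text does not describe it.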
“…Luo et al [24] fused information extracted from both a Flickr data set and a set of satellite images, in order to detect events. Batko et al [25] used MPEG-7 visual features and search into a set of over 50M photos from Flickr. Seah et al [26] created visual summaries on the results of visual queries on a data set of Flickr images that in contrast to previous works, e.g., the one of [27], they attempted to generate concept-preserving summaries.…”
Section: Shall We Consider Visual Characteristics?
mentioning
confidence: 99%
“…This is typically beyond the capabilities of classic exact match or keyword search techniques and thus the use of various similarity search technologies increases significantly in current applications. A considerable research effort has been invested in this topic resulting in both theoretical background [24] and large-scale practical results [17,3]. …”
mentioning
confidence: 99%